ui-screenshots
Take UI screenshots using agent-browser. Use this skill to capture visual state of UI components for code review, visual regression testing, or documentation.
SKILL.md
| Name | ui-screenshots |
| Description | Take UI screenshots using agent-browser. Use this skill to capture visual state of UI components for code review, visual regression testing, or documentation. |
name: ui-screenshots description: Take UI screenshots using agent-browser. Use this skill to capture visual state of UI components for code review, visual regression testing, or documentation.
UI Screenshots Skill
Capture UI screenshots for visual verification.
Uses agent-browser for screenshots (fast Rust CLI with Node.js fallback).
Prerequisites
-
agent-browser: Install the browser automation CLI:
npm install -g agent-browser agent-browser install # Download ChromiumFor Linux with missing dependencies:
agent-browser install --with-deps -
Cloudinary Account (free tier available):
- Create account at cloudinary.com
- Get your
CLOUDINARY_URLfrom the dashboard (format:cloudinary://API_KEY:API_SECRET@CLOUD_NAME) - Set environment variable:
CLOUDINARY_URL
-
GitHub Token:
GITHUB_TOKENenvironment variable for PR comments.
Usage
Taking Screenshots
Use the take-screenshot script:
.claude/skills/ui-screenshots/scripts/take-screenshot.sh [URL] [OUTPUT_PATH]
Example:
.claude/skills/ui-screenshots/scripts/take-screenshot.sh \
http://localhost:9300/dev/components \
screenshot.png
Using agent-browser directly
For more control, use agent-browser CLI commands:
# Open a page
agent-browser open http://localhost:9300/dev/components
# Take full-page screenshot
agent-browser screenshot screenshot.png --full
# Get accessibility snapshot (useful for AI agents)
agent-browser snapshot -i -c
Attaching Screenshots to PR
Screenshots are NOT committed to the repo. They are uploaded to Cloudinary and embedded in PR comments:
.claude/skills/ui-screenshots/scripts/upload-screenshot.sh \
screenshot.png \
195 # PR number
Integration with Smoke Tests
The smoke test skill can call this skill to capture UI state:
- Run screenshot capture as part of smoke testing
- If a PR branch is detected, upload screenshots and add PR comment
- Screenshots help reviewers verify visual changes
Troubleshooting
agent-browser not found
Install globally:
npm install -g agent-browser
agent-browser install
Missing browser dependencies on Linux
agent-browser install --with-deps
# Or manually:
npx playwright install-deps chromium
Browser version mismatch
If agent-browser reports a missing browser version (e.g., chromium_headless_shell-1208) but you have a similar version installed (e.g., chromium_headless_shell-1200), and network issues prevent downloading the correct version:
# Check available versions
ls /root/.cache/ms-playwright/
# Create symlinks to use a nearby version
cd /root/.cache/ms-playwright
ln -s chromium_headless_shell-1200 chromium_headless_shell-1208
ln -s chromium-1200 chromium-1208
Note: This is a workaround when storage.googleapis.com is unreachable. Minor version differences (1200 vs 1208) are usually compatible.
Page hangs on localhost
The dev server may not be running. Start it first:
cd apps/ui && npm run dev &
sleep 10 # Wait for server
Session conflicts
agent-browser uses sessions to isolate browser instances:
# Use a named session
agent-browser --session screenshots open http://localhost:9300
# List sessions
agent-browser session list
Cloudinary upload fails
Verify CLOUDINARY_URL is set correctly:
- Format:
cloudinary://API_KEY:API_SECRET@CLOUD_NAME - Get from Cloudinary dashboard
Script Reference
take-screenshot.sh
Take a screenshot of a URL using agent-browser:
.claude/skills/ui-screenshots/scripts/take-screenshot.sh [URL] [OUTPUT_PATH]
Example:
.claude/skills/ui-screenshots/scripts/take-screenshot.sh \
http://localhost:9300/dev/components \
/tmp/screenshot.png
upload-screenshot.sh
Upload screenshot to Cloudinary and add PR comment:
.claude/skills/ui-screenshots/scripts/upload-screenshot.sh <SCREENSHOT_PATH> <PR_NUMBER> [DESCRIPTION]
Example:
.claude/skills/ui-screenshots/scripts/upload-screenshot.sh \
/tmp/screenshot.png \
195 \
"Dev components page showing message and tool call rendering"
check-config.sh
Check if cloud agent environment is configured:
.claude/skills/ui-screenshots/scripts/check-config.sh
Verifies: GITHUB_TOKEN, CLOUDINARY_URL, and agent-browser availability.
agent-browser Commands Reference
Common commands for screenshot workflows:
| Command | Description |
|---|---|
agent-browser open <url> | Navigate to a URL |
agent-browser screenshot [path] | Capture screenshot |
agent-browser screenshot path --full | Full page screenshot |
agent-browser snapshot | Get accessibility tree (for AI) |
agent-browser snapshot -i -c | Interactive elements only, compact |
agent-browser click <selector> | Click element |
agent-browser wait <selector> | Wait for element |
agent-browser scroll down | Scroll page |
agent-browser --json ... | JSON output for automation |
See agent-browser docs for full reference.