name: ui-screenshots description: Take UI screenshots using agent-browser. Use this skill to capture visual state of UI components for code review, visual regression testing, or documentation.

UI Screenshots Skill

Capture UI screenshots for visual verification.

Uses agent-browser for screenshots (fast Rust CLI with Node.js fallback).

Prerequisites

agent-browser: Install the browser automation CLI:

npm install -g agent-browser
agent-browser install  # Download Chromium

For Linux with missing dependencies:

agent-browser install --with-deps

Cloudinary Account (free tier available):
- Create account at cloudinary.com
- Get your CLOUDINARY_URL from the dashboard (format: cloudinary://API_KEY:API_SECRET@CLOUD_NAME)
- Set environment variable: CLOUDINARY_URL
GitHub Token: GITHUB_TOKEN environment variable for PR comments.

Usage

Taking Screenshots

Use the take-screenshot script:

.claude/skills/ui-screenshots/scripts/take-screenshot.sh [URL] [OUTPUT_PATH]

Example:

.claude/skills/ui-screenshots/scripts/take-screenshot.sh \
  http://localhost:9300/dev/components \
  screenshot.png

Using agent-browser directly

For more control, use agent-browser CLI commands:

# Open a page
agent-browser open http://localhost:9300/dev/components

# Take full-page screenshot
agent-browser screenshot screenshot.png --full

# Get accessibility snapshot (useful for AI agents)
agent-browser snapshot -i -c

Attaching Screenshots to PR

Screenshots are NOT committed to the repo. They are uploaded to Cloudinary and embedded in PR comments:

.claude/skills/ui-screenshots/scripts/upload-screenshot.sh \
  screenshot.png \
  195  # PR number

Integration with Smoke Tests

The smoke test skill can call this skill to capture UI state:

Run screenshot capture as part of smoke testing
If a PR branch is detected, upload screenshots and add PR comment
Screenshots help reviewers verify visual changes

Troubleshooting

agent-browser not found

Install globally:

npm install -g agent-browser
agent-browser install

Missing browser dependencies on Linux

agent-browser install --with-deps
# Or manually:
npx playwright install-deps chromium

Browser version mismatch

If agent-browser reports a missing browser version (e.g., chromium_headless_shell-1208) but you have a similar version installed (e.g., chromium_headless_shell-1200), and network issues prevent downloading the correct version:

# Check available versions
ls /root/.cache/ms-playwright/

# Create symlinks to use a nearby version
cd /root/.cache/ms-playwright
ln -s chromium_headless_shell-1200 chromium_headless_shell-1208
ln -s chromium-1200 chromium-1208

Note: This is a workaround when storage.googleapis.com is unreachable. Minor version differences (1200 vs 1208) are usually compatible.

Page hangs on localhost

The dev server may not be running. Start it first:

cd apps/ui && npm run dev &
sleep 10  # Wait for server

Session conflicts

agent-browser uses sessions to isolate browser instances:

# Use a named session
agent-browser --session screenshots open http://localhost:9300

# List sessions
agent-browser session list

Cloudinary upload fails

Verify CLOUDINARY_URL is set correctly:

Format: cloudinary://API_KEY:API_SECRET@CLOUD_NAME
Get from Cloudinary dashboard

Script Reference

take-screenshot.sh

Take a screenshot of a URL using agent-browser:

.claude/skills/ui-screenshots/scripts/take-screenshot.sh [URL] [OUTPUT_PATH]

Example:

.claude/skills/ui-screenshots/scripts/take-screenshot.sh \
  http://localhost:9300/dev/components \
  /tmp/screenshot.png

upload-screenshot.sh

Upload screenshot to Cloudinary and add PR comment:

.claude/skills/ui-screenshots/scripts/upload-screenshot.sh <SCREENSHOT_PATH> <PR_NUMBER> [DESCRIPTION]

Example:

.claude/skills/ui-screenshots/scripts/upload-screenshot.sh \
  /tmp/screenshot.png \
  195 \
  "Dev components page showing message and tool call rendering"

check-config.sh

Check if cloud agent environment is configured:

.claude/skills/ui-screenshots/scripts/check-config.sh

Verifies: GITHUB_TOKEN, CLOUDINARY_URL, and agent-browser availability.

agent-browser Commands Reference

Common commands for screenshot workflows:

Command	Description
`agent-browser open <url>`	Navigate to a URL
`agent-browser screenshot [path]`	Capture screenshot
`agent-browser screenshot path --full`	Full page screenshot
`agent-browser snapshot`	Get accessibility tree (for AI)
`agent-browser snapshot -i -c`	Interactive elements only, compact
`agent-browser click <selector>`	Click element
`agent-browser wait <selector>`	Wait for element
`agent-browser scroll down`	Scroll page
`agent-browser --json ...`	JSON output for automation

See agent-browser docs for full reference.

Name	ui-screenshots
Description	Take UI screenshots using agent-browser. Use this skill to capture visual state of UI components for code review, visual regression testing, or documentation.

ui-screenshots

SKILL.md

name: ui-screenshots description: Take UI screenshots using agent-browser. Use this skill to capture visual state of UI components for code review, visual regression testing, or documentation.

UI Screenshots Skill

Prerequisites

Usage

Taking Screenshots

Using agent-browser directly

Attaching Screenshots to PR

Integration with Smoke Tests

Troubleshooting

agent-browser not found

Missing browser dependencies on Linux

Browser version mismatch

Page hangs on localhost

Session conflicts

Cloudinary upload fails

Script Reference

take-screenshot.sh

upload-screenshot.sh

check-config.sh

agent-browser Commands Reference