agent-browser
Browser automation and visual testing using agent-browser CLI. Use this skill for E2E testing, taking screenshots, interacting with web pages, and visual regression testing.
SKILL.md
| Name | agent-browser |
| Description | Browser automation and visual testing using agent-browser CLI. Use this skill for E2E testing, taking screenshots, interacting with web pages, and visual regression testing. |
name: agent-browser description: Browser automation and visual testing using agent-browser CLI. Use this skill for E2E testing, taking screenshots, interacting with web pages, and visual regression testing.
Agent-Browser Skill
Headless browser automation CLI for visual testing, E2E testing, and web interactions.
Prerequisites
Installed globally via npm install -g agent-browser with Chromium downloaded via agent-browser install.
Core Commands
Navigation & Screenshots
# Open a URL
agent-browser open "https://example.com"
# Take screenshot
agent-browser screenshot ./screenshots/page.png
# Save as PDF
agent-browser pdf ./output.pdf
# Get accessibility tree (best for AI understanding)
agent-browser snapshot
Interactions
# Click element (use @ref from snapshot, or CSS selector)
agent-browser click "@ref123"
agent-browser click "button.submit"
# Type into input
agent-browser type "input[name=email]" "user@example.com"
# Fill (clears first)
agent-browser fill "#search" "query text"
# Select dropdown
agent-browser select "#country" "US"
# Check/uncheck
agent-browser check "#terms"
agent-browser uncheck "#newsletter"
# Wait for element or time
agent-browser wait "#loading"
agent-browser wait 2000
Get Information
agent-browser get text ".message"
agent-browser get html "#content"
agent-browser get value "input#email"
agent-browser get attr "href" "a.link"
agent-browser get title
agent-browser get url
Check State
agent-browser is visible ".modal"
agent-browser is enabled "button.submit"
agent-browser is checked "#remember-me"
Visual Testing Workflow
-
Capture baseline screenshots before changes:
agent-browser open "http://localhost:3000" agent-browser screenshot ./baseline/home.png -
After changes, capture new screenshots:
agent-browser open "http://localhost:3000" agent-browser screenshot ./current/home.png -
Compare visually or use image diff tools
Test User Credentials
Two test users are seeded in the database for E2E testing. Use these to access protected routes:
| Role | Password | |
|---|---|---|
| Admin | admin@test.com | testpassword123 |
| User | user@test.com | testpassword123 |
Note: Run npx tsx prisma/seed.ts to create these users if they don't exist.
Authentication Flow
Most pages require authentication. Login before accessing protected routes:
# Login as regular user
agent-browser open "http://localhost:3000/sign-in"
agent-browser fill "#email" "user@test.com"
agent-browser fill "#password" "testpassword123"
agent-browser click "button[type='submit']"
agent-browser wait 3000
# Now access protected pages (session cookie persists)
agent-browser open "http://localhost:3000/profile"
agent-browser screenshot ./screenshots/profile.png
# Login as admin (for /admin/* routes)
agent-browser open "http://localhost:3000/sign-in"
agent-browser fill "#email" "admin@test.com"
agent-browser fill "#password" "testpassword123"
agent-browser click "button[type='submit']"
agent-browser wait 3000
# Access admin pages
agent-browser open "http://localhost:3000/admin"
agent-browser screenshot ./screenshots/admin-dashboard.png
Important: Use the same --session flag across all commands to maintain the login session:
agent-browser open "http://localhost:3000/sign-in" --session "my-test"
agent-browser fill "#email" "user@test.com" --session "my-test"
# ... continue with same session
E2E Test Example
# Login flow test
agent-browser open "http://localhost:3000/sign-in"
agent-browser fill "#email" "user@test.com"
agent-browser fill "#password" "testpassword123"
agent-browser click "button[type='submit']"
agent-browser wait 3000
agent-browser screenshot ./tests/login-success.png
E2E Test Mode for Assessment Pages
Assessment pages (/assessments/[id]/welcome, /work, /results) require screen recording, which doesn't work in headless browsers.
To bypass screen recording for E2E tests, start the dev server with:
E2E_TEST_MODE=true NEXT_PUBLIC_E2E_TEST_MODE=true npm run dev
Then run your tests against that server. The screen recording modal will be skipped and fake recording segments will be created in the database.
Note: This only works in development mode (NODE_ENV=development). The bypass is disabled in production.
Test Assessment IDs
After running npx tsx prisma/seed.ts, these test assessments are available:
| ID | Status | Page URL |
|---|---|---|
| test-assessment-chat | WORKING | /assessments/test-assessment-chat/work |
Note: The test assessment is owned by user@test.com, so you must login as that user first.
Taking Screenshots of Chat/Sidebar Components
Complete workflow for visual verification of chat layout components:
# 1. Ensure database is seeded
npx tsx prisma/seed.ts
# 2. Start dev server with E2E mode (if not already running)
E2E_TEST_MODE=true NEXT_PUBLIC_E2E_TEST_MODE=true npm run dev &
sleep 10
# 3. Login as test user
agent-browser open "http://localhost:3000/sign-in" --session "visual-test"
agent-browser fill "#email" "user@test.com" --session "visual-test"
agent-browser fill "#password" "testpassword123" --session "visual-test"
agent-browser click "button[type='submit']" --session "visual-test"
agent-browser wait 3000 --session "visual-test"
# 4. Navigate to work page (uses fixed test assessment ID)
agent-browser open "http://localhost:3000/assessments/test-assessment-chat/work" --session "visual-test"
agent-browser wait 2000 --session "visual-test"
# 5. Take screenshot
mkdir -p screenshots
agent-browser screenshot ./screenshots/chat-page.png --session "visual-test"
This captures: SlackLayout sidebar, CoworkerSidebar, chat interface, and FloatingCallBar (if in call).
Tips
- Use
snapshotto get accessibility tree with element references (@ref) - more reliable than CSS selectors - Screenshots help verify visual state during automated tests
- Chain commands in shell scripts for full test flows
- Use
--sessionflag for isolated browser instances