Plan: Project-Specific E2E Browser Testing with agent-browser

Summary

Create a single project-level slash command, .claude/commands/test-e2e.md, that uses the agent-browser CLI to test the dashboard pages affected by the current PR or branch. It starts the dev server automatically if it is not running, authenticates as the test user, maps changed files to routes, and tests each affected page.

Commits

feat: add /test-e2e command for agent-browser E2E testing

File to Create

.claude/commands/test-e2e.md — Project-level slash command (~250 lines)

Usage: /test-e2e, /test-e2e 847 (PR), /test-e2e feature/my-branch

Command Structure

Phase 1: Prerequisites

Verify agent-browser CLI is installed
Ask user: headed vs headless mode
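
The install check can be a plain PATH lookup. A minimal sketch, assuming the CLI is exposed under the name agent-browser; the helper name require_cli is hypothetical:

```shell
# Return 0 if the named CLI is on PATH; otherwise print a hint and return 1.
require_cli() {
  local name="$1"
  if command -v "$name" >/dev/null 2>&1; then
    return 0
  fi
  echo "Missing required CLI: $name" >&2
  return 1
}

# Usage (stops the command early when the CLI is absent):
#   require_cli agent-browser || exit 1
```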

Phase 2: Dev Server Auto-Start

Health check: curl -sf http://localhost:3101 >/dev/null and curl -sf http://localhost:4001/health >/dev/null
If not running: start via bun obvious up in background, poll health every 5s for up to 120s
If already running: proceed immediately
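
The health check and polling loop could look roughly like the sketch below. The function names and the log path are assumptions made for illustration; the URLs, the 5s interval, and the 120s limit come from the steps above.

```shell
# URLs taken from the plan; override via environment if the ports differ.
DASHBOARD_URL="${DASHBOARD_URL:-http://localhost:3101}"
API_HEALTH_URL="${API_HEALTH_URL:-http://localhost:4001/health}"

# Succeeds only when both endpoints respond without an HTTP error (curl -f).
servers_healthy() {
  curl -sf "$DASHBOARD_URL" >/dev/null && curl -sf "$API_HEALTH_URL" >/dev/null
}

# Poll servers_healthy every $1 seconds, giving up after $2 seconds.
wait_for_servers() {
  local interval="${1:-5}" timeout="${2:-120}" elapsed=0
  until servers_healthy; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Dev server not healthy after ${timeout}s" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Start the stack only when the health check fails, then wait for it.
ensure_dev_server() {
  if servers_healthy; then
    return 0
  fi
  # Log path is an assumption; any writable file works.
  bun obvious up >/tmp/obvious-up.log 2>&1 &
  wait_for_servers 5 120
}
```

Keeping the check in one function means the "already running" and "just started" paths share the same definition of healthy.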

Phase 3: Determine Test Scope

PR number arg: gh pr view {number} --json files -q '.files[].path'
Branch arg: git diff --name-only main...{branch}
No arg (current): git diff --name-only main...HEAD
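
The three cases can be dispatched on the shape of the argument. A sketch, assuming an all-digit argument always means a PR number (a branch whose name is only digits would be misread); the function name changed_files is hypothetical:

```shell
# Print changed file paths for a PR number, a branch name, or the current branch.
changed_files() {
  local target="${1:-}"
  if [ -z "$target" ]; then
    # No argument: compare the current branch against main.
    git diff --name-only main...HEAD
  elif printf '%s' "$target" | grep -Eq '^[0-9]+$'; then
    # All digits: treat as a PR number.
    gh pr view "$target" --json files -q '.files[].path'
  else
    # Anything else: treat as a branch name.
    git diff --name-only "main...$target"
  fi
}

# Usage:
#   changed_files
#   changed_files 847
#   changed_files feature/my-branch
```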

Phase 4: File-to-Route Mapping (Obvious-Specific)

Three tiers of mapping, derived from dashboard/src/pages/app.tsx:

Tier 1 — Direct page files → routes (highest confidence)

File Pattern	Route
`dashboard/src/pages/login-page.tsx`	`/login`
`dashboard/src/pages/landing-page.tsx`	`/landing`
`dashboard/src/pages/onboarding-page.tsx`	`/onboarding`
`dashboard/src/pages/projects-library-page.tsx`	`/projects`
`dashboard/src/pages/project-page/**`	`/projects/:id`
`dashboard/src/pages/shortcuts-page/**`	`/shortcuts`
`dashboard/src/pages/templates-page/**`	`/templates`
`dashboard/src/pages/artifact-library-page.tsx`	`/library`
`dashboard/src/pages/tasks-page/**`	`/tasks`
`dashboard/src/pages/notes-page/**`	`/notes`
`dashboard/src/pages/skills-page/**`	`/skills`
`dashboard/src/pages/custom-modes-page/**`	`/modes`
`dashboard/src/pages/settings/*`	`/settings` (+ specific sub-route)
`dashboard/src/pages/workspace-usage-page/**`	`/settings/workspace/usage`

Tier 2 — Feature directories → likely routes

Feature Directory	Routes
`dashboard/src/features/chat/**`	`/projects/:id`
`dashboard/src/features/main-sidebar/**`	`/landing`
`dashboard/src/features/prompt-input/**`	`/landing`
`dashboard/src/features/projects-grid/**`	`/projects`, `/landing`
`dashboard/src/features/artifact-*/**`	`/library`, `/projects/:id`
`dashboard/src/features/settings-layout/**`	`/settings`
`dashboard/src/features/notes/**`	`/notes`
`dashboard/src/features/onboarding/**`	`/onboarding`
`dashboard/src/features/landing-*/**`	`/landing`
`dashboard/src/features/billing-*/**`	`/settings/workspace/billing`

Tier 3 — Broad-impact changes → test core pages

Pattern	Routes
`dashboard/src/components/**`	`/landing`, `/projects`
`dashboard/src/ui/**`	`/landing`
`packages/**`	`/landing`, `/login`
`apps/api/src/routes/**`	Corresponding dashboard page

Deduplicate routes across tiers. Cap the list at 8 routes. Skip non-dashboard files (CI, docs, scripts).
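
Although the command itself is a prompt rather than a script, the mapping, deduplication, and cap can be illustrated with a short sketch. It covers only a subset of the tables above, and the function names are hypothetical. Routes come out in the order the files are listed, so it does not prioritize by tier.

```shell
# Print the routes for one changed file, one per line; print nothing if unmapped.
# Only a subset of the plan's tables is reproduced here.
routes_for_file() {
  case "$1" in
    # Tier 1: direct page files
    dashboard/src/pages/login-page.tsx)            echo /login ;;
    dashboard/src/pages/landing-page.tsx)          echo /landing ;;
    dashboard/src/pages/projects-library-page.tsx) echo /projects ;;
    dashboard/src/pages/project-page/*)            echo /projects/:id ;;
    dashboard/src/pages/notes-page/*)              echo /notes ;;
    # Tier 2: feature directories
    dashboard/src/features/chat/*)                 echo /projects/:id ;;
    dashboard/src/features/projects-grid/*)        printf '%s\n' /projects /landing ;;
    # Tier 3: broad-impact changes
    dashboard/src/components/*)                    printf '%s\n' /landing /projects ;;
    dashboard/src/ui/*)                            echo /landing ;;
    packages/*)                                    printf '%s\n' /landing /login ;;
  esac
}

# Read file paths on stdin; print deduplicated routes, capped at 8.
map_routes() {
  local file
  while IFS= read -r file; do
    routes_for_file "$file"
  done | awk '!seen[$0]++' | head -n 8
}

# Usage: git diff --name-only main...HEAD | map_routes
```

In a shell case pattern, `*` also matches `/`, so `project-page/*` covers nested files in the same way the `**` globs in the tables do.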

Phase 5: Authentication via agent-browser

agent-browser open http://localhost:3101/login
agent-browser snapshot -i → get refs for email input (placeholder="Enter your email"), password input (placeholder="Enter your password"), submit button ("Sign In")
agent-browser fill @emailRef "test@example.com"
agent-browser fill @passwordRef "password"
agent-browser click @submitRef
agent-browser wait 3000 → login has a 2.3s animation before redirect
Verify landed on /landing or /onboarding
If /onboarding: handle form or wait for auto-redirect, then navigate to /landing
Take confirmation screenshot

Session persists across all subsequent agent-browser open calls — login once, test many.
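
The login steps can be sketched as a small helper that takes the refs read from the snapshot output. The refs shown in the usage comment (@e1, @e2, @e3) are hypothetical placeholders, since the real values depend on what snapshot -i prints; the AB variable is an assumption added so the CLI name can be overridden.

```shell
# CLI name and base URL; both can be overridden from the environment.
AB="${AB:-agent-browser}"
BASE_URL="${BASE_URL:-http://localhost:3101}"

# Fill and submit the login form.
# Arguments: refs for the email input, password input, and "Sign In" button.
login_with_refs() {
  local email_ref="$1" password_ref="$2" submit_ref="$3"
  "$AB" fill "$email_ref" "test@example.com" || return 1
  "$AB" fill "$password_ref" "password" || return 1
  "$AB" click "$submit_ref" || return 1
  # 3000 ms covers the 2.3s animation before the redirect.
  "$AB" wait 3000
}

# Usage:
#   "$AB" open "$BASE_URL/login"
#   "$AB" snapshot -i            # read the three refs from this output
#   login_with_refs @e1 @e2 @e3  # hypothetical refs
```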

Phase 6: Test Each Route

For each mapped route:

agent-browser open http://localhost:3101{route}
agent-browser wait 2000 (let SSE hydration settle)
agent-browser snapshot -i → verify expected elements present
Check for error boundaries ("Something went wrong") or blank pages
Perform one basic interaction (click a tab, expand a section) if applicable
agent-browser screenshot {route-name}.png
Record PASS/FAIL

For /projects/:id routes: first create a project via /landing prompt input or navigate to an existing one.
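
A per-route check might look like the sketch below. It assumes snapshot -i prints the page's elements to stdout, and it leaves out the optional interaction step because that depends on the page. The helper names and the AB variable are hypothetical.

```shell
AB="${AB:-agent-browser}"
BASE_URL="${BASE_URL:-http://localhost:3101}"

# Turn a route into a file-friendly name, e.g. /projects/:id becomes projects-id.
route_slug() {
  printf '%s\n' "$1" | sed -e 's#^/##' -e 's#[^A-Za-z0-9]\{1,\}#-#g'
}

# Read snapshot text on stdin; print FAIL for a blank page or an error boundary.
snapshot_status() {
  local text
  text="$(cat)"
  if [ -z "$(printf '%s' "$text" | tr -d '[:space:]')" ]; then
    echo FAIL   # blank page
  elif printf '%s' "$text" | grep -q "Something went wrong"; then
    echo FAIL   # error boundary rendered
  else
    echo PASS
  fi
}

# Open one route, let it settle, check the snapshot, and save a screenshot.
# Prints "<route> <PASS|FAIL>".
test_route() {
  local route="$1" status
  "$AB" open "$BASE_URL$route" >/dev/null || { echo "$route FAIL"; return 1; }
  "$AB" wait 2000 >/dev/null
  status="$("$AB" snapshot -i | snapshot_status)"
  "$AB" screenshot "$(route_slug "$route").png" >/dev/null
  echo "$route $status"
}
```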

Phase 7: Failure Handling

On fail: take screenshot + full snapshot, ask user to fix/skip/stop
On timeout: mark as FAIL, continue

Phase 8: Summary Report

Generate markdown table with route, status, notes. List unmapped files. Overall PASS/FAIL/PARTIAL.
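
One way to assemble the report from recorded results is sketched below. The input format (one "route status notes" line per route) and the rule that PARTIAL means a mix of passes and failures are assumptions; the plan does not define either.

```shell
# Read "route status notes..." lines on stdin.
# Print a markdown table followed by an overall verdict.
summary_report() {
  awk '
    BEGIN { print "| Route | Status | Notes |"; print "| --- | --- | --- |" }
    {
      route = $1; status = $2
      notes = ""
      for (i = 3; i <= NF; i++) notes = notes (i > 3 ? " " : "") $i
      printf "| %s | %s | %s |\n", route, status, notes
      if (status == "PASS") pass++; else fail++
    }
    END {
      # Assumed rule: all pass is PASS, none pass is FAIL, a mix is PARTIAL.
      if (fail == 0 && pass > 0) overall = "PASS"
      else if (pass == 0)        overall = "FAIL"
      else                       overall = "PARTIAL"
      print ""
      print "Overall: " overall
    }
  '
}

# Usage: printf '%s\n' '/landing PASS' '/notes FAIL error boundary' | summary_report
```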

Key Design Decisions

Single .claude/commands/test-e2e.md file — no code files, no configs. The command is a prompt that tells Claude how to test.
Named test-e2e not test-browser — avoids collision with the generic plugin command; both remain available.
Route map embedded in the command — derived from app.tsx, small enough to inline. Update when routes change.
Auth via UI login, not cookie injection — agent-browser doesn't support cookie injection; login through the form also tests the auth flow.
Dev server auto-start — uses bun obvious up with health polling, matching the project's existing dev tooling.
Breadth over depth — verify pages load and have correct elements rather than deep workflow testing (Playwright handles that).

Critical Source Files

dashboard/src/pages/app.tsx — Route definitions (the route map must stay in sync)
dashboard/src/pages/login-page.tsx — Login form structure (placeholders, button text)
playwright/utils/test-helpers.ts — Reference for auth + onboarding patterns
playwright/AGENTS.md — Testing patterns and anti-patterns

Verification

Run /test-e2e with the dev server running — should analyze current branch, map files, authenticate, test routes
Run /test-e2e with the dev server stopped — should detect that it is down, start it via bun obvious up, and proceed once the health checks pass
Run /test-e2e 123 with a PR number — should fetch PR files and test affected routes
Verify headed mode shows the browser window: /test-e2e → select "Headed"
