@matthew-gerstman
Created February 9, 2026 01:03
Plan: Project-Specific E2E Browser Testing with agent-browser

Summary

Create a single project-level slash command, .claude/commands/test-e2e.md, that uses the agent-browser CLI to test only the dashboard pages affected by the current PR or branch. It starts the dev server if needed, authenticates as the test user, maps changed files to routes, and tests each affected route.

Commits

  1. feat: add /test-e2e command for agent-browser E2E testing

File to Create

.claude/commands/test-e2e.md — Project-level slash command (~250 lines)

Usage: /test-e2e, /test-e2e 847 (PR), /test-e2e feature/my-branch

Command Structure

Phase 1: Prerequisites

  • Verify agent-browser CLI is installed
  • Ask user: headed vs headless mode

Phase 2: Dev Server Auto-Start

  • Health check: curl -sf http://localhost:3101 >/dev/null and curl -sf http://localhost:4001/health >/dev/null
  • If not running: start via bun obvious up in background, poll health every 5s for up to 120s
  • If already running: proceed immediately
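
The polling rule above (every 5s, for up to 120s) could be sketched roughly as follows. This is a minimal sketch under assumptions, not the command's actual implementation: `waitForHealthy`, `bothUp`, and the injectable `sleep` parameter are hypothetical names, and `fetch` only approximates the two `curl -sf` checks.

```typescript
type Probe = () => Promise<boolean>;
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Probes immediately, then once per interval, until the probe succeeds
// or the time budget (timeoutMs) is spent.
async function waitForHealthy(
  probe: Probe,
  intervalMs = 5_000,
  timeoutMs = 120_000,
  sleep: Sleep = realSleep,
): Promise<boolean> {
  const maxSleeps = Math.floor(timeoutMs / intervalMs);
  for (let i = 0; i <= maxSleeps; i++) {
    if (await probe()) return true;
    if (i < maxSleeps) await sleep(intervalMs);
  }
  return false;
}

// Hypothetical probe: healthy only when both endpoints answer with a 2xx status.
async function bothUp(): Promise<boolean> {
  const urls = ["http://localhost:3101", "http://localhost:4001/health"];
  const results = await Promise.all(
    urls.map((url) => fetch(url).then((res) => res.ok, () => false)),
  );
  return results.every((ok) => ok);
}
```

Calling `waitForHealthy(bothUp)` after launching bun obvious up in the background would resolve to `true` once both services respond, or `false` after the 120s budget runs out.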

Phase 3: Determine Test Scope

  • PR number arg: gh pr view {number} --json files -q '.files[].path'
  • Branch arg: git diff --name-only main...{branch}
  • No arg (current): git diff --name-only main...HEAD
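
One way to picture the argument dispatch is a small function that picks the command to run. This is only a sketch: `scopeCommand` is a hypothetical name, and treating any all-digit argument as a PR number is an assumption (a branch literally named `123` would be misread).

```typescript
// Returns the argv of the command that lists changed files for the given argument.
// An argv array is used instead of a shell string so branch names need no quoting.
function scopeCommand(arg?: string): string[] {
  if (!arg) {
    // No argument: compare the current branch against main.
    return ["git", "diff", "--name-only", "main...HEAD"];
  }
  if (/^\d+$/.test(arg)) {
    // Assumption: all-digit arguments are PR numbers.
    return ["gh", "pr", "view", arg, "--json", "files", "-q", ".files[].path"];
  }
  // Anything else is treated as a branch name.
  return ["git", "diff", "--name-only", `main...${arg}`];
}
```

Under these assumptions, `scopeCommand("847")` selects the `gh pr view` form and `scopeCommand("feature/my-branch")` selects the `git diff` form.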

Phase 4: File-to-Route Mapping (Obvious-Specific)

Three tiers of mapping, derived from dashboard/src/pages/app.tsx:

Tier 1 — Direct page files → routes (highest confidence)

| File Pattern | Route |
| --- | --- |
| `dashboard/src/pages/login-page.tsx` | `/login` |
| `dashboard/src/pages/landing-page.tsx` | `/landing` |
| `dashboard/src/pages/onboarding-page.tsx` | `/onboarding` |
| `dashboard/src/pages/projects-library-page.tsx` | `/projects` |
| `dashboard/src/pages/project-page/**` | `/projects/:id` |
| `dashboard/src/pages/shortcuts-page/**` | `/shortcuts` |
| `dashboard/src/pages/templates-page/**` | `/templates` |
| `dashboard/src/pages/artifact-library-page.tsx` | `/library` |
| `dashboard/src/pages/tasks-page/**` | `/tasks` |
| `dashboard/src/pages/notes-page/**` | `/notes` |
| `dashboard/src/pages/skills-page/**` | `/skills` |
| `dashboard/src/pages/custom-modes-page/**` | `/modes` |
| `dashboard/src/pages/settings/*` | `/settings` (+ specific sub-route) |
| `dashboard/src/pages/workspace-usage-page/**` | `/settings/workspace/usage` |

Tier 2 — Feature directories → likely routes

| Feature Directory | Routes |
| --- | --- |
| `dashboard/src/features/chat/**` | `/projects/:id` |
| `dashboard/src/features/main-sidebar/**` | `/landing` |
| `dashboard/src/features/prompt-input/**` | `/landing` |
| `dashboard/src/features/projects-grid/**` | `/projects`, `/landing` |
| `dashboard/src/features/artifact-*/**` | `/library`, `/projects/:id` |
| `dashboard/src/features/settings-layout/**` | `/settings` |
| `dashboard/src/features/notes/**` | `/notes` |
| `dashboard/src/features/onboarding/**` | `/onboarding` |
| `dashboard/src/features/landing-*/**` | `/landing` |
| `dashboard/src/features/billing-*/**` | `/settings/workspace/billing` |

Tier 3 — Broad-impact changes → test core pages

| Pattern | Routes |
| --- | --- |
| `dashboard/src/components/**` | `/landing`, `/projects` |
| `dashboard/src/ui/**` | `/landing` |
| `packages/**` | `/landing`, `/login` |
| `apps/api/src/routes/**` | Corresponding dashboard page |

Deduplicate routes across tiers and cap the list at 8 routes. Skip files that match no pattern (CI, docs, scripts); these are the unmapped files listed in the summary report.
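
The mapping, deduplication, and cap could be sketched as below. This is an illustration under assumptions rather than the command itself (which is a prompt, not code): `RULES` holds only a few rows from the three tiers, `globToRegExp` and `mapFilesToRoutes` are hypothetical helpers, and filling the cap in tier order (so Tier 1 routes survive first) is an assumption the plan does not state.

```typescript
type Rule = { pattern: string; routes: string[] };

// Abbreviated rule list in tier order; a full version would include every row.
const RULES: Rule[] = [
  { pattern: "dashboard/src/pages/login-page.tsx", routes: ["/login"] },
  { pattern: "dashboard/src/pages/project-page/**", routes: ["/projects/:id"] },
  { pattern: "dashboard/src/features/chat/**", routes: ["/projects/:id"] },
  { pattern: "dashboard/src/features/projects-grid/**", routes: ["/projects", "/landing"] },
  { pattern: "dashboard/src/features/artifact-*/**", routes: ["/library", "/projects/:id"] },
  { pattern: "dashboard/src/components/**", routes: ["/landing", "/projects"] },
  { pattern: "packages/**", routes: ["/landing", "/login"] },
];

const MAX_ROUTES = 8;

// Minimal glob support: "**" matches across directories, "*" within one path segment.
function globToRegExp(glob: string): RegExp {
  const body = glob
    .split("**")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, "[^/]*"))
    .join(".*");
  return new RegExp(`^${body}$`);
}

function mapFilesToRoutes(files: string[]): { routes: string[]; unmapped: string[] } {
  const matched = new Set<string>();
  const routes = new Set<string>(); // a Set deduplicates routes across tiers
  for (const rule of RULES) {
    const re = globToRegExp(rule.pattern);
    for (const file of files) {
      if (re.test(file)) {
        matched.add(file);
        rule.routes.forEach((route) => routes.add(route));
      }
    }
  }
  return {
    routes: Array.from(routes).slice(0, MAX_ROUTES),
    unmapped: files.filter((file) => !matched.has(file)),
  };
}
```

With this sketch, a change set touching `dashboard/src/features/chat/input.tsx` and `README.md` would yield the single route `/projects/:id` and report `README.md` as unmapped.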

Phase 5: Authentication via agent-browser

Login flow using actual form elements from login-page.tsx:

  1. agent-browser open http://localhost:3101/login
  2. agent-browser snapshot -i → get refs for email input (placeholder="Enter your email"), password input (placeholder="Enter your password"), submit button ("Sign In")
  3. agent-browser fill @emailRef "test@example.com"
  4. agent-browser fill @passwordRef "password"
  5. agent-browser click @submitRef
  6. agent-browser wait 3000 → login has a 2.3s animation before redirect
  7. Verify landed on /landing or /onboarding
  8. If /onboarding: handle form or wait for auto-redirect, then navigate to /landing
  9. Take confirmation screenshot

Session persists across all subsequent agent-browser open calls — login once, test many.
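
Steps 3 through 8 of the login flow could be expressed as data plus one check, roughly as follows. This is a sketch only: `loginCommands` and `postLoginState` are hypothetical helpers, the ref values must come from the real `agent-browser snapshot -i` output (values such as `@e1` are illustrative), and how the current URL is read back from the browser is left open because the plan does not specify it.

```typescript
interface LoginRefs {
  email: string;    // ref of the "Enter your email" input
  password: string; // ref of the "Enter your password" input
  submit: string;   // ref of the "Sign In" button
}

// Builds the fill/click/wait commands once the refs are known from the snapshot.
function loginCommands(refs: LoginRefs): string[][] {
  return [
    ["agent-browser", "fill", refs.email, "test@example.com"],
    ["agent-browser", "fill", refs.password, "password"],
    ["agent-browser", "click", refs.submit],
    ["agent-browser", "wait", "3000"], // covers the 2.3s animation before redirect
  ];
}

// Classifies where the browser ended up after login.
function postLoginState(currentUrl: string): "landing" | "onboarding" | "failed" {
  const path = new URL(currentUrl).pathname;
  if (path === "/landing" || path.startsWith("/landing/")) return "landing";
  if (path === "/onboarding" || path.startsWith("/onboarding/")) return "onboarding";
  return "failed"; // e.g. still on /login
}
```

An `"onboarding"` result corresponds to step 8 (handle the form or wait, then navigate to /landing); `"failed"` would mean the login did not complete.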

Phase 6: Test Each Route

For each mapped route:

  1. agent-browser open http://localhost:3101{route}
  2. agent-browser wait 2000 (let SSE hydration settle)
  3. agent-browser snapshot -i → verify expected elements present
  4. Check for error boundaries ("Something went wrong") or blank pages
  5. Perform one basic interaction (click a tab, expand a section) if applicable
  6. agent-browser screenshot {route-name}.png
  7. Record PASS/FAIL

For /projects/:id routes: first create a project via /landing prompt input or navigate to an existing one.
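
The per-route checks in steps 4, 6, and 7 might look like the following sketch. It assumes the snapshot is available as plain text, which the plan does not confirm; `checkSnapshot` and `screenshotName` are hypothetical names, and the slug rule for `{route-name}` is one possible choice.

```typescript
type RouteCheck = { status: "PASS" | "FAIL"; note: string };

// Fails on a blank page or a rendered error boundary, otherwise passes.
function checkSnapshot(snapshot: string): RouteCheck {
  if (snapshot.trim() === "") {
    return { status: "FAIL", note: "blank page" };
  }
  if (snapshot.includes("Something went wrong")) {
    return { status: "FAIL", note: "error boundary rendered" };
  }
  return { status: "PASS", note: "" };
}

// Turns a route into a filesystem-safe screenshot name,
// e.g. "/settings/workspace/usage" becomes "settings-workspace-usage.png".
function screenshotName(route: string): string {
  const slug = route.replace(/[^a-zA-Z0-9]+/g, "-").replace(/^-+|-+$/g, "");
  return `${slug || "root"}.png`;
}
```

A fuller check would also verify the route's expected elements (step 3), which depends on per-page knowledge not captured here.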

Phase 7: Failure Handling

  • On fail: take screenshot + full snapshot, ask user to fix/skip/stop
  • On timeout: mark as FAIL, continue

Phase 8: Summary Report

Generate markdown table with route, status, notes. List unmapped files. Overall PASS/FAIL/PARTIAL.
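
A sketch of the report generation follows. The plan does not define PARTIAL, so the rule used here (FAIL when no route passed, PASS when all passed, PARTIAL otherwise) is an assumption, as are the names `overallStatus` and `renderReport`.

```typescript
interface RouteResult {
  route: string;
  status: "PASS" | "FAIL";
  notes: string;
}

// Assumed rule: no passes means FAIL, all passes means PASS, a mix means PARTIAL.
function overallStatus(results: RouteResult[]): "PASS" | "FAIL" | "PARTIAL" {
  const passed = results.filter((r) => r.status === "PASS").length;
  if (passed === 0) return "FAIL";
  return passed === results.length ? "PASS" : "PARTIAL";
}

// Renders the markdown table, the overall verdict, and any unmapped files.
function renderReport(results: RouteResult[], unmapped: string[]): string {
  const lines = [
    "| Route | Status | Notes |",
    "| --- | --- | --- |",
    ...results.map((r) => `| ${r.route} | ${r.status} | ${r.notes} |`),
    "",
    `Overall: ${overallStatus(results)}`,
  ];
  if (unmapped.length > 0) {
    lines.push("", "Unmapped files:", ...unmapped.map((file) => `- ${file}`));
  }
  return lines.join("\n");
}
```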

Key Design Decisions

  1. Single .claude/commands/test-e2e.md file — no code files, no configs. The command is a prompt that tells Claude how to test.
  2. Named test-e2e not test-browser — avoids collision with the generic plugin command; both remain available.
  3. Route map embedded in the command — derived from app.tsx, small enough to inline. Update when routes change.
  4. Auth via UI login, not cookie injection — agent-browser doesn't support cookie injection; login through the form also tests the auth flow.
  5. Dev server auto-start — uses bun obvious up with health polling, matching the project's existing dev tooling.
  6. Breadth over depth — verify pages load and have correct elements rather than deep workflow testing (Playwright handles that).

Critical Source Files

  • dashboard/src/pages/app.tsx — Route definitions (the route map must stay in sync)
  • dashboard/src/pages/login-page.tsx — Login form structure (placeholders, button text)
  • playwright/utils/test-helpers.ts — Reference for auth + onboarding patterns
  • playwright/AGENTS.md — Testing patterns and anti-patterns

Verification

  1. Run /test-e2e with the dev server running: it should analyze the current branch, map files, authenticate, and test the mapped routes.
  2. Run /test-e2e with the server stopped: it should detect this and start the server via bun obvious up, as described in Phase 2.
  3. Run /test-e2e 123 (a PR number): it should fetch the PR's files and test the affected routes.
  4. Run /test-e2e and select "Headed": the browser window should be visible.