Matthew Gerstman matthew-gerstman

Project Folders Cleanup Plan

Context

The project folders feature (workspace-scoped folders that organize projects) shipped as an MVP and has accumulated technical debt. Three areas need attention: incomplete error handling in dashboard components, service-layer code quality issues, and gaps in test coverage. This plan covers a full cleanup pass.

Commits

Upgrade CLI to React 19 + Ink 6

Context

The CLI is isolated from the monorepo (excluded from workspaces in root package.json, has its own cli/bun.lock) solely because Ink 5 required React 18 while the dashboard uses React 19. Ink 6 now supports React 19, so the version conflict no longer exists. This upgrade unblocks agents that trip over the React 18 isolation and simplifies the monorepo setup.

Commits

feat(cli): upgrade to React 19 + Ink 6
refactor(cli): integrate CLI into monorepo workspaces

SSE Resilience Improvements

Context

The SSE implementation is already well-architected with exponential backoff, heartbeat monitoring, event replay via lastEventId, and Redis-backed connection tracking. However, there are gaps in failure recovery paths — specifically around unbounded replay queries, silent replay failures, a heartbeat race condition, and lack of Redis subscriber health monitoring. These gaps mean that after prolonged disconnections or infrastructure hiccups, clients can end up with stale data and no way to detect or recover from it.

Improvements (ranked by impact-to-risk ratio)

1. Bound the event replay query + notify client of truncation

Problem: the query issued by getRecentEventsFromDatabase() has no LIMIT clause. A reconnecting client with a stale or missing lastEventId can trigger an unbounded query that returns every event for a topic, so a single reconnecting client can spike database load for everyone.
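One way to picture the fix is the "fetch limit + 1" pattern: request one more row than the cap, and treat the extra row as the signal that the replay was truncated. The sketch below is illustrative only. The StoredEvent shape, the MAX_REPLAY_EVENTS value, the replay-truncated event name, and the in-memory array standing in for the database query are all assumptions, not the actual signature of getRecentEventsFromDatabase().

```typescript
// Hypothetical event shape; the real schema is not shown in this plan.
interface StoredEvent {
  id: number;
  topic: string;
  payload: string;
}

interface ReplayResult {
  events: StoredEvent[]; // oldest first, at most `limit` entries
  truncated: boolean; // true when older matching events were dropped
}

// Assumed cap; the right value depends on real traffic.
const MAX_REPLAY_EVENTS = 500;

// In-memory stand-in for a bounded replay query. A SQL version would
// roughly be: WHERE topic = ? AND id > ? ORDER BY id DESC LIMIT (limit + 1).
function replayEvents(
  all: readonly StoredEvent[],
  topic: string,
  lastEventId: number | undefined,
  limit: number = MAX_REPLAY_EVENTS,
): ReplayResult {
  const newestFirst = all
    .filter((e) => e.topic === topic && (lastEventId === undefined || e.id > lastEventId))
    .sort((a, b) => b.id - a.id)
    .slice(0, limit + 1); // one extra row reveals whether more existed

  const truncated = newestFirst.length > limit;
  // Keep the newest `limit` events and return them in delivery order.
  const events = newestFirst.slice(0, limit).reverse();
  return { events, truncated };
}

// Formats an SSE frame telling the client its replay was cut short, so it
// can refetch full state. The event name is a hypothetical choice.
function formatTruncationNotice(topic: string): string {
  return `event: replay-truncated\ndata: ${JSON.stringify({ topic })}\n\n`;
}
```

A caller would send the notice frame before the replayed events whenever truncated is true, giving the client an explicit cue to reload instead of silently showing stale data.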

Abstract Messaging: Slack → Multi-Platform (SMS/Twilio first) with Group Chat

Commits

feat: add messaging platform types, adapter interface, and registry
feat: add database tables for generic messaging connections and links
feat: add thread/message service methods for generic messaging links
refactor: create Slack messaging platform adapter wrapping existing code
feat: add Twilio SMS/Conversations service wrapper and adapter
feat: add SMS inbound webhook route and connection setup
feat: add messaging event dispatcher with group chat support

Obvious Monorepo — Comprehensive Test Coverage Plan

Date: 2026-02-08
Current coverage: ~25-30% of source files have corresponding tests
Coverage enforcement: None (no thresholds, no CI reporting)

Executive Summary

SSE Event Stream Performance Analysis

Date: February 8, 2026
Runtime: Bun 1.3.9 (darwin arm64)
Method: Custom stress test suite using Bun's built-in --cpu-prof-md and --heap-prof-md profiling tools

Executive Summary

Plan: Project-Specific E2E Browser Testing with agent-browser

Summary

Create a single project-level slash command .claude/commands/test-e2e.md that uses agent-browser CLI to intelligently test dashboard pages affected by the current PR/branch. It auto-starts the dev server if needed, authenticates as the test user, maps changed files to routes, and tests each affected page.
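The "maps changed files to routes" step could look roughly like the sketch below. It is a minimal illustration, not the command's actual logic: the RouteRule type, the example path patterns, and the route names are all hypothetical, and the real mapping would depend on how the dashboard lays out its pages.

```typescript
// Hypothetical rule: any changed file matching `pattern` affects `route`.
interface RouteRule {
  pattern: RegExp;
  route: string;
}

// Example rules only; these paths and routes are assumptions for illustration.
const EXAMPLE_RULES: RouteRule[] = [
  { pattern: /^apps\/dashboard\/src\/pages\/projects\//, route: "/projects" },
  { pattern: /^apps\/dashboard\/src\/pages\/settings\//, route: "/settings" },
  { pattern: /^apps\/dashboard\/src\/components\/folders\//, route: "/projects" },
];

// Returns the de-duplicated, sorted list of routes touched by a change set.
// Files that match no rule contribute nothing.
function routesForChangedFiles(
  changedFiles: readonly string[],
  rules: readonly RouteRule[],
): string[] {
  const routes = new Set<string>();
  for (const file of changedFiles) {
    for (const rule of rules) {
      // Patterns carry no `g` flag, so test() is stateless here.
      if (rule.pattern.test(file)) {
        routes.add(rule.route);
      }
    }
  }
  return Array.from(routes).sort();
}
```

The changed-file list would typically come from something like git diff --name-only against the base branch, with each resulting route then visited by the browser test.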

Commits

feat: add /test-e2e command for agent-browser E2E testing

	# SSE Resilience Improvements — Future PRs

	Findings from investigating dropped agent events in PR #6692.

	## 1. Event replay gap for thread status updates (HIGH)

	`emitAgentStatusEvent` in `apps/api/src/agents/obvious-v2/state/events.ts` calls
	`publishToProjectUsers` without a `tx` parameter, so `thread:updated` events aren't
	stored for replay. If a client reconnects during an agent run, they miss status
	transitions and the UI gets stuck on "thinking" or "running".

	# Follow-up: Claude Bot Review Comments on PR #6641

	## Context
	PR #6641 adds employee mode credit bypass. Claude bot flagged three issues.
	Two were addressed in the PR; the "critical" one is a false positive.

	## 🔴 "Critical": Per-step credit deduction not bypassed
	Status: False positive — no action needed

	Claude claims the per-step deduction block still runs for employee mode.

	# E2E Shard 2/4 Flaky Failure: Drizzle v1 "decoder" Bug

	## Problem
	E2E Shard 2/4 fails intermittently across multiple PRs (confirmed on #6498 and #6548).
	The root cause is the drizzle-orm v1 (beta.15) "decoder" bug affecting E2E setup scripts.

	## Error
	```
	❌ Failed to grant feature flag: Unknown relational filter field: "decoder"
	❌ Failed to set user context: Unknown relational filter field: "decoder"