@mellanon
Last active February 11, 2026 03:52
SpecFlow Development Time Analysis - Where does development time go?

SpecFlow Development Time Analysis

Signal-Agent Feature Development - Understanding where time is spent

📊 Feature Implementation Status (signal-agent-1 & signal-agent-2)

| Feature | Name | Agent-1 | Agent-2 | Elapsed |
|---|---|---|---|---|
| F-1 | Event Schema and Types | ✅ | ✅ | ~29m |
| F-2 | Event Logging Library | ✅ | ✅ | ~28m |
| F-3 | Concurrent Write Handling | ✅ | ✅ | ~2h |
| F-4 | PII Scrubbing | ✅ | ✅ | ~2h |
| F-5 | SessionStart Hook Instrumentation | ✅ | ✅ | ~1h |
| F-6 | SessionStop Hook Instrumentation | ✅ | ✅ | ~43m |
| F-7 | PreToolUse Hook Instrumentation | ✅ | ✅ | ~38m |
| F-8 | PostToolUse Hook Instrumentation | ✅ | ✅ | ~22m |
| F-9 | Hook Timing Instrumentation | ✅ | ✅ | ~25m |
| F-10 | CLI Query Patterns | ✅ | ✅ | ~2h |
| F-11 | Vector Collector Service | ✅ | ✅ | ~28m |
| F-12 | Vector Configuration | 🔄 | 🔄 | - |
| F-13 | Vector Health Watchdog | 🔄 | 🔄 | - |
| F-14 | Log Rotation Script | 🔄 | 🔄 | - |
| F-15 | Docker Compose Stack | ✅ | ✅ | ~9m |
| F-016 | Skill Invocation Enforcement | ✅ | 🔄 | - |

Progress Summary

| Agent | Complete | Pending | Progress | Total Elapsed | Est. Remaining |
|---|---|---|---|---|---|
| signal-agent-1 | 13 | 3 | 81% | ~10h 42m | ~2h 15m |
| signal-agent-2 | 12 | 4 | 75% | ~10h 42m | ~3h 30m |

Calculation Notes:

  • Total elapsed from completed features: 29m + 28m + 2h + 2h + 1h + 43m + 38m + 22m + 25m + 2h + 28m + 9m = ~10h 42m
  • Average feature time: ~10h 42m ÷ 12 = ~53.5m per feature
  • Agent-1 remaining (3 features): ~53.5m × 3 = ~2h 41m (adjusted down to ~2h 15m because the remaining features are simpler)
  • Agent-2 remaining (4 features): ~53.5m × 4 = ~3h 34m (rounded to ~3h 30m)
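
The arithmetic in these notes can be checked with a short script. This is a minimal sketch, not part of the original analysis: it sums the per-feature times from the status table (reading "~2h" as 120 minutes and "~1h" as 60) and applies the average-times-pending estimate. The names `timedFeatures`, `formatDuration`, and `estimateRemaining` are illustrative only.

```typescript
// Per-feature elapsed times in minutes, transcribed from the status table.
// "~2h" is read as 120 and "~1h" as 60; features without a time are omitted.
const timedFeatures: Array<{ id: string; minutes: number }> = [
  { id: "F-1", minutes: 29 },  { id: "F-2", minutes: 28 },
  { id: "F-3", minutes: 120 }, { id: "F-4", minutes: 120 },
  { id: "F-5", minutes: 60 },  { id: "F-6", minutes: 43 },
  { id: "F-7", minutes: 38 },  { id: "F-8", minutes: 22 },
  { id: "F-9", minutes: 25 },  { id: "F-10", minutes: 120 },
  { id: "F-11", minutes: 28 }, { id: "F-15", minutes: 9 },
];

// Format a minute count as "Xh Ym", rounding to the nearest minute.
function formatDuration(totalMinutes: number): string {
  const rounded = Math.round(totalMinutes);
  return `${Math.floor(rounded / 60)}h ${rounded % 60}m`;
}

// Naive estimate: average minutes per timed feature times the pending count.
function estimateRemaining(minutes: number[], pendingFeatures: number): number {
  const sum = minutes.reduce((acc, m) => acc + m, 0);
  return (sum / minutes.length) * pendingFeatures;
}

const featureMinutes = timedFeatures.map((f) => f.minutes);
const totalElapsed = featureMinutes.reduce((acc, m) => acc + m, 0);

console.log(`total elapsed:     ${formatDuration(totalElapsed)}`);
console.log(`average / feature: ${(totalElapsed / featureMinutes.length).toFixed(1)}m`);
console.log(`agent-1 remaining: ${formatDuration(estimateRemaining(featureMinutes, 3))}`);
console.log(`agent-2 remaining: ${formatDuration(estimateRemaining(featureMinutes, 4))}`);
```

The estimate treats every pending feature as average-sized, which is why a manual adjustment for simpler features sits outside the formula.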

🔬 F-5 SessionStart Hook Instrumentation - Deep Dive (~1h total)

Timeline Breakdown

Looking at agent-1's timeline (01:19 → 02:41 = ~82 minutes):

| Time | Phase | Duration | Commit |
|---|---|---|---|
| 01:19 | START (F-4→F-5 transition) | - | Feature transition |
| 01:52 | SPECIFY (create spec) | 33m | Create specification |
| 01:58 | Phase update | 6m | Update CURRENT_FEATURE |
| 02:04 | PLAN (technical plan) | 6m | Create technical plan |
| 02:09 | TASKS (breakdown) | 5m | Create tasks breakdown |
| 02:15 | IMPLEMENT (code) | 6m | Implement hook |
| 02:19 | VERIFY (test) | 4m | Verification loop |
| 02:26 | COMPLETE (finalize) | 7m | Validation and finalization |
| 02:38 | Completion summary | 12m | Add summary |
| 02:41 | END (cleanup) | 3m | Clear artifacts |
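
Each Duration value is the gap between a commit and the one before it. The sketch below reproduces that step from the times in the table. It assumes all commits fall on the same day; `toMinutes` and `commitGaps` are hypothetical helper names, not SpecFlow functions.

```typescript
// Commit times for F-5 on agent-1, transcribed from the timeline table.
const commitTimes: string[] = [
  "01:19", "01:52", "01:58", "02:04", "02:09",
  "02:15", "02:19", "02:26", "02:38", "02:41",
];

// Convert "HH:MM" to minutes since midnight.
function toMinutes(hhmm: string): number {
  const [hours, minutes] = hhmm.split(":").map(Number);
  return hours * 60 + minutes;
}

// Duration attributed to each commit = gap since the previous commit.
// Assumes no midnight rollover between the first and last commit.
function commitGaps(times: string[]): number[] {
  const gaps: number[] = [];
  for (let i = 1; i < times.length; i++) {
    gaps.push(toMinutes(times[i]) - toMinutes(times[i - 1]));
  }
  return gaps;
}

const gapMinutes = commitGaps(commitTimes);
const timelineTotal = gapMinutes.reduce((acc, g) => acc + g, 0);

console.log(gapMinutes);    // [33, 6, 6, 5, 6, 4, 7, 12, 3]
console.log(timelineTotal); // 82
```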

Phase-by-Phase Time Analysis

| Phase | Time Spent | % of Total |
|---|---|---|
| SPECIFY (spec creation) | ~33m | 40% |
| PLAN (technical design) | ~6m | 7% |
| TASKS (breakdown) | ~5m | 6% |
| IMPLEMENT (actual code) | ~6m | 7% |
| VERIFY (testing) | ~4m | 5% |
| COMPLETE (finalize + cleanup) | ~28m | 34% |
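
The phase totals come from grouping the timeline steps and dividing by the 82-minute total. The sketch below shows one grouping that reproduces the table. It rests on an assumption the source does not state: the 6m "Phase update" step is counted under COMPLETE, which is the only grouping of the timeline steps that gives ~28m. `phaseShares` is an illustrative name.

```typescript
type Phase = "SPECIFY" | "PLAN" | "TASKS" | "IMPLEMENT" | "VERIFY" | "COMPLETE";

// Minutes per timeline step, tagged with the phase it is counted under.
// ASSUMPTION: the 6m "Phase update" step belongs to COMPLETE (inferred,
// because it is what makes COMPLETE add up to ~28m).
const timelineSteps: Array<{ phase: Phase; minutes: number }> = [
  { phase: "SPECIFY", minutes: 33 },
  { phase: "COMPLETE", minutes: 6 },  // phase update (assumed grouping)
  { phase: "PLAN", minutes: 6 },
  { phase: "TASKS", minutes: 5 },
  { phase: "IMPLEMENT", minutes: 6 },
  { phase: "VERIFY", minutes: 4 },
  { phase: "COMPLETE", minutes: 7 },  // validation and finalization
  { phase: "COMPLETE", minutes: 12 }, // completion summary
  { phase: "COMPLETE", minutes: 3 },  // clear artifacts
];

// Sum minutes per phase, then express each as a rounded share of the total.
function phaseShares(entries: Array<{ phase: Phase; minutes: number }>) {
  const minutesByPhase = new Map<Phase, number>();
  let totalMinutes = 0;
  for (const entry of entries) {
    minutesByPhase.set(entry.phase, (minutesByPhase.get(entry.phase) ?? 0) + entry.minutes);
    totalMinutes += entry.minutes;
  }
  return Array.from(minutesByPhase, ([phase, minutes]) => ({
    phase,
    minutes,
    percent: Math.round((minutes / totalMinutes) * 100),
  }));
}

const shares = phaseShares(timelineSteps);
for (const row of shares) {
  console.log(`${row.phase}: ${row.minutes}m (${row.percent}%)`);
}
```

Because each share is rounded independently, the percentages sum to 99 rather than 100.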

Visual Time Distribution

┌─────────────────────────────────────────────────────────────┐
│  F-5 SessionStart Hook Instrumentation (~82 minutes total)  │
│                                                             │
│  ████████████████████████████████░░░░░░░░░░  SPECIFY (40%)  │
│  ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░  PLAN (7%)      │
│  ███░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░  TASKS (6%)     │
│  ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░  IMPLEMENT (7%) │
│  ███░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░  VERIFY (5%)    │
│  ██████████████████████████░░░░░░░░░░░░░░░░  COMPLETE (34%) │
└─────────────────────────────────────────────────────────────┘

Key Insights

  1. SPECIFY phase (40%) - Biggest time sink. Creating the spec.md takes the longest because it requires:

    • Reading the original SPEC.md
    • Understanding requirements
    • Writing detailed acceptance criteria
    • Generating functional requirements
  2. COMPLETE phase (34%) - Second biggest. Includes:

    • Running final validation
    • Updating CHANGELOG
    • Creating file inventory
    • Archiving completed feature
    • Initializing next feature
    • Clearing loop artifacts
  3. IMPLEMENT phase (7%) - Surprisingly fast! The actual coding is quick because:

    • Spec and plan are already clear
    • TDD pattern is established
    • Hook pattern is consistent

πŸ“ SPECIFY Phase - Where the 40% Goes

The F-5 spec.md is 426 lines covering 15+ major sections.

Section Breakdown by Complexity

| Section | Lines | Complexity | Time Sink? |
|---|---|---|---|
| User Scenarios (4) | ~80 | High - requires acceptance criteria per scenario | ⚠️ YES |
| Functional Requirements (5) | ~120 | High - includes code examples inline | ⚠️ YES |
| Constitutional Gate Validation | ~40 | Medium - checks 16+ PAI principles | ⚠️ YES |
| Design Decisions (3) | ~30 | Medium - rationale required | Moderate |
| Code Changes section | ~50 | High - actual implementation snippets | ⚠️ YES |
| Test Strategy | ~20 | Low | No |
| Metadata/boilerplate | ~86 | Low | No |

Time Sink Analysis (Line-Based Estimation*)

| Activity | Est. % of SPECIFY | Reasoning |
|---|---|---|
| Functional Requirements with code | ~30% | 120 lines, includes TypeScript examples |
| Inline Code Examples | ~25% | ~50 lines of actual implementation |
| User Scenarios + Acceptance Criteria | ~20% | 80 lines, 3-5 criteria per scenario |
| Constitutional Gate Validation | ~15% | 40 lines, checks 16+ PAI principles |
| Design Decisions + rationale | ~10% | 30 lines |

*Note: These percentages are derived from line counts in spec.md, not from direct time measurements. They assume that writing effort correlates with output volume.
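
For comparison, the plain unweighted line share of each section can be computed directly from the section breakdown. This is a sketch under the stated assumption that effort tracks line count and nothing else. The estimates in the table above differ from these raw shares for several rows, presumably because they also account for complexity; that weighting is not specified in the source. `lineShares` is an illustrative name.

```typescript
// Line counts per spec.md section, transcribed from the section breakdown.
const specSections: Array<{ section: string; lines: number }> = [
  { section: "User Scenarios", lines: 80 },
  { section: "Functional Requirements", lines: 120 },
  { section: "Constitutional Gate Validation", lines: 40 },
  { section: "Design Decisions", lines: 30 },
  { section: "Code Changes", lines: 50 },
  { section: "Test Strategy", lines: 20 },
  { section: "Metadata/boilerplate", lines: 86 },
];

// Unweighted share: lines in a section divided by total lines, as a percent.
function lineShares(sections: Array<{ section: string; lines: number }>) {
  const totalLines = sections.reduce((acc, s) => acc + s.lines, 0);
  return sections.map((s) => ({
    section: s.section,
    percent: Math.round((s.lines / totalLines) * 100),
  }));
}

const specTotalLines = specSections.reduce((acc, s) => acc + s.lines, 0);
console.log(`total lines: ${specTotalLines}`);
for (const row of lineShares(specSections)) {
  console.log(`${row.section}: ${row.percent}%`);
}
```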

The Real Insight

The spec essentially pre-writes significant implementation code. From F-5's spec:

// Example from FR-001 in spec.md
const event = createSessionStartEvent('hook.SessionStart', sessionId, {
  model: process.env.CLAUDE_MODEL || 'unknown',
  working_dir: process.cwd()
});
logEvent(event);

This explains why IMPLEMENT takes only 7% of the time: the spec already contains the code.

The Trade-off

┌─────────────────────────────────────────────────────────────┐
│                                                             │
│  Heavy SPECIFY (40%)  ───────────►  Light IMPLEMENT (7%)   │
│                                                             │
│  • Thorough specs                  • Mostly copy-paste      │
│  • Code examples included          • Logic already solved   │
│  • Acceptance criteria clear       • Just wire it up        │
│                                                             │
└─────────────────────────────────────────────────────────────┘

SPECIFY is doing double duty:

  • Traditional requirements gathering
  • Partial implementation (code examples)

This front-loading is intentional: it means IMPLEMENT becomes mechanical.


💡 Optimization Opportunities

  1. SPECIFY - Could be faster if spec templates are pre-populated
  2. COMPLETE - Lots of boilerplate (changelog, inventory, cleanup) could be automated
  3. IMPLEMENT - Already efficient, no optimization needed

📊 Data Sources

| Metric | Source | Method |
|---|---|---|
| Feature list & status | .specflow/features.db | SQLite query |
| Phase-level timing (SPECIFY 40%, etc.) | Git commit timestamps | Timestamp diff between commits |
| SPECIFY section breakdown | spec.md line counts | Lines per section ÷ total lines |
| Estimated remaining time | Completed feature average | Total elapsed ÷ completed features × remaining |

Analysis generated from signal-agent-1 and signal-agent-2 worktrees.
Data queried from .specflow/features.db SQLite databases.
SpecFlow Development Playbook via Maestro.

@mellanon
Author

I had Claude analyse whether this could have been done in parallel, and how long it would have taken if so.

It calculated a dependency graph and figured the work could have been done in 4h instead of 11h if run in parallel. I'm not sure how accurate the dependency graph would have been if it had been drawn up upfront.
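
To illustrate the kind of calculation involved: with unlimited parallel workers, the best possible wall-clock time is the longest dependency chain (the critical path), not the sum of all features. The sketch below is not the graph Claude produced. The durations come from the status table, but every `dependsOn` edge is hypothetical, since the real dependency graph is not given here.

```typescript
// Durations in minutes are from the status table; the dependency edges are
// HYPOTHETICAL and exist only to illustrate the technique.
interface FeatureNode {
  minutes: number;
  dependsOn: string[];
}

const featureGraph: Record<string, FeatureNode> = {
  "F-1": { minutes: 29, dependsOn: [] },
  "F-2": { minutes: 28, dependsOn: ["F-1"] },
  "F-3": { minutes: 120, dependsOn: ["F-2"] },
  "F-4": { minutes: 120, dependsOn: ["F-2"] },
  "F-5": { minutes: 60, dependsOn: ["F-2"] },
};

// Earliest finish of a feature = its own duration plus the latest finish
// among its dependencies. Memoised; assumes the graph has no cycles.
function earliestFinish(
  id: string,
  nodes: Record<string, FeatureNode>,
  memo: Map<string, number>,
): number {
  const cached = memo.get(id);
  if (cached !== undefined) return cached;
  const node = nodes[id];
  if (node === undefined) throw new Error(`unknown feature: ${id}`);
  let latestDependency = 0;
  for (const dep of node.dependsOn) {
    latestDependency = Math.max(latestDependency, earliestFinish(dep, nodes, memo));
  }
  const finish = latestDependency + node.minutes;
  memo.set(id, finish);
  return finish;
}

// Lower bound on parallel wall-clock time: the longest dependency chain.
function criticalPathMinutes(nodes: Record<string, FeatureNode>): number {
  const memo = new Map<string, number>();
  return Math.max(...Object.keys(nodes).map((id) => earliestFinish(id, nodes, memo)));
}

const serialMinutes = Object.keys(featureGraph)
  .reduce((acc, id) => acc + featureGraph[id].minutes, 0);

console.log(serialMinutes);                     // 357
console.log(criticalPathMinutes(featureGraph)); // 177
```

The result is only as good as the edges: a dependency missed upfront shortens the computed critical path and makes the parallel estimate look better than it would be in practice.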
