# Implement Sprint
Automated sprint implementation with context discipline and architectural quality. Runs all phases without stopping, then presents the results for approval.
## Prerequisites
- Sprint spec exists at `docs/sprints/current/spec.md`
- Sprint state at `docs/sprints/current/state.yaml`
- Contracts are fully defined (no ambiguity)
- Clean git state (commit or stash any work first)
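A quick preflight along these lines can confirm the prerequisites before launching anything (a sketch, assuming the paths above):
```bash
# Preflight sketch: verify the sprint files exist and the working tree is clean
test -f docs/sprints/current/spec.md    || { echo "missing spec.md"; exit 1; }
test -f docs/sprints/current/state.yaml || { echo "missing state.yaml"; exit 1; }
# git status --porcelain prints nothing when there is no staged, unstaged, or untracked work
[ -z "$(git status --porcelain)" ]      || { echo "working tree not clean"; exit 1; }
```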
## Context Budget Rules
These rules prevent context exhaustion. Follow them exactly.
### Rule 1: Orchestrator Reads Minimally
You (the orchestrator) read ONLY these files:
| File | Purpose |
|------|---------|
| `docs/sprints/current/spec.md` | Sprint spec (required) |
| `docs/sprints/current/state.yaml` | Phase status (required) |
**Do NOT read source files, architecture docs, config models, or any `.py` file.** The implementer agent reads what it needs. Your job is orchestration, not comprehension.
### Rule 2: Subagent Prompts Are Brief
Pass to the implementer agent:
- Phase number and title (from spec)
- Sprint spec path: `docs/sprints/current/spec.md`
- One sentence summarizing the phase goal
- The quality rules block (standardized, see template below)
**Do NOT paste code, file contents, contracts, or implementation details into prompts.** The implementer reads the spec itself.
### Rule 3: TaskOutput — Never Double-Read
When calling TaskOutput for a subagent:
- Use `timeout: 600000` (10 minutes) on the FIRST call
- If it times out, call TaskOutput again with `timeout: 600000`
- **NEVER call TaskOutput on a subagent that already returned a result** — even partial
One successful TaskOutput read per subagent; a timeout retry is the only case where you call it again.
### Rule 4: No Accumulation Between Phases
After committing a phase, you have the git history as your record. Do NOT retain mental summaries of what each phase did. The commit messages and state.yaml track progress.
## Automation Model
```
/implement-sprint
                  |
+------------------------------------+
| AUTOMATED (no human intervention)  |
|                                    |
| For each phase:                    |
|   1. Implement (implementer agent) |
|   2. Run pre-commit                |
|   3. Run tests                     |
|   4. Review (reviewer agent)       |
|   5. Run demo                      |
|   6. Analyze data (if applicable)  |
|   7. Commit phase                  |
|                                    |
| After all phases:                  |
|   8. Run review-sprint             |
|   9. Run all demos twice           |
|   10. Run completion checks        |
+------------------------------------+
                  |
+------------------------------------+
| PRESENT TO USER                    |
|                                    |
| - Summary of what was built        |
| - Test results + coverage          |
| - Review report (issues found)     |
| - Demo output samples (both runs)  |
+------------------------------------+
                  |
            User Decision
                  |
+-------------+-------------+------------------+
|   ACCEPT    |     FIX     |      RESET       |
|             |             |                  |
| Keep        | Address     | git reset to     |
| commits     | issues,     | pre-sprint HEAD  |
|             | amend/fix   | Try again        |
+-------------+-------------+------------------+
```
**Key principle:** Commit after each phase for clean git history. Reset reverts all phase commits if needed.
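One low-cost way to make RESET unambiguous is to record the pre-sprint commit before the first phase begins (a sketch; the process does not say where, or whether, to store this):
```bash
# Record the pre-sprint HEAD so a later reset has an exact target (illustrative path)
git rev-parse HEAD > /tmp/pre-sprint-head.txt
```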
## Process Details
### Phase Implementation (Automated)
For each phase in the sprint:
#### Step 1: Implement
Launch the **implementer** agent. Your prompt MUST follow this template exactly:
```
Implement Phase {N}: {Title} for the current sprint.
Sprint spec: docs/sprints/current/spec.md
Read the spec, focus on Phase {N}. Read source files as needed.
QUALITY RULES (mandatory):
- Decompose into module-level functions, not inner closures — helpers must be independently testable
- If two modules share logic, extract it into a shared module (DRY)
- Check every success criterion in the spec — missing deliverables (examples, configs) count as failures
- Use TYPE_CHECKING for type-only imports; keep runtime imports minimal
- Remove stale comments (sprint changelog comments, "# Future:", scaffolding markers)
- No code duplication — if you copy-paste and modify, extract a shared function instead
OUTPUT RULES (mandatory):
- Final response MUST be under 2000 characters
- List files modified/created (paths only, no contents)
- Test result: pass count and fail count
- One-line summary per file change
- Do NOT include code snippets, stack traces, or implementation details
- Do NOT re-read files you have already read
- If a demo fails after 3 attempts, report the error and stop
ITERATION LIMITS:
- Max 3 attempts to fix any single failing test
- Max 3 attempts to fix a demo script
- Never read the same file more than twice
```
**Do NOT add anything else to the prompt.** No file contents, no contracts, no code patterns.
Use `run_in_background: true`. Then call TaskOutput with `timeout: 600000`.
#### Step 2: Pre-commit
```bash
pre-commit run --all-files
```
**If pre-commit fails:** Fix issues yourself (these are usually formatting) and re-run. Do not launch a subagent for this.
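For the common case where hooks auto-fix formatting on the first pass, a second run usually suffices (a sketch, not a required step):
```bash
# First run may modify files (formatters); stage the fixes and run the hooks again
pre-commit run --all-files || { git add -A && pre-commit run --all-files; }
```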
#### Step 3: Tests
```bash
python -m pytest --cov=src/fabulexa --cov-fail-under=85
uv pip install -e .
```
**If tests fail:** Stop automation, report failure to user.
#### Step 4: Review
Launch the **reviewer** agent. Brief prompt:
```
Review Phase {N}: {Title} of the current sprint.
Sprint spec: docs/sprints/current/spec.md
Focus on Phase {N} only.
Check the git diff for this phase:
git diff HEAD~1
QUALITY CHECKS (in addition to standard review):
- Are helpers module-level functions (not closures/nested defs)?
- Any code duplication over 10 lines?
- Any success criteria from the spec not delivered?
- Any runtime imports that should be TYPE_CHECKING only?
- Any stale sprint comments left in modified files?
OUTPUT RULES (mandatory):
- Final response MUST be under 1500 characters
- Use the APPROVED or REVISIONS NEEDED format from your instructions
- List only violations with file:line, not explanations of principles
```
Use `run_in_background: false` (reviewer is fast, no need for background).
**If revisions needed:** Fix issues yourself using the reviewer's file:line references. Re-run gates. Re-review. Limit: 3 attempts per phase.
#### Step 5: Run Demo
```bash
python docs/sprints/current/demos/phase_N_*.py
```
Demo must run without errors and exit 0.
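If a phase ships more than one demo script, a loop keeps the exit-code check explicit (a sketch; `N` stands for the phase number):
```bash
# Run every demo for this phase and stop on the first non-zero exit
for demo in docs/sprints/current/demos/phase_"${N}"_*.py; do
    python "$demo" || { echo "demo failed: $demo"; exit 1; }
done
```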
#### Step 6: Analyze Data (If Applicable)
If the demo produces simulation output, launch the **data-analyst** agent:
```
Analyze output from Phase {N} demo.
Demo script: docs/sprints/current/demos/phase_{N}_*.py
Run the standard validation workflow from your instructions.
OUTPUT RULES (mandatory):
- Final response MUST be under 1500 chars
- Use VALIDATION: PASS or VALIDATION: ISSUES FOUND format
```
#### Step 7: Commit Phase
Update `state.yaml` to mark the phase complete.
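The `state.yaml` schema is not defined here, so treat the key path below as an assumption; with mikefarah's `yq`, the update might look like:
```bash
# Hypothetical example: mark phase 3 (list entry 2) complete in state.yaml
yq -i '.phases[2].status = "complete"' docs/sprints/current/state.yaml
```
Then commit the phase: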
```bash
git add -A
git commit -m "$(cat <<'EOF'
Sprint [Name] - Phase N: [Title]
- [What this phase implemented]
- Tests: PASS
- Pre-commit: PASS
Co-Authored-By: Claude <noreply@anthropic.com>
EOF
)"
```
### Post-Implementation (Automated)
After all phases complete:
#### Step 8: Review Sprint
Run `/review-sprint` for comprehensive fresh-eyes audit.
#### Step 9: Run All Demos Twice
```bash
for demo in docs/sprints/current/demos/phase_*.py; do   # first run
    python "$demo"
done
for demo in docs/sprints/current/demos/phase_*.py; do   # second run
    python "$demo"
done
```
Both runs must produce consistent output.
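To make "consistent" checkable rather than eyeballed, one option is to capture both runs and diff them (a sketch; demos that print timestamps or unseeded random values would need normalizing first):
```bash
# Capture each demo's output from both runs and compare
for demo in docs/sprints/current/demos/phase_*.py; do
    name="$(basename "$demo" .py)"
    python "$demo" > "/tmp/${name}.run1"
    python "$demo" > "/tmp/${name}.run2"
    diff "/tmp/${name}.run1" "/tmp/${name}.run2" || echo "inconsistent output: $demo"
done
```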
#### Step 10: Completion Checks
```bash
pre-commit run --all-files            # all hooks must pass
git status --porcelain | grep "^??"   # should print nothing: no untracked files
grep -rn "# Future:" src/             # should print nothing: no deferred-work markers
grep -rn "pass$" src/                 # should print nothing: no leftover stub bodies
```
### Present to User
Show:
1. **Summary:** What was built, files changed
2. **Tests:** Pass/fail count, coverage percentage
3. **Review report:** Issues found
4. **Demos:** Sample output proving it works
### User Decision
#### ACCEPT
Commits already made per-phase. Update state.yaml and archive sprint.
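Archiving might look like the following (a sketch; the archive location and naming convention are assumptions, not defined by this process):
```bash
# Hypothetical layout: move the finished sprint out of current/ and keep a dated copy
mkdir -p docs/sprints/archive
mv docs/sprints/current "docs/sprints/archive/$(date +%Y-%m-%d)-sprint-name"
```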
#### FIX
1. Fix the specific issues
2. Run pre-commit
3. Commit the fix
4. Re-run review-sprint
5. Present again
#### RESET
```bash
git log --oneline -10                  # locate the pre-sprint commit
git reset --hard <pre-sprint-commit>   # drop all phase commits
git clean -fd                          # remove untracked files and directories
```
## Context Budget Summary
| What | Budget |
|------|--------|
| Orchestrator file reads | 2 files (spec + state.yaml) |
| Implementer prompt | Template only, no pasted content |
| Reviewer prompt | Template only, no pasted content |
| Data-analyst prompt | Template only, no pasted content |
| TaskOutput calls per subagent | Exactly 1 |
| TaskOutput timeout | 600000ms (10 min) |
| Implementer final response | < 2000 chars |
| Reviewer final response | < 1500 chars |
| Data-analyst final response | < 1500 chars |
## Agent Summary
| Agent | Role | When | Background |
|-------|------|------|------------|
| **implementer** | Write code, tests, demos | Each phase | Yes (600s timeout) |
| **reviewer** | Check quality, find issues | After each phase | No (foreground) |
| **data-analyst** | Verify output realism | After demos with output | No (foreground) |
## Failure Modes
**Gate fails:** Report which gate, what error. User decides: fix or reset.
**Reviewer rejects 3 times:** Report the persistent issue. User decides: fix manually or reset.
**Demo crashes:** Report stack trace. User decides: fix or reset.
**Data analyst finds anomalies:** Report findings. User decides: acceptable or not.
## Tips
**For clean runs:**
- Ensure spec is unambiguous before starting
- Use a dedicated branch for risky sprints
- Keep sprints small (easier to reset)
**If you keep resetting:**
- The spec may be flawed (ambiguous, contradictory)
- Consider running `/plan-sprint` again
- Or manually implement the tricky part first