`/ralph-clarify` command for OpenCode — Inspired by the Ralph Wiggum technique by Geoffrey Huntley (@geoffreyhuntley). Drop this file in `.opencode/command/` to enable comprehensive requirements discovery with codebase-aware recommendations, dependency-ordered questions, and intelligent change propagation.
---
description: Comprehensive requirements discovery with smart codebase analysis, dependency-ordered questions, and change propagation
argument-hint: "topic to clarify" [--max-iterations N]
---

# Ralph Clarify: Smart Discovery Loop

Efficient requirements gathering with:

  • One-time upfront codebase analysis
  • Dependency-ordered questions (foundational first, dependent later)
  • Change propagation (update downstream decisions when you change your mind)

## Parse Arguments

Arguments: $ARGUMENTS

Split arguments:

  • TOPIC: Everything before any `--` flag (what we're clarifying)
  • `--max-iterations`: Number (default: 30)
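
A minimal sketch of this split (illustrative only — the parsing happens inside the agent, and the helper below is hypothetical, not part of the command file):

```ts
// Hypothetical helper showing the intended split rule.
interface ClarifyArgs {
  topic: string;
  maxIterations: number;
}

function parseClarifyArgs(raw: string): ClarifyArgs {
  // Split on whitespace that precedes a "--flag". The lookahead requires a
  // letter after "--" so the "--- TRANSCRIPT ---" delimiter is not matched.
  const [topicPart, ...flags] = raw.split(/\s+(?=--[a-z])/);
  let maxIterations = 30; // default per the spec above
  for (const flag of flags) {
    const m = flag.match(/^--max-iterations\s+(\d+)/);
    if (m) maxIterations = Number(m[1]);
  }
  return { topic: topicPart.replace(/^"|"$/g, "").trim(), maxIterations };
}

// parseClarifyArgs('"bulk scheduling v2" --max-iterations 20')
// => { topic: "bulk scheduling v2", maxIterations: 20 }
```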

## Optional: Provide Call Transcript Context

If you have a Granola transcript (or any meeting notes) from stakeholder calls, paste it after the command:

```
/ralph-clarify "bulk scheduling v2"

--- TRANSCRIPT ---
[Paste Granola transcript here]
```

The agent will:

  1. Parse the transcript for decisions, requirements, and constraints
  2. Pre-populate answers from the transcript
  3. Cite the exact quote and speaker when referencing transcript content

### Transcript-Backed Answers

When a question was already discussed in a call, the agent will show:

```markdown
**Q4: Should we support recurring schedules?** [Depends on: D1]
- A) Yes, full recurrence (daily/weekly/monthly)
- B) Yes, simple (repeat X times)
- C) No, one-time only for MVP
- D) Other

> **Answered in call** — Tom mentioned: "Let's keep it simple for v1, 
  just one-time scheduling. We can add recurring later based on demand."
> 
> **Suggested: C** based on transcript

Accept C, or override?
```

This gives you:

  • Confidence: Decision backed by stakeholder quote
  • Audit trail: Know who said what and when
  • Speed: Skip re-discussing decided items

## Setup

### 1. Create Feature Directory

Create a folder for this feature under `prds/`:

```bash
mkdir -p prds/{feature-slug}/
```

Example: `prds/bulk-scheduling-v2/`
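
One plausible way to derive `{feature-slug}` from the topic (a sketch — the actual slugging is left to the agent's judgment):

```ts
// Illustrative slug derivation: "Bulk Scheduling v2" -> "bulk-scheduling-v2"
function toFeatureSlug(topic: string): string {
  return topic
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse non-alphanumerics into hyphens
    .replace(/^-+|-+$/g, "");    // trim stray leading/trailing hyphens
}
```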

### 2. Create Discovery File

Create the discovery file at `prds/{feature-slug}/clarify-session.md`:

```markdown
# Discovery: {TOPIC}

Started: {timestamp}

## Call Transcript Context (if provided)

<!-- Extracted decisions and quotes from Granola/meeting transcripts -->

### Key Decisions from Calls
| Topic | Decision | Speaker | Quote |
|-------|----------|---------|-------|
| Recurring schedules | No for MVP | Tom | "Let's keep it simple for v1..." |
| Auth approach | Use existing JWT | Sarah | "We should reuse our current auth..." |

### Unresolved from Calls
(Topics discussed but not decided - still need clarification)

---

## Codebase Context (Cached)

<!-- Populated ONCE at start by reconnaissance phase -->

### Tech Stack
(Framework, ORM, styling, state management, etc.)

### Architectural Patterns
(Data flow, API structure, component patterns)

### Relevant Existing Features
(Similar implementations we can learn from)

### Conventions
(Naming, file structure, testing patterns)

---

## Decision Log

<!-- Each decision with its dependencies -->

| ID | Question | Answer | Depends On | Impacts |
|----|----------|--------|------------|---------|
| D1 | ... | ... | - | D3, D5 |
| D2 | ... | ... | - | D4 |
| D3 | ... | ... | D1 | D7 |

## Pending Questions

<!-- Ordered by dependency - can't ask until prerequisites answered -->

## Change History

<!-- Track when foundational decisions change and what got updated -->

## Final Requirements

(Synthesized from all decisions)
```

## Phase -1: Process Transcript (if provided)

If user pasted a Granola transcript or meeting notes, process it FIRST:

### Extract from Transcript

  1. Identify speakers - Who was in the call?
  2. Extract decisions - What was explicitly decided?
  3. Extract requirements - What constraints or must-haves were mentioned?
  4. Extract open questions - What was discussed but NOT decided?
  5. Note exact quotes - Preserve verbatim quotes with speaker attribution
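
If it helps to think of the extraction as a data shape, something like this (hypothetical — the agent writes markdown, not JSON) captures the five steps:

```ts
// Hypothetical record shapes mirroring the five extraction steps above.
interface TranscriptDecision {
  topic: string;    // e.g. "MVP scope"
  decision: string; // e.g. "One-time scheduling only"
  speaker: string;  // attribution (steps 1 and 5)
  quote: string;    // verbatim quote, preserved for the audit trail
}

interface TranscriptExtract {
  speakers: string[];              // step 1: who was in the call
  decisions: TranscriptDecision[]; // step 2: explicitly decided items
  requirements: string[];          // step 3: constraints and must-haves
  openQuestions: string[];         // step 4: discussed but NOT decided
}
```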

### Populate Session File

Add extracted info to "Call Transcript Context" section:

```markdown
## Call Transcript Context

### Key Decisions from Calls
| Topic | Decision | Speaker | Quote |
|-------|----------|---------|-------|
| MVP scope | One-time scheduling only | Tom | "Let's keep it simple for v1, just one-time scheduling" |
| Auth | Reuse existing JWT | Sarah | "We should definitely reuse our current auth system" |
| UI pattern | Use modal wizard | Luca | "Multi-step modal, max 3 things per screen" |

### Unresolved from Calls
- Data retention policy - discussed but no decision
- Whether to support draft posts - mentioned but tabled
```

### During Questioning

When you reach a question covered by the transcript:

```markdown
**Q4: Should we support recurring schedules?** [Depends on: D1]
- A) Yes, full recurrence
- B) Yes, simple repeat
- C) No, one-time only for MVP  
- D) Other

> **Answered in call** — Tom mentioned: "Let's keep it simple for v1, 
  just one-time scheduling. We can add recurring later based on demand."
> 
> **Suggested: C** based on transcript

Accept C, or override with a different choice?
```

DO NOT skip the question entirely - always give the user a chance to override if context has changed since the call.


## Phase 0: Reconnaissance (ONE TIME ONLY)

CRITICAL: Do this ONCE at session start, NOT per question.

Fire these 6 agents IN PARALLEL, wait for ALL to complete, then cache results:

```js
// Fire all at once - this is your ONE codebase analysis burst
background_task(agent="explore", prompt=`
  Analyze the overall architecture for a feature about: {TOPIC}
  
  Return:
  1. Tech stack (framework, language, ORM, UI library)
  2. Project structure (where do features live?)
  3. Data flow pattern (how do API routes connect to DB?)
  4. Key directories for this type of feature
`)

background_task(agent="explore", prompt=`
  Find existing features SIMILAR to: {TOPIC}
  
  Return:
  1. List of 2-3 similar features already built
  2. What patterns do they use?
  3. What utilities/helpers do they share?
  4. Any gotchas or technical debt noted in comments?
`)

background_task(agent="explore", prompt=`
  Analyze the UI/UX patterns in this codebase:
  
  Return:
  1. Component library used (shadcn, radix, custom?)
  2. Styling approach (tailwind, css modules, styled-components?)
  3. Common UI patterns (modals, forms, lists)
  4. State management approach (context, zustand, redux?)
`)

background_task(agent="explore", prompt=`
  Analyze the backend/API patterns:
  
  Return:
  1. API style (REST, GraphQL, tRPC?)
  2. Auth pattern (JWT, session, OAuth?)
  3. Database access pattern (raw SQL, ORM, which one?)
  4. Background job handling (if any)
`)

background_task(agent="explore", prompt=`
  Analyze caching patterns in this codebase:
  
  Return:
  1. Client-side caching (React Query, SWR, manual?)
  2. Server-side caching (Redis, in-memory, CDN?)
  3. Cache invalidation patterns used
  4. Stale-while-revalidate usage?
  5. Any TTL conventions?
`)

background_task(agent="explore", prompt=`
  Analyze security patterns in this codebase:
  
  Return:
  1. Input validation approach (zod, yup, manual?)
  2. Authorization patterns (middleware, per-route, RBAC?)
  3. Rate limiting implementation (if any)
  4. Sensitive data handling patterns
  5. CSRF/XSS protections in place
`)
```

Store ALL results in "Codebase Context (Cached)" section. Reference this cache for ALL subsequent questions - DO NOT fire more agents unless you hit a completely new area not covered.

Total reconnaissance: 6 parallel agent calls. This covers 95% of questions you'll ask.
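
Conceptually, the burst is one parallel dispatch, not six sequential calls. A sketch, assuming `background_task` returns a promise (an assumption about the runtime — adapt to your agent framework):

```ts
// Assumed runtime API — not a confirmed OpenCode signature.
declare function background_task(opts: { agent: string; prompt: string }): Promise<string>;

// The six prompt strings above, collected into a single Promise.all burst.
const prompts = [
  architecturePrompt,
  similarFeaturesPrompt,
  uiPatternsPrompt,
  backendPatternsPrompt,
  cachingPatternsPrompt,
  securityPatternsPrompt,
];

const results = await Promise.all(
  prompts.map((prompt) => background_task({ agent: "explore", prompt }))
);

// Cache once; all later questions read from this instead of re-firing agents.
session.codebaseContext = results;
```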

## Phase 1: Question Planning

Before asking ANY questions, plan the full question tree:

### 1. Identify Question Categories for {TOPIC}

MANDATORY: Cover ALL of these categories. Don't skip any.

#### A. Core Scope & Purpose

  • What problem does this solve? Why now?
  • What's the MVP vs full vision?
  • What's explicitly OUT of scope?
  • Success metrics - how do we know it's working?

#### B. Users & Access

  • Who are the primary users? Secondary?
  • User skill level / technical sophistication?
  • Access control - who can see/do what?
  • Multi-tenancy considerations?

#### C. Data & Storage

  • What data entities are involved?
  • New tables or extend existing?
  • Data relationships and foreign keys?
  • Data retention - how long do we keep it?
  • Soft delete vs hard delete?
  • Audit trail requirements?

#### D. Security (CRITICAL)

  • Authentication requirements for this feature?
  • Authorization - role-based, resource-based, or both?
  • Input validation - what can go wrong?
  • SQL injection, XSS, CSRF considerations?
  • Sensitive data handling (PII, credentials)?
  • Rate limiting needed?
  • API key / token exposure risks?

#### E. Caching - Performance

  • What queries will be expensive?
  • Cache at which layer? (DB, API, CDN, client)
  • Cache invalidation strategy?
  • TTL for different data types?
  • Warm cache vs cold cache behavior?

#### F. Caching - UX & Data Freshness

  • Which screens need real-time fresh data?
  • Which screens can tolerate stale data? How stale?
  • Optimistic updates - show before confirmed?
  • Loading states vs skeleton screens vs stale-while-revalidate?
  • Offline support needed?
  • What happens when cached data conflicts with server?

#### G. Performance & Scaling

  • Expected data volume? (rows, requests/sec)
  • Read-heavy or write-heavy?
  • Need for pagination? Cursor vs offset?
  • Database indexing strategy?
  • N+1 query risks?
  • Background job considerations?
  • Horizontal scaling implications?

#### H. Edge Cases & Error Handling

  • What if the user is offline?
  • What if the request times out?
  • What if data is partially saved?
  • Concurrent edit conflicts?
  • Race conditions?
  • Retry logic - idempotency?
  • Graceful degradation when dependencies fail?

#### I. Integration Points

  • Which existing systems does this touch?
  • External APIs involved?
  • Webhook requirements (inbound/outbound)?
  • Event bus / message queue needs?
  • Third-party service failure handling?

#### J. UI/UX Patterns

  • Modal, page, or drawer?
  • Mobile-first considerations?
  • Accessibility requirements?
  • Loading/empty/error states?
  • Undo/redo support?
  • Keyboard shortcuts?

#### K. Testing & Observability

  • Unit test coverage expectations?
  • Integration test needs?
  • E2E test scenarios?
  • Logging requirements?
  • Metrics to track?
  • Alerting thresholds?

#### L. Deployment & Rollout

  • Feature flag needed?
  • Gradual rollout strategy?
  • Rollback plan if it breaks?
  • Database migration risks?
  • Backward compatibility requirements?

#### M. Future Considerations

  • What's the next iteration likely to need?
  • Are we painting ourselves into a corner?
  • Technical debt we're knowingly accepting?

### 2. Build Dependency Graph

For each question, identify:

  • Depends On: Which questions must be answered first?
  • Impacts: Which questions' recommendations will change based on this answer?

Example dependency graph:

D1: "What's the core purpose?" (foundational - no deps)
  └─> D4: "What data do we need to store?" (depends on D1)
      └─> D7: "Should we use existing tables or new ones?" (depends on D4)
      
D2: "Who are the users?" (foundational - no deps)
  └─> D5: "What permissions model?" (depends on D2)
  
D3: "MVP or full feature?" (foundational - no deps)
  └─> D6: "Include edge case X?" (depends on D3)
  └─> D8: "Support feature Y?" (depends on D3)
```

### 3. Create Prioritized Queue

Order questions:

  1. Foundational (no dependencies) - ask first
  2. Dependent (has prerequisites) - ask after deps satisfied
  3. Nice-to-have (optional details) - ask last

Store this in "Pending Questions" section with their dependency IDs.
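
Mechanically, this ordering is a topological sort of the question graph. A minimal sketch, assuming each question carries its `dependsOn` and `impacts` lists:

```ts
// Illustrative dependency-ordered queue (Kahn-style topological sort).
interface Question {
  id: string;          // e.g. "D4"
  dependsOn: string[]; // prerequisites, e.g. ["D1"]
  impacts: string[];   // downstream decisions, e.g. ["D7"]
}

function orderQuestions(questions: Question[]): Question[] {
  const remaining = new Map(questions.map((q) => [q.id, q]));
  const ordered: Question[] = [];
  while (remaining.size > 0) {
    // A question is "ready" once none of its prerequisites are still pending.
    const ready = [...remaining.values()].filter((q) =>
      q.dependsOn.every((dep) => !remaining.has(dep))
    );
    if (ready.length === 0) throw new Error("Cycle in question dependencies");
    for (const q of ready) {
      ordered.push(q);
      remaining.delete(q.id);
    }
  }
  return ordered; // foundational questions first, dependents after
}
```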

## Phase 2: Iterative Questioning

### Each Iteration

  1. Read prds/{feature-slug}/clarify-session.md - check Decision Log and Pending Questions
  2. Find the NEXT question(s) whose dependencies are ALL satisfied
  3. Ask 2-4 questions (batch questions at the same dependency level)
  4. For each question, provide recommendation FROM CACHED CONTEXT:
```markdown
**Q3: How should we store the scheduling data?** [Depends on: D1]
- A) New database table with Drizzle schema
- B) JSON field on existing Posts table
- C) External service
- D) Other

> **Recommended: A** — From cached analysis: Your codebase uses Drizzle 
  consistently (`lib/db/schema.ts`). Similar feature "content-calendar" 
  uses dedicated tables. Option B would violate the normalized pattern.

[Impacts: D7, D12 - will affect storage queries and migration strategy]
```
  5. STOP and wait for user response
  6. After user answers:
    • Update Decision Log with answer
    • Check if answer differs significantly from recommendation
    • If foundational decision changed → trigger change propagation (see below)
    • Mark dependent questions as "ready" in Pending Questions

### When to Fire ADDITIONAL Agents (Rare)

Only fire new agents if user's answer takes you into UNEXPLORED territory:

```js
// User picked option D (Other) with something unexpected
// OR user wants to integrate with a service not in your cached analysis

background_task(agent="explore", prompt=`
  The user wants to use [UNEXPECTED_CHOICE] for {TOPIC}.
  Find any existing usage of this in the codebase, or similar patterns
  we could adapt. This is NEW territory not covered in initial analysis.
`)
```

Rule: If you can answer from cached context, DO NOT fire agents.

## Phase 3: Change Propagation

When user changes their mind on a foundational decision:

### Detect Significant Changes

A change is "significant" if:

  • It's a foundational question (others depend on it)
  • The new answer contradicts the previous answer (not just a refinement)
  • It invalidates the reasoning for downstream decisions

### Propagation Protocol

  1. Identify Affected Decisions (the transitive walk is sketched after this list)

    User changed D1 from "A" to "C"
    
    Checking impact chain:
    - D4 depends on D1 → NEEDS REVIEW (recommendation was based on old D1)
    - D7 depends on D4 → NEEDS REVIEW (transitively affected)
    - D12 depends on D1 → NEEDS REVIEW
    
  2. Flag in Decision Log

    | ID | Question | Answer | Status |
    |----|----------|--------|--------|
    | D1 | Core purpose | C (CHANGED from A) | ✓ |
    | D4 | Data storage | B | ⚠️ NEEDS REVIEW - D1 changed |
    | D7 | Table design | A | ⚠️ NEEDS REVIEW - D4 affected |
    
  3. Present Review Batch

    You changed D1 (core purpose) from "internal tool" to "client-facing feature".
    
    This affects these previous decisions - let's review:
    
    **D4 (Data storage)** - Previously answered: B (JSON field)
    - Old reasoning: "For internal tools, JSON is fine for quick iteration"
    - New context: Client-facing features need proper schema for reliability
    - **New Recommendation: A** (dedicated table)
    - Keep B, or switch to A?
    
    **D7 (Table design)** - Previously answered: A (minimal fields)
    - Depends on D4, will re-ask after D4 is confirmed
    
  4. Log Changes

    ## Change History
    
    ### Change #1 (timestamp)
    - D1: "A" → "C" (user initiated)
    - D4: "B" → "A" (propagated, user confirmed)
    - D7: "A" → "B" (propagated, user confirmed)
    

## Question Format

```markdown
**Q[N]: [Question text]** [Depends on: D1, D3] [Impacts: D7, D8]
- A) [Option]
- B) [Option]
- C) [Option]  
- D) Other

> **Recommended: [X]** — [Reasoning from CACHED context, cite specific 
  files/patterns found in reconnaissance phase. Max 2 sentences.]
```

## Completion Criteria

Output `<promise>CLARIFIED</promise>` when:

  • All questions in dependency graph answered
  • No pending reviews from change propagation
  • Decision Log is complete and consistent
  • Final Requirements section synthesized
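
As a sanity check, completion reduces to a simple predicate over the Decision Log (a sketch — status strings match the table under "Decision Log Status Values" below):

```ts
// Illustrative completion check over decision statuses and pending questions.
type Status = "✓" | "TBD" | "BLOCKED" | "⚠️ REVIEW";

function isClarified(statuses: Status[], pendingCount: number): boolean {
  return pendingCount === 0 && statuses.every((s) => s === "✓");
}
```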

## Confirmation Message

```
Starting smart discovery for: {TOPIC}

Phase 0: Running one-time codebase reconnaissance...
[Firing 6 parallel agents - this is the ONLY agent burst]

⏳ Analyzing tech stack...
⏳ Finding similar features...
⏳ Checking UI/UX patterns...
⏳ Reviewing API patterns...
⏳ Analyzing caching patterns...
⏳ Reviewing security patterns...

✓ Codebase context cached. No more agent calls needed unless we hit new territory.

Phase 1: Building question dependency graph...
✓ Identified [N] questions across [M] categories
✓ Mapped dependencies - foundational questions first

Discovery file: prds/{feature-slug}/clarify-session.md

Commands:
- Answer with letters: "A" or "A, B" for multi-select
- Change your mind: "change D3 to B" - I'll propagate updates
- Stop early: "enough" or "done"

Beginning with foundational questions...
```

## Efficiency Summary

| Action | When | Cost |
|--------|------|------|
| Reconnaissance burst | Once at start | 6 agent calls |
| Per-question agents | Never (use cache) | 0 |
| New territory agents | Only if user picks unexpected path | 1 agent call |
| Change propagation | Only on foundational changes | 0 (uses existing data) |

Total typical session: 6-8 agent calls, vs. 80-100+ with a naive per-question approach.

## Workflow: Answer What You Can, Flag What You Can't

Not every question can be answered immediately. Engineers should:

### During the Session

  1. Answer confidently - Questions you know the answer to
  2. Mark as TBD - Questions needing team input:
    Q: What's the data retention policy?
    Answer: TBD - Need to check with compliance team
    
  3. Flag dependencies - Questions blocked on external decisions:
    Q: Which auth provider?
    Answer: BLOCKED - Waiting on security team's vendor review
    

### After the Session

  1. Review TBD items - Share with relevant stakeholders
  2. Update the session file - Fill in answers as you get them:
    # Edit the clarify session
    vim prds/{feature-slug}/clarify-session.md
    
    # Or re-run ralph-clarify to update interactively
    /ralph-clarify "{feature}" --continue
  3. Track completion - Session is "done" when no TBDs remain

## Decision Log Status Values

| Status | Meaning |
|--------|---------|
| ✓ | Decided and confirmed |
| TBD | Needs team input - flag who to ask |
| BLOCKED | Waiting on external decision |
| ⚠️ REVIEW | Affected by upstream change |

## File Organization

All clarify sessions live under prds/:

```
prds/
├── bulk-scheduling-v2/
│   ├── clarify-session.md     <- This command's output
│   └── prd.md                 <- PRD (written from clarify session)
│
├── analytics-dashboard/
│   ├── clarify-session.md
│   └── prd.md
│
└── content-calendar-v1-overhaul/
    ├── clarify-session.md
    └── prd.md
```

Workflow: `clarify-session.md` → `prd.md` → implementation

The clarify session captures exhaustive requirements. The PRD synthesizes them into an actionable spec (with summary at the top). No separate summary file needed.

## Coverage Checklist

Before outputting `<promise>CLARIFIED</promise>`, verify you've asked about:

  • Core scope & MVP definition
  • Users, access control, permissions
  • Data model & storage decisions
  • Security (auth, validation, rate limiting, sensitive data)
  • Caching - performance (what to cache, TTL, invalidation)
  • Caching - UX (fresh vs stale, optimistic updates, offline)
  • Scaling (volume, pagination, indexing, N+1)
  • Edge cases (offline, timeout, conflicts, retry)
  • Integration points & external dependencies
  • UI/UX patterns & states
  • Testing & observability
  • Deployment & rollout strategy
  • Future considerations & tech debt