You are building a knowledge asset for the team—capturing insights that save time for new developers ramping up AND experienced developers debugging at 2am.
Your goal is NOT to document every file. It's to capture the understanding that takes significant time to acquire and would otherwise be lost.
Capture two types of knowledge equally:
Strategic (the "why"):
- Why the technology stack was chosen over alternatives
- Why architectural decisions were made, and what was rejected
- Business rules and domain logic—and their origins
- Trade-offs that shaped the current design
- Historical context: "We tried X and it failed because Y"
Tactical (the "how" and "watch out"):
- Non-obvious behaviors and edge cases
- Gotchas and pitfalls that have bitten the team
- Patterns and conventions specific to this codebase
- Performance traps and how to avoid them
- Workarounds for framework/library quirks
For every insight, ask: "Would this save someone time?"
Do NOT capture:
- File listings or directory structures (that's what `ls` is for)
- Function signatures (that's what the code is for)
- Obvious things that are clear from reading the code
- Standard documentation that's easily searchable
- 1:1 mappings of files to engrams
- Generic "overview" or "architecture" engrams (see Super-Hub warning below)
Tools:
- `cog_remember(term, definition, keywords, long_term)` - Store a concept/insight. Use `long_term: true` during bootstrap. Always include `keywords`.
- `cog_recall(query)` - Search existing memories (check before adding)
- `cog_associate(source_id, target_id, predicate)` - Connect concepts
- `cog_connections(engram_id)` - See connections from a concept
During codebase exploration, the knowledge you capture is verified by reading the code—it's not speculative. Use the long_term: true flag to create permanent memories directly, and always include keywords for recall matching:
🧠 Recording to Cog...
cog_remember({
"term": "Why PostgreSQL Over MySQL",
"definition": "PostgreSQL was chosen for JSONB support in the billing domain.
MySQL was evaluated but its JSON type lacks indexing. This decision enables
storing flexible invoice line items without schema migrations.",
"keywords": ["postgresql", "mysql", "jsonb", "billing", "database", "schema"],
"long_term": true
})
Keywords are critical. They power cog_recall matching and enable automatic association discovery. Include 3-7 domain-specific terms per engram.
Before any exploration, understand codebase scale and identify the domain.
# Count source files (adjust extensions for your language)
find . -type f \( -name "*.ex" -o -name "*.exs" \) \
-not -path "*/_build/*" -not -path "*/deps/*" -not -path "*/node_modules/*" | wc -l
# Count lines of code
find . -type f \( -name "*.ex" -o -name "*.exs" \) \
-not -path "*/_build/*" -not -path "*/deps/*" -not -path "*/node_modules/*" \
-exec cat {} + | wc -l
# List top-level directories
ls -d */
Language-specific extensions:
| Language | Extensions |
|---|---|
| Elixir | *.ex, *.exs |
| Python | *.py |
| JavaScript/TypeScript | *.js, *.ts, *.jsx, *.tsx |
| Go | *.go |
| Rust | *.rs |
| C/C++ | *.c, *.h, *.cpp, *.hpp |
| Java | *.java |
| Ruby | *.rb |
| Zig | *.zig |
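For example, a minimal sketch of the same sizing commands adapted for a TypeScript project (the excluded directories are illustrative; adjust to your repository layout):
# Count TypeScript/JavaScript source files, then lines of code
find . -type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" \) \
  -not -path "*/node_modules/*" -not -path "*/dist/*" | wc -l
find . -type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" \) \
  -not -path "*/node_modules/*" -not -path "*/dist/*" -exec cat {} + | wc -l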
Before exploring code, establish context:
- What problem does this software solve?
- Who are the users/actors?
- What are the core domain concepts?
This domain awareness improves the quality of every engram you create. Terms grounded in the domain ("Billing Invoice Lifecycle") are far more useful than abstract ones ("State Machine").
## EXPLORATION SCOPE
Codebase: ___ files, ~___ LOC
Domain: [what this software does in 1 sentence]
Subsystems: [list from directory structure]
Progress tracking:
- Subsystems explored: 0/___
- Files read: 0
DO NOT proceed until you understand the domain and have listed subsystems.
You MUST complete discovery before recording any engrams. This prevents shallow coverage.
# Directory structure (subsystems)
find . -type d -not -path '*/\.*' -not -path '*/_build/*' -not -path '*/deps/*' -not -path '*/node_modules/*' -not -path '*/vendor/*' | head -50
# All source files grouped by directory
find . -type f -name "*.EXT" -not -path "*/_build/*" -not -path "*/deps/*" | sort | head -100
Create a subsystem checklist with file counts:
## SUBSYSTEM CHECKLIST
- [ ] <dir1>/ - ___ files (describe purpose after reading)
- [ ] <dir2>/ - ___ files
- [ ] <dir3>/ - ___ files
- [ ] <dir4>/ - ___ files
...
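A minimal sketch for filling in the file counts, assuming an Elixir-style lib/ layout (swap the directory and extension for your language):
# Count source files per subsystem directory (illustrative layout)
for dir in lib/*/; do
  count=$(find "$dir" -type f -name "*.ex" | wc -l)
  echo "$dir - $count files"
done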
Find and read (in order):
- README/docs - Project overview
- Main entry point - (application.ex, main.c, index.js, lib.rs, etc.)
- Public API surface - Exported functions/types
- Configuration - Build files, config schemas
# Universal patterns (always search)
grep -rn "TODO\|FIXME\|XXX\|HACK\|BUG\|NOTE\|WARNING\|CAREFUL" \
--include="*.EXT" -not -path "*/_build/*" -not -path "*/deps/*" | wc -lLanguage-specific patterns to search:
| Language | Patterns |
|---|---|
| Elixir | raise, rescue, # TODO, @doc false, defp.*# |
| Python | raise, except, # type:, # TODO, assert |
| JavaScript/TypeScript | throw, catch, @deprecated, // @ts-, console.warn |
| Go | panic, // +build, //go:, TODO, FIXME |
| Rust | unsafe, #[cfg(, todo!, unimplemented!, panic! |
| C/C++ | #ifdef, #if defined, assert(, // TODO, /* WARNING |
| Zig | @panic, unreachable, // TODO, @compileError |
Record the count—this estimates how many gotchas you should find.
DO NOT record engrams yet. Move to Phase 2.
Extract concepts at the BEHAVIOR level, not the MODULE level.
- Wrong (too broad): One concept summarizing everything a module does
- Right (behavior-level): Separate concepts for each distinct behavior, mechanism, or API
When code does multiple distinct things, create separate concepts for each:
- Different functions serving different purposes = separate concepts
- Different code paths handling different cases = separate behaviors
- Distinct configuration options that change behavior = separate concepts
- When code provides two+ ways to do something (e.g., sync vs async), create SEPARATE concepts with `contrasts_with` links
Ask: "How many distinct behaviors or mechanisms does this code contain?" Extract that many.
FOR each subsystem directory, iterate through ALL source files:
- LIST all source files in the subsystem
- READ each file (at least first 300 lines for large files)
- EXTRACT concepts at the behavior level from each file
- GREP for warning patterns in that directory specifically
- READ corresponding test file(s) if they exist
- RECORD engrams before moving to next subsystem
Do not skip files. Coverage gaps from skipped files are the #1 source of missing knowledge. If a subsystem has 15 source files, read all 15.
The best time to identify relationships is WHILE you're reading the code. You have full context — you can see that function A calls function B, that config C controls behavior D, that module E imports module F. This context is lost once you move to the next file.
When you extract multiple concepts from the same file or adjacent files in a subsystem:
- Identify which concepts depend on each other — Does one call, configure, or enable the other?
- Use `chain_to` or `associations` in `cog_remember` — Capture the relationship in the same call that creates the concept
- Be specific about WHY they're related — Not "they're in the same file" but "A calls B to validate input before processing"
This is your highest-confidence association signal. Relationships discovered while reading actual code are far more reliable than relationships inferred later from shared keywords. Most of your graph's associations should come from this step, not from Phase 5.
Tag each concept with one of these categories:
| Category | What it captures |
|---|---|
| `core-algorithm` | Main logic, algorithms, control flow |
| `data-structure` | Data structures and their operations |
| `api-surface` | Public interfaces, exports, entry points |
| `configuration` | Feature flags, settings, constants |
| `optimization` | Performance optimizations, caching, memoization |
| `error-handling` | Error management, recovery, validation |
| `state-management` | State tracking, transitions, lifecycle |
| `design-decision` | Why a particular approach was chosen, tradeoffs |
| `architecture` | System design, module boundaries, separation of concerns |
| `gotcha` | Non-obvious behavior, pitfalls, edge cases |
Include the category in your definition or as a keyword to aid later recall.
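For example (a hypothetical engram; note the category appearing as a keyword):
🧠 Recording to Cog...
cog_remember({
  "term": "Auth Token Refresh Race Condition",
  "definition": "Two browser tabs can refresh the same token concurrently; the second
  refresh invalidates the first and logs the user out. Mitigated by a short grace window
  in the token refresh function.",
  "keywords": ["auth", "token", "refresh", "race condition", "gotcha"],
  "long_term": true
})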
For EVERY engram, include:
- term - Descriptive name (2-5 words), qualified by subsystem context
- definition - What it is + how it works + WHY it exists (1-3 sentences). Include specific API details.
- keywords - 3-7 domain terms for semantic matching
- long_term - Always `true` during bootstrap
Always qualify terms with their subsystem context. Generic terms collide across subsystems and destroy recall precision.
| Bad (will collide) | Good (subsystem-qualified) |
|---|---|
| "State Management" | "Channel State Management" |
| "Error Handling" | "Billing Payment Error Recovery" |
| "Configuration" | "Auth Token Configuration" |
| "Validation" | "Invoice Line Item Validation Rules" |
Before creating any term, ask: "Would another subsystem produce a concept with this exact same name?" If yes, make it more specific.
- When a module defines public functions, NAME them in the definition (e.g., "defines `handle_in/3`, `handle_out/3`, and `terminate/2` callbacks" not just "defines several callbacks")
- When code emits or handles named events, include the exact event names
- When a function accepts specific options or configuration keys, mention the important ones
- Prefer concrete details: "returns a `{:ok, socket}` or `{:error, reason}` tuple" is better than "returns status information"
- Look for comments that explain WHY code is written a certain way
- Multi-line comments often contain architectural reasoning
- Comments with "because", "instead of", "tradeoff", "workaround", "note:" signal design decisions
- If a comment explains why an alternative was rejected, that's a `design-decision` concept
- Capture rules and constraints (e.g., "X must always be called before Y")
Example engrams with full metadata:
🧠 Recording to Cog...
cog_remember({
"term": "Why Billing Is a Separate Context",
"definition": "Billing was extracted from Accounts after coupling caused invoice
bugs. The separation enforces that billing logic never directly queries user
data—it receives what it needs via function parameters. Uses Billing.create_invoice/2
and Billing.process_payment/3 as the only public entry points.",
"keywords": ["billing", "accounts", "context boundary", "coupling", "invoice", "separation"],
"long_term": true,
"chain_to": [
{"term": "Billing Payment Error Recovery", "definition": "Payment failures use a 3-retry
strategy with exponential backoff via Oban jobs. After 3 failures, the invoice is marked
:payment_failed and an admin notification is sent.", "predicate": "enables"},
{"term": "Invoice Line Item Validation Rules", "definition": "Line items must have positive
amounts and valid tax codes. Validated in Billing.LineItem changeset, not at the controller
level.", "predicate": "contains"}
]
})
After exploring each subsystem, mark it complete:
## SUBSYSTEM CHECKLIST
- [x] lib/accounts/ - 8 files, 12 engrams (auth, permissions, user lifecycle)
- [x] lib/billing/ - 15 files, 22 engrams (payments, invoices, subscriptions)
- [ ] lib/messaging/ - 6 files
...
After subsystem exploration, explicitly search for patterns that span multiple subsystems.
| Concern | How to Find |
|---|---|
| Error handling | grep for error/exception patterns |
| Configuration | Config files, env vars, constants |
| Logging/observability | Log statements, metrics |
| Security | Auth, crypto, sanitization |
| Performance | Caching, pooling, batching |
| Concurrency | Locks, async, threads, channels |
| External integrations | API calls, DB queries, file I/O |
For each concern found:
- Understand the pattern used
- Record at least 1 engram explaining the approach
- Note any gotchas specific to this concern
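For example, a sketch for surfacing one concern — concurrency (the patterns are illustrative; pick terms that match your stack):
# Locate concurrency-related code (illustrative patterns; adjust for your language)
grep -rn "GenServer\|Task\.async\|Mutex\|WaitGroup\|asyncio\|tokio::spawn" \
  --include="*.EXT" --exclude-dir=_build --exclude-dir=deps --exclude-dir=node_modules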
# Find platform-specific or conditional code
grep -rn "#if\|#ifdef\|#elif\|cfg\[target\|process\.platform\|GOOS\|sys\.platform\|@tag" \
--include="*.EXT" -not -path "*/_build/*" -not -path "*/deps/*"If conditional code exists, you MUST explore each variant. Platform-specific bugs are hard to catch later.
Pick a core feature and trace it completely:
- Entry point (route/controller)
- Business logic layer
- Data access
- Side effects (jobs, notifications, etc.)
- Response formatting
Record insights about:
- How layers communicate
- Where validation happens
- How errors propagate
- What gets logged/monitored
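A hypothetical engram capturing one such trace (the feature and function names are illustrative):
🧠 Recording to Cog...
cog_remember({
  "term": "Invoice Creation Request Flow",
  "definition": "POST /invoices hits InvoiceController.create/2, which delegates to
  Billing.create_invoice/2 for validation and persistence, then enqueues a PDF-generation
  job. Errors surface as changeset errors; only the controller formats them for the API.",
  "keywords": ["invoice", "request flow", "controller", "billing", "background job", "validation"],
  "long_term": true,
  "associations": [
    {"target": "Invoice Line Item Validation Rules", "predicate": "requires"}
  ]
})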
Review the engrams you've created so far. Look for concepts across different subsystems that genuinely interact — shared keywords can be a signal, but you must verify the relationship is real before linking.
For each candidate pair of subsystems:
- Identify the signal — shared keywords, shared imports, or concepts that reference each other's domain
- Read the definitions — Do these concepts actually describe things that interact in the code? Or do they just happen to share a word?
- Verify in code — Can you point to specific code where subsystem A calls, configures, or depends on subsystem B?
- Create a bridging engram — Only if the relationship is real, describe HOW the pattern manifests in BOTH subsystems
- Link to existing concepts from both sides — Use `associations` to connect the bridge to specific concepts in each subsystem
The validation test: "If I removed subsystem A's concept, would subsystem B's concept break or behave differently?" If yes, the bridge is real. If no, it's just a keyword coincidence.
Example:
🧠 Recording to Cog...
cog_remember({
"term": "Shared Validation Pattern Across Contexts",
"definition": "Both Billing and Accounts use changeset-based validation with
custom validators in a shared Validators module. Billing adds monetary format
validation; Accounts adds email uniqueness. Both follow the rule that validation
happens in the context, never in controllers.",
"keywords": ["validation", "changeset", "billing", "accounts", "shared pattern"],
"long_term": true,
"associations": [
{"target": "Invoice Line Item Validation Rules", "predicate": "similar_to"},
{"target": "Account Email Uniqueness Check", "predicate": "similar_to"}
]
})
Actively search for non-obvious behavior.
# Find warnings and notes
grep -rn "TODO\|FIXME\|HACK\|NOTE\|WARNING\|XXX\|CAREFUL\|WORKAROUND" \
--include="*.EXT" -not -path "*/_build/*" -not -path "*/deps/*"For EACH result:
- Read the context (surrounding 10 lines)
- Determine if it's a gotcha worth recording
- If yes, create an engram with category `gotcha`
# Find skipped or pending tests
grep -rn "skip\|pending\|@tag :skip\|xit\|xdescribe\|test.skip" \
--include="*_test.*" --include="*.test.*" --include="*_spec.*"Skipped tests often document known issues or edge cases.
- Race conditions or timing issues
- Order-dependent operations
- Implicit dependencies
- Cache invalidation triggers
- State machine edge cases
- Migration gotchas
- Environment-specific behavior
- Third-party API quirks
- Performance cliffs
- "Everyone knows" facts that aren't written down
Example gotcha engrams:
🧠 Recording to Cog...
cog_remember({
"term": "current_scope vs current_account Gotcha",
"definition": "Always use current_scope.account, not current_account. The
latter doesn't exist and fails silently. Common mistake for new developers.",
"keywords": ["current_scope", "current_account", "authentication", "gotcha", "silent failure"],
"long_term": true
})
cog_remember({
"term": "Stripe Webhook Idempotency Requirement",
"definition": "Stripe may send the same webhook multiple times. Always check
if the event was already processed using the event ID in WebhookHandler.process/1
before taking action. Failure to do so caused duplicate charges in March 2024.",
"keywords": ["stripe", "webhook", "idempotency", "duplicate", "event_id", "billing"],
"long_term": true
})
The value is in the CONNECTIONS. But false connections are worse than missing ones — they pollute spreading activation and produce irrelevant recall results.
Most associations should already exist from chain_to and associations you created during Phase 2 while reading code. Phase 5 fills gaps using three layers, ordered from highest to lowest confidence.
Every association must pass the usefulness test: "If someone recalled concept A, would concept B actually help them understand or work with A?"
Shared keywords alone are NOT sufficient justification for a link. Two concepts sharing the keyword "state" doesn't mean they're related — "Channel State Management" and "Build State Caching" may serve completely different purposes in completely different subsystems.
Review your key concepts with cog_connections to verify Phase 2 captured the relationships you saw in code:
- Check a few central concepts — are they linked to the concepts you know interact with them?
- Identify relationships you KNOW exist (because you saw the code) but forgot to link
- Add missing code-context associations with `cog_associate`
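A minimal sketch of such a gap-fill, assuming the same object-style call as `cog_remember` (the IDs are placeholders for engram IDs returned by `cog_remember` or found via `cog_recall`):
🧠 Recording to Cog...
cog_associate({
  "source_id": "<engram id for 'Channel State Management'>",
  "target_id": "<engram id for 'Socket Connection Lifecycle'>",
  "predicate": "requires"
})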
Look for relationships implied by code structure that you may not have captured during per-file extraction:
- Imports/uses: If module A imports module B, concepts from A and B likely have real `requires` or `enables` relationships
- Cross-subsystem calls: If function A calls function B across subsystems, those behaviors should be linked
- Shared configuration: If a configuration value controls behavior in multiple places, link the config concept to each behavior it affects
These are structural facts visible in code — they don't require judgment calls.
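A sketch for surfacing those structural facts in an Elixir codebase (the app name `MyApp` and path are placeholders; other languages can grep for their own import/require syntax):
# Find cross-subsystem references, e.g. billing code reaching into accounts
grep -rn "alias MyApp\.Accounts\|import MyApp\.Accounts" lib/my_app/billing/ --include="*.ex"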
For concepts that MIGHT be related but weren't linked during code reading:
- Identify candidates — concepts sharing keywords or categories may be worth investigating
- Read both definitions carefully — do the definitions describe things that genuinely interact, depend on each other, or represent alternatives?
- Apply the validation test — "If I followed this link during recall, would the target actually help me understand or work with the source?"
- Articulate the specific relationship — you must be able to state WHY they're related, not just that they share a word
- Choose a specific predicate — if you can't pick a predicate more specific than `related_to`, the relationship probably doesn't exist
❌ FALSE POSITIVE (keyword coincidence): "Channel State Management" and "Build State Caching" both have keyword "state" — but one manages WebSocket session data and the other caches compilation artifacts. Linking them pollutes recall for both.
✅ TRUE POSITIVE (semantic relationship):
"Channel State Management" requires "Socket Connection Lifecycle" — because channel state is initialized during socket connection and cleaned up on disconnect. The definition of one references the mechanism described in the other.
| Predicate | Use for |
|---|---|
| `enables` | A makes B possible (use in chains) |
| `requires` | A needs B to function (use in chains) |
| `implies` | Logical consequences (use in chains) |
| `leads_to` | Workflows, cause-effect (use in chains) |
| `contradicts` | Mutually exclusive options |
| `is_component_of` | Part-whole relationships |
| `contains` | Composition |
| `example_of` | Pattern instances |
| `generalizes` | Abstractions |
| `similar_to` | Related approaches |
| `contrasts_with` | Alternative approaches, different ways to do the same thing |
| `supersedes` | Deprecated patterns |
| `derived_from` | Origins, influences |
- Every concept → its parent domain/subsystem
- Every pattern → concepts that implement it
- Every gotcha → concepts it affects
- Related concepts → each other
- Prerequisites → dependent concepts
- Alternative approaches → each other via `contrasts_with`
A super-hub is an engram so generic that it connects to nearly everything. These destroy the value of your knowledge graph.
Why super-hubs are harmful:
- Short-circuit paths: `cog_trace` routes through the hub, hiding meaningful relationships
- Pollute spreading activation: Everything activates through the hub, drowning out relevant results
- Zero discriminative power: If everything connects to "Overview", knowing something connects to it tells you nothing
During bootstrap, you may be tempted to create a "Project Overview" or "Architecture" engram and connect everything to it. DO NOT DO THIS.
| Bad Term | Why It's Bad |
|---|---|
| "Project Overview" | Everything is part of the project |
| "Architecture" | Too broad — which aspect? |
| "Codebase Structure" | Every file relates to this |
| "System Design" | Generic category, not insight |
| "Main Components" | Container without specific value |
| "How It Works" | Describes everything and nothing |
| "Tech Stack" | Just a list, not insight |
| Bad (Generic) | Good (Specific) |
|---|---|
| "Architecture" | "Why We Use Event Sourcing for Payments" |
| "Project Overview" | "Core Domain: Multi-tenant SaaS Billing" |
| "Codebase Structure" | "Phoenix Context Boundary Rules" |
| "System Design" | "Why Async Job Queue Over Inline Processing" |
| "Tech Stack" | "Why PostgreSQL Over MySQL for JSONB" |
Before creating an engram, ask: "Could I reasonably connect this to more than 5-10 other engrams?"
If yes → It's too generic. Break it into specific insights instead. If no → Good. It captures specific knowledge.
Instead of one overview engram, capture specific insights:
❌ WRONG:
Term: "Project Architecture"
Definition: "This is a Phoenix app with contexts for accounts, billing, and messaging..."
→ Now you'll connect accounts, billing, messaging, Phoenix patterns, etc. to this one hub
✅ RIGHT:
Term: "Why Billing Is a Separate Context"
Definition: "Billing was extracted from Accounts after coupling caused invoice bugs..."
Term: "Messaging Depends on Accounts Context"
Definition: "Messaging requires account lookup for permissions. Direct Repo calls were removed..."
Term: "Phoenix Context Boundary Enforcement"
Definition: "Contexts never call each other's Repo directly. Use public functions only..."
Each specific engram connects to 2-5 related concepts, forming a rich graph without a single chokepoint.
Think: "Would this save someone time?"
- "Why Billing Uses a Separate Context" (decision rationale)
- "Invoice State Transition Rules" (domain knowledge)
- "Stripe Webhook Idempotency Requirement" (integration gotcha)
- "Ecto Preload N+1 in Account Queries" (tactical knowledge)
- "Auth Token Refresh Race Condition" (gotcha with context)
- "User Module" (just a file name)
- "Error Handling" (too generic — which subsystem? what errors?)
- "lib/myapp/accounts" (directory listing)
- "State Management" (will collide with every subsystem)
- "Configuration" (meaningless without context)
- term: 2-5 words, qualified by subsystem/domain context
- definition: What + how + WHY (1-3 sentences). Name specific functions, events, options.
- keywords: 3-7 domain terms for semantic recall matching
- long_term: `true` (during bootstrap)
- chain_to or associations: At least one connection (NO ORPHANS)
You are NOT done until verification passes.
For each subsystem in your checklist:
cog_recall("<subsystem name>")
# Verify engrams exist for each subsystem
Every subsystem MUST have engrams. Subsystems with 0 engrams are unacceptable.
Review your subsystem checklist. For each subsystem:
- How many source files does it have?
- How many did you actually read?
- Any large files (100+ lines) you skipped?
Large unread files are the #1 source of knowledge gaps. Go back and read them.
List any:
- Subsystems with 0 engrams → MUST FIX
- Knowledge categories with no coverage → MUST FIX
- Large files never read → MUST READ or justify
- Warning patterns found but not mined → Justify or fix
- Engrams without keywords → MUST ADD keywords
cog_connections("<main concept>")
# Verify rich interconnections exist
# Isolated engrams indicate missing relationships
Check for:
- Orphaned engrams (no connections) → Add connections
- Super-hubs (>10 connections) → Break into specific concepts
- Weak connections (all `related_to`) → Use specific predicates
- Missing `contrasts_with` links between alternative approaches
| Anti-Pattern | Why It's Bad | Instead Do |
|---|---|---|
| Recording before exploring | Creates shallow coverage | Complete Phase 1 first |
| Exploring without recording | Loses insights discovered | Record per-subsystem |
| Breadth without depth | High-level only, misses gotchas | Read ALL files per subsystem |
| Depth without breadth | Deep in one area, blind spots elsewhere | Complete subsystem checklist |
| Skipping files in a subsystem | Coverage gaps → missing knowledge | Iterate through every source file |
| Skipping tests | Missing edge case documentation | Read test files |
| Skipping conditional code | Platform bugs will bite later | Explore each variant |
| Stopping at "good enough" | Leaves gaps | Cover every subsystem and file |
| Creating overview engrams | Super-hub pollution | Specific insights only |
| Generic terms without context | Collide across subsystems | Always qualify with subsystem name |
| Definitions without API details | Vague, ungrepable knowledge | Name specific functions, events, options |
| Engrams without keywords | Invisible to `cog_recall` | Always include 3-7 keywords |
| Module-level extraction | Too coarse, misses behaviors | Extract at behavior level |
| Keyword-only associations | False positives pollute graph, degrade recall | Validate semantically: read definitions, articulate WHY |
| Deferring all linking to Phase 5 | Loses code context, lower quality | Link during Phase 2 while reading code |
You are NOT done until ALL are true:
Phase 0: Sizing
- Codebase size determined (files, LOC)
- Domain identified (problem, users, core concepts)
- All subsystems listed with file counts
Phase 1: Discovery
- All subsystems enumerated in checklist
- Entry points identified
- Warning pattern count known
Phase 2: Subsystems
- ALL subsystems in checklist explored
- ALL source files in each subsystem read (not just a sample)
- Concepts extracted at behavior level (not module level)
- Intra-file relationships captured via chain_to/associations during extraction
- Every engram has keywords (3-7 terms)
- Every term is qualified by subsystem context (no generic names)
- Strategic + tactical knowledge per subsystem
Phase 3: Cross-Cutting
- Error handling patterns documented
- Configuration approach documented
- At least one request traced end-to-end
- Conditional/platform code explored (if exists)
- Cross-subsystem bridges identified and recorded
Phase 4: Gotchas
- Warning patterns mined (TODO/FIXME/etc.)
- Test files checked for skipped tests
- Gotchas tagged with category `gotcha`
Phase 5: Graph
- Code-context associations verified (from Phase 2 chain_to/associations)
- Structural dependencies traced (imports, cross-subsystem calls)
- Candidate associations semantically validated (definitions read, WHY articulated)
- Every concept connected (NO orphans)
- `contrasts_with` links between alternative approaches
- NO super-hub engrams (no single engram connects to >10 others)
- Connections use specific predicates (not all `related_to`)
- No keyword-only associations (every link has a stated reason beyond shared words)
Phase 6: Verification
- Every subsystem has engrams
- No large files skipped without justification
- Gap analysis shows no unjustified gaps
- Connection density verified
## Exploration Complete
### Codebase Stats
- Files: ___
- Lines of code: ~___
- Subsystems: ___
### Coverage
| Subsystem | Files | Files Read | Engrams |
|-----------|-------|------------|---------|
| [name] | ___ | ___ | ___ |
| ... | | | |
### Domain Summary
[2-3 sentences: what this system does]
### Key Decisions Captured
[Top 5 "why" insights that would be hard to recover]
### Critical Gotchas
[List the non-obvious behaviors that would bite new developers]
### Domain Knowledge
[Business rules and context that wasn't documented elsewhere]
### Knowledge Graph Stats
- Total engrams: ___
- Engrams with keywords: ___ (should be 100%)
- Total connections: ___
- Average connections per concept: ___
- Max connections on any single engram: ___ (should be <10)
- Orphaned engrams: ___ (should be 0)
### Knowledge Gaps
[Things you couldn't determine that may need human clarification]
Remember: You're building a knowledge asset that saves time—for new team members ramping up AND experienced developers debugging at 2am.