created: 2026-01-22T20:44:36.255Z

type: principles
tags: [principles, design, philosophy, draft]

Outfitter Principles

These aren’t rules we invented. They’re lessons we keep relearning—usually the hard way.

North Star

Build tools that make agents better at building software.

The Kit is how; Agent-First is why. Everything we ship should compose cleanly, fail explicitly, and leave a clear trail for whoever comes next.

The Laws

Non-negotiables. No exceptions, no “just this once.”

1. Plan First

You don’t set out without studying the map.

Before writing code—before writing tests—understand where you’re going and how you’ll know when you’ve arrived. A plan surfaces the rough edges before you trip on them. It forces you to distinguish between unknowns (things we can figure out) and unknowables (things we’ll only discover by doing). Both are fine. But know which is which before you start.

The plan isn’t a contract. It’s a hypothesis. The territory will surprise you, and the plan will change. But no plan means no hypothesis—and no hypothesis means you’re just wandering.

Minimum viable plan:

Destination: What does “done” look like?
Waypoints: What intermediate states will we pass through?
Hazards: Where might we get lost?

Plans can be lightweight. A few sentences in a commit message. A sketch on a whiteboard. The discipline is the thinking, not the document.

2. Tests First, Always

If you can’t write the test first, you don’t understand the requirement yet.

Red → Green → Refactor. The failing test is the spec. No implementation lands without a test that proves the behavior exists.

One caveat: tracer bullets and refactors get a pass on coverage while they’re proving things out. Sometimes wandering gets you to your destination with more learning along the way than the direct path would have. There are always valid reasons to go off-trail, but you should be able to justify why. “I’m exploring the shape of this problem” is valid. “I forgot” isn’t.

The moment the pattern validates, harvest the tests before anything else. Exploration earns its keep by becoming specification.

3. Results, Not Exceptions

Handlers return Result<T, E>. They don’t throw.

Business logic uses the Result pattern. Exceptions are reserved for truly exceptional circumstances—programmer error, impossible states. This isn’t pedantry: it’s what makes code composable. It’s what makes code agent-friendly. Agents can’t reason about hidden control flow.
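A minimal sketch of what a Result-returning handler could look like. The `Result` shape, `parsePort`, and `PortError` are hypothetical names chosen for illustration; the source specifies only that handlers return `Result<T, E>`, not how the type is defined.

```typescript
// Assumed shape: a discriminated union keyed on `ok`. Not the official Outfitter type.
type Result<T, E> =
  | { ok: true; value: T }
  | { ok: false; error: E };

// Hypothetical typed error for this example.
type PortError =
  | { kind: "not-an-integer"; input: string }
  | { kind: "out-of-range"; port: number };

// The handler never throws for expected failures; every outcome is in the return type.
function parsePort(input: string): Result<number, PortError> {
  const port = Number(input);
  if (!Number.isInteger(port)) {
    return { ok: false, error: { kind: "not-an-integer", input } };
  }
  if (port < 1 || port > 65535) {
    return { ok: false, error: { kind: "out-of-range", port } };
  }
  return { ok: true, value: port };
}

// The caller has to check `ok` before it can touch `value`.
const result = parsePort("8080");
if (result.ok) {
  console.log(`listening on ${result.value}`);
} else {
  console.log(`bad port: ${result.error.kind}`);
}
```

Because the failure cases sit in the signature, a reader (or an agent) can see everything that can go wrong without opening the function body.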

4. Small, Focused Modules

If you can’t explain what a package does in one sentence, it’s doing too much.

One responsibility. Compose small things into larger things. Boundaries are features, not overhead. When in doubt, split it out.

Small modules compound: they compose into larger systems, make duplication obvious, contain blast radius when things break, enable isolated code review, stack cleanly in branches, and give agents bounded problems they can reason about. You can’t get any of this from monoliths.

5. Dependencies Flow Down

Foundation → Runtime → Application: each tier imports only from the tiers to its left. Never sideways within a tier. Never upward.

Cold packages don’t import warm packages. Dependency direction is architecture. Architecture is destiny.
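One way this rule could be checked mechanically, as a sketch. The tier names come from the text above; the package names, the `Pkg` shape, and `findViolations` are hypothetical, and a real check would read the import graph from the repo rather than from a hand-written list.

```typescript
type Tier = "foundation" | "runtime" | "application";

// Lower rank = lower tier. Imports may only point at a strictly lower rank.
const TIER_RANK: Record<Tier, number> = { foundation: 0, runtime: 1, application: 2 };

// Hypothetical package description for this example.
interface Pkg {
  name: string;
  tier: Tier;
  imports: string[];
}

function findViolations(pkgs: Pkg[]): string[] {
  const byName = new Map(pkgs.map((p) => [p.name, p] as const));
  const violations: string[] = [];
  for (const pkg of pkgs) {
    for (const depName of pkg.imports) {
      const dep = byName.get(depName);
      if (!dep) continue; // external dependency: out of scope for this sketch
      // Same tier (sideways) and higher tier (upward) are both rejected.
      if (TIER_RANK[dep.tier] >= TIER_RANK[pkg.tier]) {
        violations.push(`${pkg.name} (${pkg.tier}) -> ${dep.name} (${dep.tier})`);
      }
    }
  }
  return violations;
}

const violations = findViolations([
  { name: "types", tier: "foundation", imports: [] },
  { name: "config", tier: "runtime", imports: ["types"] },
  { name: "logger", tier: "runtime", imports: ["config"] }, // sideways import
  { name: "cli", tier: "application", imports: ["config", "types"] },
]);
console.log(violations);
```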

6. One Source of Truth

If two things must stay in sync, one derives from the other. No exceptions.

Don’t maintain parallel definitions. Generate what can be generated. Derive what can be derived. The moment you have two files that “should match,” you’ve created a bug waiting to happen.
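A small illustration of deriving instead of duplicating, assuming a hypothetical list of log levels. The runtime array is the single definition; the type and the validator are both derived from it, so there is no second list to keep in sync.

```typescript
// The one source of truth (hypothetical values for illustration).
const LOG_LEVELS = ["debug", "info", "warn", "error"] as const;

// Derived type: "debug" | "info" | "warn" | "error".
type LogLevel = (typeof LOG_LEVELS)[number];

// Derived runtime check: adding a level to LOG_LEVELS updates this automatically.
function isLogLevel(value: string): value is LogLevel {
  return (LOG_LEVELS as readonly string[]).includes(value);
}

console.log(isLogLevel("warn"));
console.log(isLogLevel("verbose"));
```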

7. Redaction is a Contract

No secrets in logs, state, or history. Period.

If it might be sensitive, redact it. Assume every log line will be seen by an LLM. This isn’t paranoia—it’s the operating assumption of agent-first development.
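A sketch of key-based redaction applied before anything reaches a log line. The key pattern, the `[REDACTED]` marker, and the `redact` and `logLine` names are assumptions made for this example. It only inspects key names, so secrets embedded in free-form strings would still need separate handling, and it assumes the input has no cycles.

```typescript
// Assumed list of sensitive key fragments; a real list would be project-specific.
const SENSITIVE_KEY = /(password|secret|token|api[-_]?key|authorization)/i;
const REDACTED = "[REDACTED]";

// Recursively copy a value, replacing anything stored under a sensitive key.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = SENSITIVE_KEY.test(key) ? REDACTED : redact(inner);
    }
    return out;
  }
  return value;
}

// Redaction happens inside the logging path, so callers cannot forget it.
function logLine(event: string, fields: Record<string, unknown>): string {
  return JSON.stringify({ event, ...(redact(fields) as Record<string, unknown>) });
}

console.log(logLine("login", { user: "ada", apiKey: "sk-123", nested: { token: "t" } }));
```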

8. Style Through Tokens

If it can look different, it will look different—unless you prevent it.

UI, TUI, CLI, MCP share a design language. Colors, icons, emphasis, spacing come from tokens. Adapters render tokens; they don’t invent styles. Consistency is codified, not documented.
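To make “adapters render tokens” concrete, here is a sketch with two renderers over one token. The `Token` shape, the tone names, and the icon choices are hypothetical; the escape sequences are standard ANSI SGR codes (31 red, 32 green, 33 yellow, 1 bold, 2 dim).

```typescript
type Tone = "success" | "warning" | "danger";
type Emphasis = "normal" | "strong" | "muted";

// Hypothetical token shape: meaning only, no presentation details.
interface Token {
  tone: Tone;
  emphasis: Emphasis;
  text: string;
}

// Shared lookup tables: the single place where a tone gets an icon or a color.
const ICONS: Record<Tone, string> = { success: "+", warning: "!", danger: "x" };
const ANSI_COLOR: Record<Tone, number> = { success: 32, warning: 33, danger: 31 };

// Plain-text adapter (logs, piped output).
function renderPlain(t: Token): string {
  return `${ICONS[t.tone]} ${t.text}`;
}

// Terminal adapter: same token, ANSI styling. It reads the tables; it invents nothing.
function renderAnsi(t: Token): string {
  const weight = t.emphasis === "strong" ? "1;" : t.emphasis === "muted" ? "2;" : "";
  return `\x1b[${weight}${ANSI_COLOR[t.tone]}m${ICONS[t.tone]} ${t.text}\x1b[0m`;
}

const failure: Token = { tone: "danger", emphasis: "strong", text: "build failed" };
console.log(renderPlain(failure));
console.log(renderAnsi(failure));
```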

9. Done Means Verified

Work isn’t done when the code is written. It’s done when it’s verified working.

Verification means: tests pass, types check, lints clean, behavior matches spec. If we went off-spec—and sometimes that’s the right call—we document why. The deviation becomes part of the record.

The feedback loop is the constraint, not generation speed. If validating a change takes two hours, it doesn’t matter that writing it took thirty seconds. Design for verifiability from the start.

10. Trailblazers Leave Paths

What one agent learns shouldn’t die with its session.

Handoffs are how we compound. Every session that does meaningful work should leave behind: what was done, what was learned, what’s left to do. The next agent—or human—picks up the trail instead of starting from scratch. The plan is the starting point. The trail log is the actual path taken.

Delegation requires definition. Before handing work to a subagent, you need two things: a clear start state (where should it begin reading?) and a clear done state (what artifact proves completion?). If you can’t answer both, you’re still brainstorming—not delegating.
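The two-question test can be written down as a check. This is a sketch: `DelegationBrief`, `readyToDelegate`, and the file paths in the example are hypothetical, not an existing Outfitter format.

```typescript
// Hypothetical brief handed to a subagent.
interface DelegationBrief {
  task: string;
  startState?: string; // where the subagent should begin reading
  doneState?: string; // the artifact that proves completion
}

// A brief is delegable only when both states are filled in.
function readyToDelegate(brief: DelegationBrief): { ready: boolean; missing: string[] } {
  const missing: string[] = [];
  if (!brief.startState?.trim()) missing.push("startState");
  if (!brief.doneState?.trim()) missing.push("doneState");
  return { ready: missing.length === 0, missing };
}

console.log(
  readyToDelegate({
    task: "Add retry to the fetch handler",
    startState: "docs/plan.md", // illustrative path
    doneState: "handlers/fetch.test.ts passes", // illustrative artifact
  }),
);
console.log(readyToDelegate({ task: "Make it better" }));
```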

Subagents create artifacts. Their traces survive even when the session is ephemeral. This is what makes delegation composable: the work product outlives the worker.

But artifacts accumulate. Context rot is real. Periodic cleanup is part of the workflow—archive what’s settled, prune what’s stale, keep the active context lean. The goal is trails you can follow, not forests you get lost in.

The Priorities

When tradeoffs arise, lean this way.

Well-Trodden Paths First

Before cutting a new trail, check if there’s already a path.

We have standards, best practices, blessed dependencies—paths that have earned their place. Every time we walk them, we build expertise. Muscle memory compounds. The tenth time you use a pattern, you’re faster than the first.

Novelty has costs: learning curves, undiscovered edge cases, integration surprises. The blessed list isn’t bureaucracy—it’s accumulated wisdom.

The blessed list captures what we’ve vetted: dependencies we trust, patterns that have proven out, conventions that reduce decisions, tools we’ve built expertise with.

When something’s not on the list, that’s not a “no”—it’s a “make the case.” Push back hard, but be persuadable. When the new path is better, it becomes blessed, and the next person benefits.

Bun Before npm

Before adding a dependency, check if Bun provides it natively.

Does Bun do this? → Use it.
Hit limitations? → Wrap the Bun API.
Still need external? → Check the blessed list.
Not blessed? → Push back hard.
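The four questions above read as a decision procedure. A sketch, with hypothetical field and decision names:

```typescript
// Hypothetical answers gathered before adding a dependency.
interface DependencyQuestion {
  bunProvidesIt: boolean; // does Bun offer this natively?
  hitsLimitations: boolean; // does the native API fall short for our use?
  wrapperCoversGap: boolean; // would a thin wrapper over the Bun API close the gap?
  onBlessedList: boolean; // is the external candidate already vetted?
}

type Decision = "use-bun" | "wrap-bun-api" | "use-blessed-dependency" | "push-back";

function decideDependency(q: DependencyQuestion): Decision {
  if (q.bunProvidesIt && !q.hitsLimitations) return "use-bun";
  if (q.bunProvidesIt && q.wrapperCoversGap) return "wrap-bun-api";
  if (q.onBlessedList) return "use-blessed-dependency";
  return "push-back";
}

console.log(
  decideDependency({
    bunProvidesIt: true,
    hitsLimitations: true,
    wrapperCoversGap: true,
    onBlessedList: false,
  }),
);
```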

Stable Contracts Over Clever Code

The interface is the commitment. Internals are negotiable.

Behavior encoded as tests matters more than elegant implementation. Stable APIs (cold) change rarely; implementation (warm) can evolve. Don’t confuse the two.

Automation Over Discipline

If a pattern is important enough to document, it’s important enough to automate.

Pre-commit hooks > code review comments. Lint rules > style guides. Guardrails block bad code before it lands instead of relying on review to catch it. Fail early, fail loud.

Explicit Over Implicit

Typed errors over string messages. Static composition over runtime discovery. Factory functions over DI frameworks. Named exports over re-exports.

When something can be implicit, it eventually will be implicit—and then nobody knows what’s actually happening.

Shared Handlers Over Duplicated Logic

CLI and MCP are thin adapters over transport-agnostic handlers.

Write the logic once, test it once, adapt it many times. Adapters format. Handlers compute.

The Rejections

Explicitly off the table. Each one has burned us before.

No Re-exports

Import paths should tell the truth about dependencies.

Why: Re-exports hide where code actually lives. When you import from a barrel file, you don’t know what you’re pulling in. Meta-packages become dumping grounds for “I don’t know where this belongs.” Tree-shaking can suffer because a bundler may have to keep every module a barrel touches unless it can prove them side-effect free. And when something breaks, the import path no longer points at the code that failed.

If a consumer needs something from a package, they import it directly. The import path is documentation.

No DI Frameworks

Packages export factory functions or classes. Consumers wire dependencies explicitly.

Why: DI frameworks trade local complexity for global mystery. You look at a class and can’t tell where its dependencies come from. Tests become dependent on framework magic. Debugging requires understanding the container’s resolution order. And the framework itself becomes a dependency that infects everything.

Explicit wiring is more typing. It’s also more understanding. When you can see the construction, you can reason about the system.
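A sketch of explicit wiring with factory functions. `Logger`, `Clock`, `createMemoryLogger`, and `createGreeter` are hypothetical names; the point is that every dependency is visible at the call site and a test can pass in fakes without any container.

```typescript
// Hypothetical dependency interfaces.
interface Logger {
  info(message: string): void;
}
interface Clock {
  now(): Date;
}

// Factory for a logger that records lines in memory (handy in tests).
function createMemoryLogger(lines: string[]): Logger {
  return {
    info: (message) => {
      lines.push(message);
    },
  };
}

// Factory that takes its dependencies as a plain argument. No decorators, no registry.
function createGreeter(deps: { logger: Logger; clock: Clock }) {
  return {
    greet(name: string): string {
      const stamp = deps.clock.now().toISOString();
      deps.logger.info(`${stamp} greeted ${name}`);
      return `Hello, ${name}`;
    },
  };
}

// Wiring: the construction is the documentation.
const lines: string[] = [];
const greeter = createGreeter({
  logger: createMemoryLogger(lines),
  clock: { now: () => new Date("2026-01-22T00:00:00.000Z") }, // fixed clock for determinism
});
console.log(greeter.greet("Ada"));
```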

No Exceptions in Business Logic

Use Result<T, E>. Reserve exceptions for programmer error and impossible states.

Why: Exceptions break composition. A function that throws can’t be chained with .map() or .flatMap(). Control flow becomes invisible—you have to read the implementation to know what might throw. Error handling gets scattered across try-catch blocks instead of living in the type signature.

Typed errors make failure explicit. The compiler tells you when you haven’t handled a case. The code tells you what can go wrong without reading the implementation.
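To illustrate the chaining and exhaustiveness points, here is a sketch that uses free functions `map` and `flatMap` over an assumed `Result` shape (the text mentions `.map()` and `.flatMap()` methods; free functions are a simplification for this example). `ConfigError`, `readKey`, `toNumber`, and the `TIMEOUT_SECONDS` key are hypothetical.

```typescript
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Transform the success value; pass errors through untouched.
function map<T, U, E>(r: Result<T, E>, fn: (v: T) => U): Result<U, E> {
  return r.ok ? { ok: true, value: fn(r.value) } : r;
}

// Chain a step that can itself fail; error types accumulate in the signature.
function flatMap<T, U, E, F>(r: Result<T, E>, fn: (v: T) => Result<U, F>): Result<U, E | F> {
  return r.ok ? fn(r.value) : r;
}

type ConfigError =
  | { kind: "missing-key"; key: string }
  | { kind: "invalid-number"; raw: string };

function readKey(env: Record<string, string | undefined>, key: string): Result<string, ConfigError> {
  const raw = env[key];
  return raw === undefined
    ? { ok: false, error: { kind: "missing-key", key } }
    : { ok: true, value: raw };
}

function toNumber(raw: string): Result<number, ConfigError> {
  const n = Number(raw);
  return Number.isFinite(n)
    ? { ok: true, value: n }
    : { ok: false, error: { kind: "invalid-number", raw } };
}

// The steps compose without a single try/catch.
function timeoutMs(env: Record<string, string | undefined>): Result<number, ConfigError> {
  return map(flatMap(readKey(env, "TIMEOUT_SECONDS"), toNumber), (s) => s * 1000);
}

// Exhaustive switch: adding a new ConfigError variant makes this fail to compile.
function describeError(error: ConfigError): string {
  switch (error.kind) {
    case "missing-key":
      return `missing ${error.key}`;
    case "invalid-number":
      return `not a number: ${error.raw}`;
  }
}

const outcome = timeoutMs({ TIMEOUT_SECONDS: "soon" });
console.log(outcome.ok ? outcome.value : describeError(outcome.error));
```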

No Logic in Adapters

CLI commands and MCP tools are thin wrappers. The if statements live in handlers.

Why: When logic lives in adapters, you test the same behavior multiple times—once for CLI, once for MCP, once for the HTTP endpoint. When you find a bug, you fix it in multiple places. When you add a feature, you implement it multiple times.

Adapters translate transport to handler calls. They parse args, format output, handle transport-specific errors. That’s it. The interesting code lives in handlers, gets tested once, and works everywhere.
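A sketch of one handler behind two adapters. All names are hypothetical, and the tool adapter’s response shape is a simplified assumption for illustration, not the MCP SDK’s actual API. The only branching in the adapters is argument parsing and output formatting; the domain decision (what counts as a valid name) lives in the handler.

```typescript
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };
type GreetError = { kind: "empty-name" };

// Handler: transport-agnostic, owns the domain rule, tested once.
function greet(input: { name: string }): Result<{ message: string }, GreetError> {
  const name = input.name.trim();
  if (name === "") return { ok: false, error: { kind: "empty-name" } };
  return { ok: true, value: { message: `Hello, ${name}!` } };
}

// CLI adapter: argv in, text plus exit code out.
function cliAdapter(argv: string[]): { exitCode: number; output: string } {
  const result = greet({ name: argv[0] ?? "" });
  return result.ok
    ? { exitCode: 0, output: result.value.message }
    : { exitCode: 1, output: `error: ${result.error.kind}` };
}

// Tool-style adapter: loosely typed arguments in, structured response out.
function toolAdapter(args: Record<string, unknown>): { isError: boolean; text: string } {
  const result = greet({ name: typeof args.name === "string" ? args.name : "" });
  return result.ok
    ? { isError: false, text: result.value.message }
    : { isError: true, text: result.error.kind };
}

console.log(cliAdapter(["Ada"]));
console.log(toolAdapter({ name: "  " }));
```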

No Node Compatibility by Default

Bun-only unless compatibility materially matters.

Why: Compatibility has costs. Polyfills add weight. Conditional imports add complexity. Testing both runtimes doubles the matrix. And “just in case someone needs Node” isn’t a real requirement—it’s a guess.

Start with Bun. If a real user needs Node compatibility, that’s a conversation. Compatibility is a feature you ship when someone asks for it, not a default you assume.

Decision Frameworks

Mental models we use over and over.

The Temperature Model

| Temp | Meaning | Examples |
| --- | --- | --- |
| Cold | Stable, rarely changes, high compat bar | contracts, types, schemas |
| Warm | Active development, may evolve | runtime packages, handlers |
| Hot | Rapid iteration, expect breakage | experiments, spikes, prototypes |

Surfacing temperature:

Temperature should be legible—to humans and agents alike:

Directory structure: /contracts/ is cold. /experiments/ is hot. The path tells you what you’re touching.
Change velocity: Untouched for six months with many dependents? Cold. Changed daily? Hot.
Dependency direction: Cold packages have many dependents; hot packages have few or none.

When you touch something cold, you move carefully—tests, migration paths, deprecation notices. When you touch something hot, you move fast and expect to throw it away.

Agents need explicit signals. “Don’t modify files in /contracts/ without approval” is enforceable. “Be careful with stable code” is not.
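A sketch of how the directory signal could become an enforceable check. Only `/contracts/` and `/experiments/` come from the text; treating everything else as warm, and the `temperatureOf` and `requiresApproval` names, are assumptions for this example.

```typescript
type Temperature = "cold" | "warm" | "hot";

// Path segment -> temperature. A Map avoids accidental hits on inherited object keys.
const TEMPERATURE_BY_SEGMENT = new Map<string, Temperature>([
  ["contracts", "cold"],
  ["experiments", "hot"],
]);

function temperatureOf(path: string): Temperature {
  for (const segment of path.split("/")) {
    const temperature = TEMPERATURE_BY_SEGMENT.get(segment);
    if (temperature) return temperature;
  }
  return "warm"; // assumption: unlabelled paths default to warm
}

// The enforceable version of "be careful with stable code".
function requiresApproval(path: string): boolean {
  return temperatureOf(path) === "cold";
}

console.log(temperatureOf("packages/contracts/src/user.ts"));
console.log(requiresApproval("experiments/spike.ts"));
```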

The DRY Trigger

Pattern appears once → leave it.
Pattern appears twice → note it.
Pattern appears three times → extract it.

The wrong abstraction is worse than duplication. When uncertain, tolerate duplication until the pattern clarifies.

Script or CLI?

If you’re asking the question, it’s probably a CLI.

Scripts are for things you’d delete without guilt.

Vocabulary

Terms with specific meanings in Outfitter.

| Term | Meaning |
| --- | --- |
| Handler | Transport-agnostic domain logic returning `Result<T, E>` |
| Adapter | Thin layer converting transport to handler calls |
| Temperature | Change frequency classification (cold/warm/hot) |
| Blessed | Vetted dependency, pattern, or tool on the well-trodden path |
| Harvest | Extract proven behavior from existing code into tests |
| Tracer bullet | First end-to-end implementation validating a pattern |
| Shape | Semantic data structure decoupled from rendering |
| Token | Design system primitive (color, icon, emphasis, spacing) |
| Trail log | Record of what was done, learned, and left to do (the handoff artifact) |
| Context rot | Accumulation of stale artifacts that obscure rather than illuminate |
| Start state | Where a delegated task should begin reading |
| Done state | The artifact that proves a delegated task is complete |

Open Questions

Things we’re still figuring out:

Where do plans live? Spec-driven development says plans should be version-controlled artifacts in the repo. But what’s the right structure? A /specs/ directory? Inline with the code? Still exploring.
Documentation level: Types > comments > docs—but where’s the line? When does self-documenting code need prose?
Cleanup cadence: Trail logs and artifacts accumulate. What triggers cleanup? Time-based? Milestone-based? Manual review?
Blessed list mechanics: We know why we have one. Still need to nail down where it lives and how additions get vetted.

Next Steps

From here, derive:

Rules (.claude/rules/) — Enforceable guidelines for agents
Skills — Patterns for applying these principles
Pre-commit hooks — Automated enforcement
Lint rules — ast-grep patterns for the non-negotiables

