@zmanian · Last active February 5, 2026 18:38
Comparison of Endo llm branch vs theoretical AI agent containment architecture

# Endo llm Branch Analysis: Comparison with Theoretical AI Agent Architecture

Related: *From Credential Proxy to Object-Capability Framework*, on how Endo generalizes the Deno sandbox pattern for AI agent systems

## Executive Summary

The llm branch introduces `@endo/chat`, a web-based permission-management UI for the Endo daemon. Contrary to what the branch name might suggest, this is not an LLM containment system. Instead, it provides a human-facing interface for managing capabilities through structured commands, with JavaScript evaluation as one feature among many.

This contrasts significantly with the theoretical architecture document, which described a comprehensive LLM-specific containment model with compartment-per-tool isolation, attenuation chains, and sandboxed code execution.


## What the llm Branch Actually Implements

### Core Components

| Component | Purpose | Implementation |
| --- | --- | --- |
| Chat UI | Web-based capability management | `packages/chat/src/chat.js` (3700+ lines) |
| Gateway Server | WebSocket bridge to daemon | `scripts/gateway-server.js` |
| CapTP Connection | Capability transport | `src/connection.js` |
| Command Executor | Structured command execution | `src/command-executor.js` |
| Eval Form | JavaScript evaluation with endowments | `src/eval-form.js` |

### Architecture Flow

```
Browser (Vite App)
    │
    ├─► WebSocket Connection
    │       │
    │       ▼
    │   Gateway Server (localhost only)
    │       │
    │       ├─► CapTP Protocol
    │       │
    │       ▼
    │   Endo Daemon (Unix socket)
    │       │
    │       ▼
    │   Host Powers (AGENT capability)
```

### Available Commands

The chat interface provides structured commands for:

  1. Messaging: `request`, `dismiss`, `adopt`, `resolve`, `reject`
  2. Execution: `eval`/`js` - JavaScript evaluation with pet-name endowments
  3. Storage: `list`, `show`, `remove`, `move`, `copy`, `mkdir`
  4. Connections: `invite`, `accept`
  5. Workers: `spawn`
  6. Agents: `mkhost`, `mkguest`
  7. Bundles: `mkbundle`, `mkplugin`
  8. System: `cancel`, `help`

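The registry/executor split described above can be sketched as follows. This is a hedged simplification, not the actual `@endo/chat` API: the function names (`register`, `execute`) and the in-memory `store` are illustrative, but the shape mirrors what `command-registry.js` (definitions) and `command-executor.js` (dispatch) do.

```javascript
// Hypothetical simplification of the command-registry / command-executor
// split: a registry maps command names to handlers, and the executor
// parses a line of chat input and dispatches it.
const registry = new Map();

const register = (name, handler) => registry.set(name, handler);

const execute = (line) => {
  const [name, ...args] = line.trim().split(/\s+/);
  const handler = registry.get(name);
  if (handler === undefined) {
    throw new Error(`Unknown command: ${name}`);
  }
  return handler(...args);
};

// Illustrative registrations mirroring the storage commands above.
const store = new Map([['notes', 'hello']]);
register('list', () => [...store.keys()]);
register('show', (petName) => store.get(petName));
register('remove', (petName) => store.delete(petName));
```

The benefit of this shape is that adding a command is a one-line registration, and the executor stays ignorant of individual command semantics.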
## Comparison with Theoretical Architecture

### 1. Compartment-per-Tool Isolation

| Theoretical Design | llm Branch Reality |
| --- | --- |
| Each LLM tool gets its own Compartment | No tool compartmentalization |
| Dynamic tool instantiation creates isolated contexts | Single user session with host powers |
| Tools cannot access each other's state | N/A - not tool-oriented |

**Gap Assessment:** The theoretical model assumed LLM-generated code would be executed in per-tool compartments. The llm branch provides eval functionality, but there is no compartment-per-tool isolation: all evaluation happens within a single worker context with explicitly provided endowments.

### 2. Attenuation Chains

| Theoretical Design | llm Branch Reality |
| --- | --- |
| Progressive capability narrowing through delegation | Flat capability model via host powers |
| Human User → Guest Agent → Tool Compartment chain | Human → Web UI → Daemon (direct host access) |
| Each link receives attenuated subset | No automatic attenuation |

**Gap Assessment:** The theoretical architecture described nested attenuation where each layer (human, agent, tool) receives progressively narrower capabilities. The llm branch connects directly to host powers without automatic attenuation: users get whatever the AGENT capability provides.
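The attenuation pattern itself is straightforward to express in plain JavaScript: each layer wraps the capability it received and re-exposes a narrower interface. The sketch below is an assumed illustration (the names `makeFileReader` and `attenuateToPrefix` are hypothetical, not Endo APIs), showing how a tool-facing reader can be narrowed to a path prefix.

```javascript
// A full-powered capability: reads any known file.
const makeFileReader = (files) => ({
  read: (path) => {
    if (!(path in files)) throw new Error(`No such file: ${path}`);
    return files[path];
  },
});

// Attenuation: the wrapper holds the original capability in its closure
// and only forwards requests that satisfy the narrower policy.
const attenuateToPrefix = (reader, prefix) => ({
  read: (path) => {
    if (!path.startsWith(prefix)) {
      throw new Error(`Access denied outside ${prefix}`);
    }
    return reader.read(path);
  },
});

const hostReader = makeFileReader({
  '/safe/notes.txt': 'ok',
  '/etc/passwd': 'secret',
});
const toolReader = attenuateToPrefix(hostReader, '/safe/');
```

A chain is just repeated wrapping: human hands the host reader to the agent layer, which hands an attenuated reader to each tool.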

### 3. Membrane Security

| Theoretical Design | llm Branch Reality |
| --- | --- |
| Membranes wrap all cross-boundary communication | CapTP provides transport, not membranes |
| Custom membranes for logging/validation | No custom membrane infrastructure |
| Selective capability revocation | Manual `cancel` command available |

**Gap Assessment:** While the theoretical architecture emphasized membranes as active security boundaries that intercept and validate all communication, the llm branch relies on CapTP for message transport without additional membrane layers.
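The revocation half of the membrane story can be demonstrated with standard JavaScript alone. `Proxy.revocable` gives exactly the "cut the connection later" primitive a membrane needs; the sketch below is an assumed illustration of that primitive, not the daemon's actual `cancel` implementation.

```javascript
// Wrap a capability so that it can be severed later. After revoke(),
// every property access on the proxy throws a TypeError.
const makeRevocable = (capability) => {
  const { proxy, revoke } = Proxy.revocable(capability, {});
  return { capability: proxy, revoke };
};

const counter = {
  n: 0,
  increment() { this.n += 1; return this.n; },
};

const { capability, revoke } = makeRevocable(counter);
capability.increment(); // works while the proxy is live
revoke();               // all further access now throws
```

A full membrane generalizes this: every object passed across the boundary, in either direction, gets wrapped in a proxy from the same revocation group, so one revoke severs the entire graph.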

### 4. LLM Sandboxing Model

| Theoretical Design | llm Branch Reality |
| --- | --- |
| LLM runs in sandboxed Compartment | No LLM component |
| Only receives capabilities needed for current task | N/A |
| Cannot execute arbitrary code directly | Human executes eval, not LLM |

**Gap Assessment:** The fundamental assumption of the theoretical architecture - that an LLM would be the entity executing code - is not present. The llm branch is a human-operated UI, not an LLM execution environment.

### 5. Guest/Host Power Differentiation

| Theoretical Design | llm Branch Reality |
| --- | --- |
| Clear Guest vs Host distinction | Supports creating guests via `mkguest` |
| Guests lack: `evaluate`, `makeUnconfined`, `makeBundle`, `provideWorker` | Chat UI connects as host |
| Automatic power reduction for untrusted agents | No automatic guest creation for LLM |

**Partial Alignment:** Both architectures recognize the Guest/Host distinction. However, the llm branch doesn't automatically run anything as a guest - it's a tool for human hosts to manage guests.
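The host/guest power split named in the table can be modeled as a strict subset relation. This sketch is illustrative (not Endo source): the `HOST_ONLY` list comes from the table above, while `makeGuestPowers` and the stub power objects are hypothetical.

```javascript
// Powers a guest must never receive, per the table above.
const HOST_ONLY = ['evaluate', 'makeUnconfined', 'makeBundle', 'provideWorker'];

// A guest's powers are the host's powers minus the dangerous ones.
const makeGuestPowers = (hostPowers) => {
  const guest = {};
  for (const [name, power] of Object.entries(hostPowers)) {
    if (!HOST_ONLY.includes(name)) guest[name] = power;
  }
  return Object.freeze(guest);
};

// Hypothetical stub powers for illustration.
const hostPowers = {
  evaluate: () => 'ran code',
  makeBundle: () => 'bundle',
  request: () => 'asked the host',
  listMessages: () => [],
};
const guestPowers = makeGuestPowers(hostPowers);
```

The key property is that a guest cannot reach any host-only power even by enumeration: the dangerous entries simply never appear on the guest object.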


## What Would Be Needed to Implement the Theoretical Architecture

To bridge the gap between the current llm branch and the theoretical LLM containment model:

### 1. LLM Integration Layer

**Missing Component:** LLM Agent as Guest

```
Needed:
├── LLM service integration (Ollama, OpenAI, etc.)
├── Automatic guest creation for LLM sessions
├── Message routing between human and LLM
└── Tool invocation interface
```
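The routing piece of this missing layer could look roughly like the following. This is a hedged sketch under stated assumptions: the names `makeMediator`, `requestFromLlm`, and the approval-callback shape are all hypothetical. The point is that an LLM session, running as a guest, can only *request* actions; a human-supplied policy decides before anything executes.

```javascript
// A mediator sits between the LLM guest and the host's powers.
// `approve` is a human-supplied policy callback: (action, payload) => boolean.
const makeMediator = (approve) => {
  const log = [];
  return {
    // The LLM-facing interface: request an action by name.
    requestFromLlm: (action, payload) => {
      const granted = approve(action, payload);
      log.push({ action, granted });
      if (!granted) return { ok: false, reason: 'denied by human' };
      return { ok: true, result: `executed ${action}` };
    },
    // Audit trail of every request, granted or not.
    log: () => log.slice(),
  };
};

// Example policy: the human allows reads but denies evaluation.
const mediator = makeMediator((action) => action === 'read');
```

In a real integration, the granted branch would invoke an attenuated capability rather than return a string, and the log would feed the chat UI's message stream.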

### 2. Compartment-per-Tool System

```js
// Theoretical: each tool evaluated inside its own SES compartment.
// Note: SES's options-bag constructor form uses `__options__: true` as a
// marker, with the options alongside it, rather than an options object.
const toolCompartment = new Compartment({
  __options__: true,
  globals: { /* minimal globals */ },
});

// Current: a single evaluation context per worker.
await E(powers).evaluate(workerName, source, codeNames, petNamePaths, resultPath);
```

### 3. Attenuation Chain Infrastructure

```js
// Theoretical: progressive narrowing. Note that `attenuateForTool` is
// illustrative; no such method exists on Endo guests today.
const llmGuest = await E(host).provideGuest('llm-agent', {
  agentName: 'llm-session',
  // Automatically attenuated powers
});

const toolPowers = await E(llmGuest).attenuateForTool('file-reader', {
  allowedPaths: ['/safe/directory'],
});

// Current: direct host access.
const host = await E(gatewayBootstrap).fetch(endoId);
```

### 4. Membrane Wrapper System

```js
// Theoretical: an active security boundary. `makeMembrane` is a sketch of
// the missing infrastructure, not an existing Endo export.
const membrane = makeMembrane({
  onAccess: (target, property) => logAccess(target, property),
  onInvoke: (target, args) => validateInvocation(target, args),
});

const wrappedCapability = membrane.wrap(capability);
```

## Architectural Differences Summary

| Aspect | Theoretical Architecture | llm Branch |
| --- | --- | --- |
| Primary User | LLM agent | Human via web UI |
| Security Model | Defense in depth via compartments | Human judgment + daemon ACLs |
| Isolation | Per-tool compartments | Per-worker isolation |
| Capability Flow | Automatic attenuation | Manual endowment |
| Trust Boundary | LLM as untrusted guest | Human as trusted host |
| Code Execution | LLM-generated, sandboxed | Human-authored, evaluated |

## Conclusion

The llm branch and the theoretical architecture address different problems:

  • llm branch: Human-operated permission management UI with structured commands and JavaScript evaluation capabilities. It's a productivity tool for humans managing the Endo daemon.

  • Theoretical architecture: LLM containment system with automatic attenuation, compartment isolation, and membrane security. It's a safety framework for untrusted AI code execution.

The llm branch provides foundational infrastructure (CapTP connections, eval with endowments, guest/host management) that could serve as building blocks for the theoretical architecture, but significant additional work would be required to implement LLM-specific containment.

### Key Insight

The llm branch name may be aspirational or a work-in-progress indicator. The current implementation is a polished human interface for Endo, not an LLM containment system. This represents valuable infrastructure but is architecturally distinct from the theoretical AI agent containment model.


## Appendix: File Structure of llm Branch

```
packages/chat/
├── index.html                   # Vite entry point
├── vite.config.js               # Vite configuration
├── vite-endo-plugin.js          # Daemon management plugin
├── scripts/
│   └── gateway-server.js        # WebSocket gateway (251 lines)
├── src/
│   ├── main.js                  # App bootstrap (89 lines)
│   ├── chat.js                  # Main UI (3700+ lines)
│   ├── connection.js            # CapTP setup (166 lines)
│   ├── eval-form.js             # JS evaluation UI (407 lines)
│   ├── command-executor.js      # Command dispatch (229 lines)
│   ├── command-registry.js      # Command definitions (383 lines)
│   └── [14 additional modules]
└── package.json
```

Total additions vs master: ~11,000 lines of production code
