Ax + Endo: DSPy-style LLM Programs in an Object-Capability Runtime

Background

Ax (@ax-llm/ax)

Ax is a TypeScript DSPy implementation for building production LLM applications. Instead of writing prompts, you define signatures -- declarative input/output contracts -- and Ax handles prompt generation, structured output extraction, validation, retries, and optimization.

Key capabilities:

  • Signatures: 'review:string -> sentiment:class "positive, negative, neutral"' -- no prompt engineering (minimal usage sketched after this list)
  • Chain-of-thought: Internal reasoning fields (reasoning!:string) that guide but aren't returned
  • Assertions: Output validation with automatic retry on failure
  • Optimizers: MiPRO, GEPA, ACE -- automatic prompt tuning from examples and metrics
  • 15+ providers: OpenAI, Anthropic, Google, Mistral, Ollama, etc. -- switch with one line
  • Tool calling: Native ReAct pattern with function routing
  • AxFlow: DAG-based workflow orchestration with automatic parallelization
  • MCP support: Model Context Protocol client for tool extensibility
  • Minimal deps: Only @opentelemetry/api and dayjs in production. Apache-2.0.
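
A minimal sketch of the signature bullet above, using the same ai() / ax() / forward() calls that appear later in this document (provider and model names are placeholders):

import { ai, ax } from '@ax-llm/ax';

// Provider handle; model name is a placeholder
const llm = ai({ name: 'ollama', config: { model: 'qwen3' } });

// The classification signature from the list above
const classify = ax('review:string -> sentiment:class "positive, negative, neutral"');

// Ax generates the prompt, calls the model, and extracts/validates the typed output
const { sentiment } = await classify.forward(llm, {
  review: 'The daemon restarted cleanly and my counter survived.',
});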

v16.1.6, published 2026-02-09. Actively maintained (daily releases).

Endo / Llamadrome

Endo is an object-capability runtime for JavaScript. The Endo daemon manages confined guest programs that communicate through durable messages. Guests receive only explicitly granted capabilities -- no ambient authority.

Llamadrome is an LLM agent running as an Endo guest. It receives messages, calls an LLM (Anthropic or Ollama), and proposes code for sandboxed evaluation. The host reviews each proposal and decides which capabilities to bind. Recent llm-long-running-tasks work adds conversation persistence, multi-step result chaining, task threading, and async approval recovery.


Why Ax for Endo

Llamadrome currently manages raw LLM API calls directly -- building message arrays, parsing tool call responses, and looping until stop_reason !== 'tool_use' (see the sketch after this list). This works but leaves several things on the table:

  1. Prompt engineering is manual -- the system prompt is a hand-written string. Ax generates prompts from signatures automatically.
  2. No structured output validation -- Llamadrome trusts the LLM to use tools correctly. Ax validates outputs against typed signatures and retries on failure.
  3. No optimization -- prompt quality is static. Ax's optimizers (MiPRO, GEPA) can tune prompts from examples and metrics automatically.
  4. Provider lock-in -- switching between Anthropic and Ollama requires separate backend implementations with different message formats. Ax abstracts this to a single interface.
  5. No workflow composition -- multi-step tasks are ad-hoc tool call chains. AxFlow provides DAG-based orchestration with dependency analysis and automatic parallelization.
  6. No chain-of-thought control -- internal reasoning fields (! suffix) let the LLM think through steps without exposing internals to the user.
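
For contrast, the shape of the loop being replaced is roughly the following. This is an illustrative reconstruction rather than the actual anthropic-backend.js; the runTool helper and the model id are placeholders.

import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// Illustrative reconstruction of a manual tool-use loop (not the real backend)
async function chatWithTools(messages, tools, runTool) {
  for (;;) {
    const response = await anthropic.messages.create({
      model: 'claude-sonnet-4-5', // placeholder model id
      max_tokens: 4096,
      messages,
      tools,
    });
    messages.push({ role: 'assistant', content: response.content });
    if (response.stop_reason !== 'tool_use') {
      return response; // plain text answer; loop ends
    }
    // Run every requested tool and feed the results back to the model
    const toolResults = [];
    for (const block of response.content) {
      if (block.type === 'tool_use') {
        toolResults.push({
          type: 'tool_result',
          tool_use_id: block.id,
          content: await runTool(block.name, block.input),
        });
      }
    }
    messages.push({ role: 'user', content: toolResults });
  }
}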

Integration Architecture

Layer 1: Replace raw API calls with Ax programs

Replace anthropic.messages.create() / ollama.chat() with Ax signatures:

import { ai, ax } from '@ax-llm/ax';

// Current: hand-rolled Anthropic API call with tool loop
// Proposed: declarative signature

const respond = ax(`
  "You are Llamadrome, an AI assistant in an object-capability system."
  userMessage:string, availableNames:string[] ->
  reasoning!:string "Think through what the user needs",
  response:string "Text response to the user",
  needsCode:boolean "Whether code evaluation is needed"
`);

// Ax handles: prompt generation, structured extraction, validation, retries
const result = await respond.forward(llm, {
  userMessage: 'increment my counter',
  availableNames: await E(powers).list(),
});

The LLM provider becomes a constructor argument, not a code path:

// Switch providers with one line
const llm = ai({
  name: process.env.LLM_BACKEND || 'ollama',
  ...(process.env.LLM_BACKEND === 'anthropic'
    ? { apiKey: process.env.ANTHROPIC_API_KEY }
    : {}),
  config: { model: process.env.LLM_MODEL || 'qwen3' },
});

This eliminates the need for separate anthropic-backend.js and ollama-backend.js files entirely.

Layer 2: Endo tools as Ax functions

Ax has native function/tool calling. Map Endo capabilities to Ax functions:

import { agent } from '@ax-llm/ax';

const endoTools = [
  {
    name: 'define_code',
    description: 'Propose code with named capability slots for host review.',
    func: async ({ source, slots }) => {
      const result = await E(powers).define(source, slots);
      return JSON.stringify(result);
    },
    inputSchema: {
      type: 'object',
      properties: {
        source: { type: 'string' },
        slots: { type: 'object', additionalProperties: { type: 'object' } },
      },
      required: ['source', 'slots'],
    },
  },
  {
    name: 'store_result',
    func: async ({ name, value }) => {
      await E(powers).storeValue(value, name);
      return `Stored under "${name}"`;
    },
    // ...schema
  },
  {
    name: 'list_names',
    func: async () => JSON.stringify(await E(powers).list()),
  },
  {
    name: 'lookup_value',
    func: async ({ name }) => JSON.stringify(await E(powers).lookup(name)),
  },
  {
    name: 'chain_eval',
    func: async ({ source, bindings, resultName }) => {
      const codeNames = Object.keys(bindings);
      const petNamePaths = codeNames.map(k => bindings[k]);
      const result = await E(powers).requestEvaluation(
        source, codeNames, petNamePaths, resultName,
      );
      return JSON.stringify(result);
    },
  },
];

// Agent = signature + tools, with automatic ReAct loop
const llamadrome = agent(
  'userMessage:string -> response:string',
  { functions: endoTools }
);

Ax manages the tool call loop (currently ~40 lines of while/switch in anthropic-backend.js) automatically.
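
Invoking the agent is then a single forward() call (a sketch, assuming agent() returns a program with the same forward() interface as ax()):

// The ReAct loop -- tool selection, execution, and re-prompting -- runs inside forward()
const { response } = await llamadrome.forward(llm, {
  userMessage: 'increment my counter and store the new value',
});
console.log(response); // final text answer after any tool calls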

Layer 3: Assertions for security guardrails

Use Ax assertions to validate LLM outputs before they reach Endo:

const codeProposer = ax(
  'task:string -> source:string, slots:json'
);

// Reject code that tries to access globals
codeProposer.addAssert(({ source }) => {
  const forbidden = ['process', 'require', 'import', 'globalThis', 'window'];
  for (const word of forbidden) {
    if (source.includes(word)) {
      return `Code must not reference "${word}" -- use capability slots instead`;
    }
  }
  return true;
});

// Ensure slots are well-formed (json fields may arrive already parsed,
// so accept either an object or a JSON string)
codeProposer.addAssert(({ slots }) => {
  const parsed = typeof slots === 'string' ? JSON.parse(slots) : slots;
  for (const [key, val] of Object.entries(parsed)) {
    if (!val.label) return `Slot "${key}" must have a label`;
  }
  return true;
});

On assertion failure, Ax automatically retries with the error message as context -- the LLM sees what went wrong and corrects itself.

Layer 4: AxFlow for multi-step task orchestration

Replace ad-hoc store_result -> chain_eval chains with declarative workflows:

import { flow } from '@ax-llm/ax';

const multiStepTask = flow()
  .node('plan', 'task:string -> steps:string[]')
  .node('execute', 'step:string, context:json -> result:string, updatedContext:json')
  .node('summarize', 'results:string[] -> summary:string')
  .execute('plan', (state) => ({ task: state.userTask }))
  .forEach('execute', 'plan.steps', (step, state) => ({
    step,
    context: state.accumulatedContext || {},
  }))
  .execute('summarize', (state) => ({
    results: state.executeResults.map(r => r.result),
  }))
  .returns((state) => ({ summary: state.summarize.summary }));

AxFlow analyzes dependencies and parallelizes independent steps automatically.
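
Running the workflow is then one call (a sketch, assuming the flow exposes the same forward() interface as other Ax programs; the userTask key matches the initial state read by the plan step above):

const { summary } = await multiStepTask.forward(llm, {
  userTask: 'inventory the stored counters and report which ones changed today',
});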

Layer 5: Prompt optimization with MiPRO

Once the agent has run enough conversations, optimize prompts from real data:

import { AxMiPRO } from '@ax-llm/ax';

const optimizer = new AxMiPRO({
  metric: ({ prediction, expected }) => {
    // Score based on: did the host approve the code? Was the result correct?
    return prediction.response === expected.response ? 1 : 0;
  },
});

// Train on conversation history (already persisted via storeValue)
const optimized = await optimizer.compile(
  respond, // the signature program
  llm,
  trainingExamples, // from saved conversation state
);

// Save optimized program for future use
await E(powers).storeValue(optimized.export(), 'optimized-prompt');

What Changes in Llamadrome

Current                                  | With Ax
-----------------------------------------|------------------------------------------------------------
anthropic-backend.js (423 lines)         | Eliminated -- Ax provider abstraction
ollama-backend.js (113 lines)            | Eliminated -- same Ax program, different ai() config
Hand-written system prompt (141 lines)   | Ax generates from signatures; system context via description field
Manual tool call loop (~40 lines)        | Ax agent() handles ReAct automatically
No output validation                     | Assertions with automatic retry
Static prompt quality                    | MiPRO/GEPA optimization from conversation history
Ad-hoc multi-step chaining               | AxFlow DAG orchestration
Raw @anthropic-ai/sdk + ollama deps      | Single @ax-llm/ax dep (2 transitive deps)

Preserved from current architecture:

  • Conversation persistence via storeValue / loadConversation (Ax's AxMemory can serialize to this; see the sketch after this list)
  • Async approval recovery for define_code (checkpoint before blocking tools stays the same)
  • Thread support (orthogonal to LLM layer)
  • The makeExo / help() interface for the guest module
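
A minimal sketch of that persistence path, using only the powers methods already shown in this document (the saveConversation / loadConversation helpers and the 'conversation-state' pet name are hypothetical; the exact AxMemory serialization hooks would need to be confirmed):

// Persist a plain message array through the guest's storeValue capability
const saveConversation = async (powers, messages, name = 'conversation-state') => {
  // JSON round-trip produces a plain, copyable structure before hardening
  await E(powers).storeValue(harden(JSON.parse(JSON.stringify(messages))), name);
};

// Restore it on the next run; fall back to an empty history if nothing is stored yet
const loadConversation = async (powers, name = 'conversation-state') => {
  try {
    return await E(powers).lookup(name);
  } catch {
    return [];
  }
};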

Dependency Assessment

                  | @anthropic-ai/sdk + ollama (current)       | @ax-llm/ax (proposed)
------------------|--------------------------------------------|------------------------------------------------------
Production deps   | 2 packages, each with own transitive tree  | 1 package, 2 transitive deps (@opentelemetry/api, dayjs)
License           | Apache-2.0 + MIT                           | Apache-2.0
Maintenance       | Both active                                | Daily releases (v16.1.6 = 2026-02-09)
Provider coverage | Anthropic + Ollama only                    | 15+ providers via unified interface
Bundle            | Two separate SDKs                          | One library

Ax replaces both @anthropic-ai/sdk and ollama as dependencies -- net reduction in dep surface.


SES/Hardened JS Compatibility Considerations

Ax uses standard ES module patterns and doesn't rely on:

  • eval() or new Function() (would be blocked by SES lockdown)
  • Global mutation (frozen globals in SES)
  • Node.js built-ins beyond fetch (available in Endo's compartment)

Potential friction points:

  • @opentelemetry/api: Uses global registration (globalThis). May need shimming or disabling under SES lockdown. Ax works without tracing enabled.
  • dayjs: Lightweight, mostly pure. Should work under SES with possible minor patching.
  • Streaming: Ax uses standard ReadableStream / async iterators. Compatible with Endo's eventual send patterns.
  • harden() on Ax objects: Ax returns plain objects/classes. The Endo integration layer would harden() the returned results before passing them through the capability boundary (sketched below).

A compatibility spike would be needed to confirm Ax runs clean under lockdown(). The most likely issue is OpenTelemetry's global registration, which can be disabled.
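
To illustrate the hardening point above, the integration layer could wrap each Ax call and freeze its result before it crosses the capability boundary (a sketch; hardenedForward is a hypothetical helper, and harden() is the SES global available after lockdown()):

// Run an Ax program, then harden the result before handing it to other guests
const hardenedForward = async (program, llmHandle, inputs) => {
  const result = await program.forward(llmHandle, inputs);
  // Copy to plain data first so hardening doesn't freeze Ax-internal objects
  return harden(JSON.parse(JSON.stringify(result)));
};

const reply = await hardenedForward(respond, llm, {
  userMessage: 'increment my counter',
  availableNames: await E(powers).list(),
});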


Implementation Sketch

Phase 1: Provider unification

  • Add @ax-llm/ax dependency, remove @anthropic-ai/sdk and ollama
  • Create ax-backend.js replacing both backend files
  • Wire ai() provider from env vars

Phase 2: Signature-based programs

  • Convert system prompt to Ax signatures
  • Map Endo tools to Ax functions
  • Replace manual tool loop with agent()

Phase 3: Assertions and validation

  • Add security assertions (no global references in proposed code)
  • Add structural assertions (well-formed slots, valid pet names)

Phase 4: Workflow orchestration

  • Define multi-step task templates with AxFlow
  • Automatic parallelization of independent steps

Phase 5: Prompt optimization

  • Collect training data from conversation history
  • Run MiPRO optimization periodically
  • Persist optimized prompts via storeValue