Ax is a TypeScript DSPy implementation for building production LLM applications. Instead of writing prompts, you define signatures -- declarative input/output contracts -- and Ax handles prompt generation, structured output extraction, validation, retries, and optimization.
Key capabilities:
- Signatures: `review:string -> sentiment:class "positive, negative, neutral"` -- no prompt engineering
- Chain-of-thought: Internal reasoning fields (`reasoning!:string`) that guide but aren't returned
- Assertions: Output validation with automatic retry on failure
- Optimizers: MiPRO, GEPA, ACE -- automatic prompt tuning from examples and metrics
- 15+ providers: OpenAI, Anthropic, Google, Mistral, Ollama, etc. -- switch with one line
- Tool calling: Native ReAct pattern with function routing
- AxFlow: DAG-based workflow orchestration with automatic parallelization
- MCP support: Model Context Protocol client for tool extensibility
- Minimal deps: Only `@opentelemetry/api` and `dayjs` in production. Apache-2.0.
v16.1.6, published 2026-02-09. Actively maintained (daily releases).
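
As a concrete illustration of the signature style, here is a minimal sketch (the review text and model choice are placeholders; provider configuration is covered later in this document):

```typescript
import { ai, ax } from '@ax-llm/ax';

// The signature is the whole "prompt": typed inputs, typed outputs.
const classify = ax(
  'review:string -> sentiment:class "positive, negative, neutral"'
);

const llm = ai({ name: 'ollama', config: { model: 'qwen3' } });
const { sentiment } = await classify.forward(llm, {
  review: 'Setup took five minutes and nothing broke.',
});
```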
Endo is an object-capability runtime for JavaScript. The Endo daemon manages confined guest programs that communicate through durable messages. Guests receive only explicitly granted capabilities -- no ambient authority.
Llamadrome is an LLM agent running as an Endo guest. It receives messages, calls an LLM (Anthropic or Ollama), and proposes code for sandboxed evaluation. The host reviews each proposal and decides which capabilities to bind. Recent llm-long-running-tasks work adds conversation persistence, multi-step result chaining, task threading, and async approval recovery.
Llamadrome currently manages raw LLM API calls directly -- building message arrays, parsing tool call responses, looping until `stop_reason !== 'tool_use'`. This works but leaves several things on the table:
- Prompt engineering is manual -- the system prompt is a hand-written string. Ax generates optimal prompts from signatures.
- No structured output validation -- Llamadrome trusts the LLM to use tools correctly. Ax validates outputs against typed signatures and retries on failure.
- No optimization -- prompt quality is static. Ax's optimizers (MiPRO, GEPA) can tune prompts from examples and metrics automatically.
- Provider lock-in -- switching between Anthropic and Ollama requires separate backend implementations with different message formats. Ax abstracts this to a single interface.
- No workflow composition -- multi-step tasks are ad-hoc tool call chains. AxFlow provides DAG-based orchestration with dependency analysis and automatic parallelization.
- No chain-of-thought control -- Ax's internal reasoning fields (the `!` suffix) let the LLM think through steps without exposing internals to the user.
Replace `anthropic.messages.create()` / `ollama.chat()` with Ax signatures:

```typescript
import { ai, ax } from '@ax-llm/ax';
// Current: hand-rolled Anthropic API call with tool loop
// Proposed: declarative signature
const respond = ax(`
"You are Llamadrome, an AI assistant in an object-capability system."
userMessage:string, availableNames:string[] ->
reasoning!:string "Think through what the user needs",
response:string "Text response to the user",
needsCode:boolean "Whether code evaluation is needed"
`);
// Ax handles: prompt generation, structured extraction, validation, retries
const result = await respond.forward(llm, {
userMessage: 'increment my counter',
availableNames: await E(powers).list(),
});
```

The LLM provider becomes a constructor argument, not a code path:
```typescript
// Switch providers with one line
const llm = ai({
name: process.env.LLM_BACKEND || 'ollama',
...(process.env.LLM_BACKEND === 'anthropic'
? { apiKey: process.env.ANTHROPIC_API_KEY }
: {}),
config: { model: process.env.LLM_MODEL || 'qwen3' },
});
```

This eliminates the need for separate `anthropic-backend.js` and `ollama-backend.js` files entirely.
Ax has native function/tool calling. Map Endo capabilities to Ax functions:
```typescript
import { agent } from '@ax-llm/ax';
const endoTools = [
{
name: 'define_code',
description: 'Propose code with named capability slots for host review.',
func: async ({ source, slots }) => {
const result = await E(powers).define(source, slots);
return JSON.stringify(result);
},
inputSchema: {
type: 'object',
properties: {
source: { type: 'string' },
slots: { type: 'object', additionalProperties: { type: 'object' } },
},
required: ['source', 'slots'],
},
},
{
name: 'store_result',
func: async ({ name, value }) => {
await E(powers).storeValue(value, name);
return `Stored under "${name}"`;
},
// ...schema
},
{
name: 'list_names',
func: async () => JSON.stringify(await E(powers).list()),
},
{
name: 'lookup_value',
func: async ({ name }) => JSON.stringify(await E(powers).lookup(name)),
},
{
name: 'chain_eval',
func: async ({ source, bindings, resultName }) => {
const codeNames = Object.keys(bindings);
const petNamePaths = codeNames.map(k => bindings[k]);
const result = await E(powers).requestEvaluation(
source, codeNames, petNamePaths, resultName,
);
return JSON.stringify(result);
},
},
];
// Agent = signature + tools, with automatic ReAct loop
const llamadrome = agent(
'userMessage:string -> response:string',
{ functions: endoTools }
);
```

Ax manages the tool call loop (currently ~40 lines of `while`/`switch` in `anthropic-backend.js`) automatically.
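
Invoking the agent is then a single `forward()` call with the same shape as a plain signature. A hedged usage sketch (the message text is illustrative; `llm` is the provider configured above):

```typescript
// Ax runs the ReAct loop internally, calling the Endo-backed functions
// above as needed, and returns the structured output fields.
const { response } = await llamadrome.forward(llm, {
  userMessage: 'store the number 42 under "answer", then read it back',
});
```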
Use Ax assertions to validate LLM outputs before they reach Endo:
```typescript
const codeProposer = ax(
'task:string -> source:string, slots:json'
);
// Reject code that tries to access globals
codeProposer.addAssert(({ source }) => {
const forbidden = ['process', 'require', 'import', 'globalThis', 'window'];
for (const word of forbidden) {
if (source.includes(word)) {
return `Code must not reference "${word}" -- use capability slots instead`;
}
}
return true;
});
// Ensure slots are well-formed
codeProposer.addAssert(({ slots }) => {
const parsed = typeof slots === 'string' ? JSON.parse(slots) : slots; // slots:json may arrive already parsed
for (const [key, val] of Object.entries(parsed)) {
if (!val.label) return `Slot "${key}" must have a label`;
}
return true;
});
```

On assertion failure, Ax automatically retries with the error message as context -- the LLM sees what went wrong and corrects itself.
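
In practice, callers never see unvalidated output. A hedged usage sketch (the task text is illustrative):

```typescript
const proposal = await codeProposer.forward(llm, {
  task: 'add an increment function that uses the "counter" capability slot',
});
// By the time forward() resolves, proposal.source references no forbidden
// globals and proposal.slots has labeled slots -- otherwise Ax would have
// retried and ultimately thrown.
```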
Replace ad-hoc `store_result` -> `chain_eval` chains with declarative workflows:

```typescript
import { flow } from '@ax-llm/ax';
const multiStepTask = flow()
.node('plan', 'task:string -> steps:string[]')
.node('execute', 'step:string, context:json -> result:string, updatedContext:json')
.node('summarize', 'results:string[] -> summary:string')
.execute('plan', (state) => ({ task: state.userTask }))
.forEach('execute', 'plan.steps', (step, state) => ({
step,
context: state.accumulatedContext || {},
}))
.execute('summarize', (state) => ({
results: state.executeResults.map(r => r.result),
}))
.returns((state) => ({ summary: state.summarize.summary }));
```

AxFlow analyzes dependencies and parallelizes independent steps automatically.
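
Running the workflow would look like the sketch below, assuming flow programs expose the same `forward()` interface as plain signatures (to be confirmed against the Ax docs; the task text is illustrative):

```typescript
const { summary } = await multiStepTask.forward(llm, {
  userTask: 'collect the three stored counters and report their total',
});
```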
Once the agent has run enough conversations, optimize prompts from real data:
```typescript
import { AxMiPRO } from '@ax-llm/ax';
const optimizer = new AxMiPRO({
metric: ({ prediction, expected }) => {
// Score based on: did the host approve the code? Was the result correct?
return prediction.response === expected.response ? 1 : 0;
},
});
// Train on conversation history (already persisted via storeValue)
const optimized = await optimizer.compile(
respond, // the signature program
llm,
trainingExamples, // from saved conversation state
);
// Save optimized program for future use
await E(powers).storeValue(optimized.export(), 'optimized-prompt');
```

| Current | With Ax |
|---|---|
| `anthropic-backend.js` (423 lines) | Eliminated -- Ax provider abstraction |
| `ollama-backend.js` (113 lines) | Eliminated -- same Ax program, different `ai()` config |
| Hand-written system prompt (141 lines) | Ax generates from signatures; system context via description field |
| Manual tool call loop (~40 lines) | Ax `agent()` handles ReAct automatically |
| No output validation | Assertions with automatic retry |
| Static prompt quality | MiPRO/GEPA optimization from conversation history |
| Ad-hoc multi-step chaining | AxFlow DAG orchestration |
| Raw `@anthropic-ai/sdk` + `ollama` deps | Single `@ax-llm/ax` dep (2 transitive deps) |
- Conversation persistence via `storeValue`/`loadConversation` (Ax's `AxMemory` can serialize to this; see the sketch after this list)
- Async approval recovery for `define_code` (checkpoint before blocking tools stays the same)
- Thread support (orthogonal to the LLM layer)
- The `makeExo`/`help()` interface for the guest module
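
For the persistence point above, a minimal sketch of what stays the same (the transcript shape and pet name are illustrative; whether the transcript comes from Ax's `AxMemory` or from our own bookkeeping is a spike question):

```typescript
import { E } from '@endo/far';

// Store a plain, hardened transcript through the existing Endo capability.
// harden() is the global provided by the SES lockdown environment.
const transcript = harden([
  { role: 'user', content: 'increment my counter' },
  { role: 'assistant', content: 'Done: counter is now 3.' },
]);
await E(powers).storeValue(transcript, 'conversation-state');
```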
| | `@anthropic-ai/sdk` + `ollama` (current) | `@ax-llm/ax` (proposed) |
|---|---|---|
| Production deps | 2 packages, each with its own transitive tree | 1 package, 2 transitive deps (`@opentelemetry/api`, `dayjs`) |
| License | Apache-2.0 + MIT | Apache-2.0 |
| Maintenance | Both active | Daily releases (v16.1.6 = 2026-02-09) |
| Provider coverage | Anthropic + Ollama only | 15+ providers via unified interface |
| Bundle | Two separate SDKs | One library |

Ax replaces both `@anthropic-ai/sdk` and `ollama` as dependencies -- a net reduction in dependency surface.
Ax uses standard ES module patterns and doesn't rely on:
- `eval()` or `new Function()` (would be blocked by SES lockdown)
- Global mutation (frozen globals in SES)
- Node.js built-ins beyond `fetch` (available in Endo's compartment)
Potential friction points:
- `@opentelemetry/api`: uses global registration (`globalThis`). May need shimming or disabling under SES lockdown; Ax works without tracing enabled.
- `dayjs`: lightweight, mostly pure. Should work under SES with possible minor patching.
- Streaming: Ax uses standard `ReadableStream` / async iterators, compatible with Endo's eventual send patterns.
- `harden()` on Ax objects: Ax returns plain objects/classes. The Endo integration layer would `harden()` the returned results before passing them through the capability boundary (see the sketch below).

A compatibility spike would be needed to confirm Ax runs clean under `lockdown()`. The most likely issue is OpenTelemetry's global registration, which can be disabled.
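
A minimal sketch of that hardening step (the wrapper name and result shape are illustrative; `respond` is the signature program defined earlier):

```typescript
import { E } from '@endo/far';

// Ax returns ordinary mutable objects; freeze them before they cross the
// capability boundary so other vats cannot mutate shared state.
const respondHardened = async (powers, llm, userMessage) => {
  const result = await respond.forward(llm, {
    userMessage,
    availableNames: await E(powers).list(),
  });
  return harden({ response: result.response, needsCode: result.needsCode });
};
```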
- Add `@ax-llm/ax` dependency, remove `@anthropic-ai/sdk` and `ollama`
- Create `ax-backend.js` replacing both backend files
- Wire `ai()` provider from env vars
- Convert system prompt to Ax signatures
- Map Endo tools to Ax functions
- Replace manual tool loop with `agent()`
- Add security assertions (no global references in proposed code)
- Add structural assertions (well-formed slots, valid pet names)
- Define multi-step task templates with AxFlow
- Automatic parallelization of independent steps
- Collect training data from conversation history
- Run MiPRO optimization periodically
- Persist optimized prompts via `storeValue`
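
Putting the first phases together, the proposed `ax-backend.js` might look roughly like the sketch below (`makeEndoTools` is a hypothetical helper wrapping the tool definitions shown earlier; the export shape is illustrative):

```typescript
import { ai, agent } from '@ax-llm/ax';

export const makeLlmBackend = (powers, env = process.env) => {
  // Provider selection stays a configuration concern, not a code path.
  const llm = ai({
    name: env.LLM_BACKEND || 'ollama',
    ...(env.LLM_BACKEND === 'anthropic'
      ? { apiKey: env.ANTHROPIC_API_KEY }
      : {}),
    config: { model: env.LLM_MODEL || 'qwen3' },
  });

  // One signature plus Endo-backed tools; Ax owns prompting and the ReAct loop.
  const llamadrome = agent('userMessage:string -> response:string', {
    functions: makeEndoTools(powers), // hypothetical: wraps define_code, store_result, ...
  });

  return harden({
    respond: userMessage => llamadrome.forward(llm, { userMessage }),
  });
};
```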