Implements the Recursive Language Model pattern using native Jido/BEAM semantics. No regex parsing. No Python subprocess. No exec().

Supersedes RLM_STRATEGY.md. See RLM_STRATEGY_CRITIQUE.md for rationale.
The RLM paper's contribution is a methodology — iterative context exploration with sub-LLM delegation. The Python REPL is an implementation detail. Every RLM primitive maps cleanly to an existing BEAM/Jido concept:
| RLM (Python) | BEAM/Jido Native |
|---|---|
| REPL persistent variables | Workspace state — ETS table keyed by {request_id, :workspace} |
| exec() code blocks | Typed tool calls — Jido Actions via Directive.ToolExec |
| llm_query() sub-calls | llm.subquery_batch Action — Task.async_stream under agent's TaskSupervisor |
| Regex response parsing | API-native tool_calls — ReqLLM extracts structured ToolCall structs |
| FINAL() marker | Standard final answer — ReAct machine's :final_answer type |
| Iteration loop | ReAct FSM — awaiting_llm ↔ awaiting_tool cycle |
| Context in temp file | ContextStore — ETS or process state, accessed by reference |
RLM is implemented as a strategy adapter that composes with the existing ReAct.Machine, providing RLM-specific tools, prompts, and context/workspace management. No machine fork needed.
┌──────────────────────────────────────────────────────────┐
│ Jido.AI.Strategies.RLM (Strategy adapter) │
│ │ │
│ ├── Jido.AI.ReAct.Machine (reused, unmodified) │
│ │ └── idle → awaiting_llm → awaiting_tool → ... │
│ │ │
│ ├── Jido.AI.RLM.Prompts │
│ │ ├── system_prompt/1 (exploration methodology) │
│ │ └── next_step_prompt/1 (per-iteration guidance) │
│ │ │
│ ├── Jido.AI.RLM.ContextStore │
│ │ └── store/fetch/delete context by reference │
│ │ │
│ ├── Jido.AI.RLM.WorkspaceStore │
│ │ └── per-request exploration state (ETS) │
│ │ │
│ ├── RLM Exploration Tools (Jido Actions) │
│ │ ├── Context.Stats │
│ │ ├── Context.Chunk │
│ │ ├── Context.ReadChunk │
│ │ ├── Context.Search │
│ │ ├── Workspace.Note │
│ │ ├── Workspace.GetSummary │
│ │ └── LLM.SubqueryBatch │
│ │ │
│ └── Directive lifting (LLMStream + ToolExec) │
│ └── injects next_step_prompt + workspace summary │
└──────────────────────────────────────────────────────────┘
The ReAct machine already solves:
- Tool call correlation IDs and parallel execution
- Iteration limits with max_iterations guard
- Busy rejection (EmitRequestError)
- Deadlock avoidance (EmitToolError for unknown tools)
- Streaming token accumulation
- Thread-based conversation history
- Usage tracking and telemetry
RLM needs different tools and prompts, not different states. The strategy adapter layer handles the differences:
- Injects RLM system prompt instead of generic ReAct prompt
- Appends workspace-aware next_step_prompt before each LLM call
- Manages context/workspace lifecycle around the machine
This is the same pattern TRM uses: TRM routes react.llm.response signals to its own action atoms and builds phase-specific prompts, while reusing LLMStream directives.
lib/jido_ai/
├── strategies/
│ └── rlm.ex # Strategy adapter
├── rlm/
│ ├── context_store.ex # Context storage (ETS/inline)
│ ├── workspace_store.ex # Per-request exploration state
│ └── prompts.ex # System + per-iteration prompts
├── actions/rlm/
│ ├── context/
│ │ ├── stats.ex # Context size/type info
│ │ ├── chunk.ex # Chunking strategy
│ │ ├── read_chunk.ex # Fetch chunk text
│ │ └── search.ex # Substring/regex search
│ ├── workspace/
│ │ ├── note.ex # Record hypothesis/finding
│ │ └── get_summary.ex # Compact workspace summary
│ └── llm/
│ └── subquery_batch.ex # Concurrent sub-LLM delegation
├── agents/strategies/
│ └── rlm_agent.ex # Agent macro
└── agents/examples/
└── needle_haystack_agent.ex # Example agent
scripts/
└── test_rlm_agent.exs # Runnable demo
Stores the large input context and returns a reference that tools use to access it. Prevents copying GB-scale data through signals and messages.
| Size | Backend | Reference |
|---|---|---|
| < 2 MB (configurable) | Inline in run_tool_context | %{backend: :inline, data: binary} |
| 2 MB – 200 MB | Private ETS table (owned by agent process) | %{backend: :ets, table: tid, key: {request_id, :context}, size_bytes: n} |
| > 200 MB (v2) | Temp file | %{backend: :file, path: "...", size_bytes: n} |
defmodule Jido.AI.RLM.ContextStore do
@type context_ref :: %{backend: :inline | :ets | :file, ...}
@spec put(binary(), String.t(), keyword()) :: {:ok, context_ref()}
def put(context, request_id, opts \\ [])
@spec fetch(context_ref()) :: {:ok, binary()} | {:error, :not_found}
def fetch(context_ref)
@spec fetch_range(context_ref(), non_neg_integer(), non_neg_integer()) :: {:ok, binary()}
def fetch_range(context_ref, byte_offset, length)
@spec delete(context_ref()) :: :ok
def delete(context_ref)
@spec size(context_ref()) :: non_neg_integer()
def size(context_ref)
end

The ETS table is private, owned by the agent process — automatically freed on process crash/termination. No manual cleanup needed for the crash case.
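A minimal sketch of the tiered put/fetch logic, assuming the agent passes its private ETS table in via opts; the option names, defaults, and clamping here are illustrative, not the final implementation:

```elixir
defmodule Jido.AI.RLM.ContextStore do
  @moduledoc false
  # Sketch only: tier selection between inline and ETS-backed storage.
  @inline_threshold 2_000_000

  def put(context, request_id, opts \\ []) when is_binary(context) do
    threshold = Keyword.get(opts, :inline_threshold, @inline_threshold)
    size = byte_size(context)

    if size < threshold do
      {:ok, %{backend: :inline, data: context, size_bytes: size}}
    else
      # Assumes a private ETS table owned by the agent process, passed via opts.
      table = Keyword.fetch!(opts, :table)
      key = {request_id, :context}
      true = :ets.insert(table, {key, context})
      {:ok, %{backend: :ets, table: table, key: key, size_bytes: size}}
    end
  end

  def fetch(%{backend: :inline, data: data}), do: {:ok, data}

  def fetch(%{backend: :ets, table: table, key: key}) do
    case :ets.lookup(table, key) do
      [{^key, data}] -> {:ok, data}
      [] -> {:error, :not_found}
    end
  end

  def fetch_range(context_ref, byte_offset, length) do
    with {:ok, data} <- fetch(context_ref) do
      # Clamp the span so out-of-range reads return the available bytes.
      span = max(min(length, byte_size(data) - byte_offset), 0)
      {:ok, binary_part(data, byte_offset, span)}
    end
  end
end
```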
Per-request exploration state that tools read from and write to. This is the BEAM equivalent of "REPL persistent variables" — explicit, typed, inspectable.
%{
query: "Find the magic number",
context_ref: %{backend: :ets, ...},
chunks: %{
strategy: :lines,
size: 1000,
index: %{"c_0" => %{byte_start: 0, byte_end: 12345, lines: "1-1000"}, ...}
},
hits: [
%{chunk_id: "c_47", offset: 47231, snippet: "The magic number is 1298418"}
],
notes: [
%{kind: :hypothesis, text: "Magic number appears in middle third", at: ~U[...]}
],
subquery_results: [
%{chunk_id: "c_47", model: "anthropic:claude-haiku-4-5", answer: "1298418"}
]
}

defmodule Jido.AI.RLM.WorkspaceStore do
@spec init(String.t(), map()) :: {:ok, workspace_ref()}
def init(request_id, seed \\ %{})
@spec get(workspace_ref()) :: map()
def get(workspace_ref)
@spec update(workspace_ref(), (map() -> map())) :: :ok
def update(workspace_ref, fun)
@spec summary(workspace_ref(), keyword()) :: String.t()
def summary(workspace_ref, opts \\ [])
@spec delete(workspace_ref()) :: :ok
def delete(workspace_ref)
end

Storage: ETS keyed by {request_id, :workspace}. Same table as ContextStore (one private ETS table per agent for all RLM data).
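A minimal sketch of the rendering behind summary/2, operating on the workspace map shown above after it has been resolved via get/1; the module name, field handling, and wording are illustrative:

```elixir
defmodule Jido.AI.RLM.WorkspaceStore.Render do
  @moduledoc false
  # Sketch only: turn the workspace map into the compact text injected into
  # the next-step prompt.
  def render(workspace, opts \\ []) do
    max_chars = Keyword.get(opts, :max_chars, 2_000)

    chunk_count = workspace |> Map.get(:chunks, %{}) |> Map.get(:index, %{}) |> map_size()
    hits = Map.get(workspace, :hits, [])
    notes = Map.get(workspace, :notes, [])
    subqueries = Map.get(workspace, :subquery_results, [])

    header = [
      "Chunks indexed: #{chunk_count}",
      "Search hits: #{length(hits)}",
      "Sub-LLM results: #{length(subqueries)}"
    ]

    note_lines = Enum.map(notes, fn %{kind: kind, text: text} -> "Note (#{kind}): #{text}" end)

    (header ++ note_lines)
    |> Enum.join("\n")
    |> String.slice(0, max_chars)
  end
end
```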
The LLM sees workspace state through two channels:
- Tool results — each tool returns its results as structured data, which the ReAct machine appends to the Thread as tool_result messages
- Next-step prompt — before each LLM call, the strategy injects a user message containing WorkspaceStore.summary/2 (compact text: "You've searched 3 chunks, found 1 hit, have 2 hypotheses...")
This replaces the Python pattern of the model accessing REPL variables directly.
Each tool is a standard Jido.Action — typed schema, run/2 function, composable via ToolAdapter. All tools receive context_ref and workspace_ref through the tool execution context (the context argument to run/2), which comes from run_tool_context.
use Jido.Action,
name: "context_stats",
description: "Get size and structure information about the loaded context",
schema: Zoi.object(%{})
def run(_params, context) do
ref = context.context_ref
size = ContextStore.size(ref)
{:ok, sample} = ContextStore.fetch_range(ref, 0, min(500, size))
{:ok, %{size_bytes: size, approx_lines: estimate_lines(size, sample), encoding: detect_encoding(sample)}}
end

use Jido.Action,
name: "context_chunk",
description: "Split context into chunks and index them for exploration",
schema: Zoi.object(%{
strategy: Zoi.enum(["lines", "bytes"]) |> Zoi.default("lines"),
size: Zoi.integer() |> Zoi.default(1000),
overlap: Zoi.integer() |> Zoi.default(0),
max_chunks: Zoi.integer() |> Zoi.default(500),
preview_bytes: Zoi.integer() |> Zoi.default(100)
})
def run(params, context) do
# Compute chunk boundaries from context_ref
# Store chunk index in workspace
# Return bounded list of chunk descriptors with previews
{:ok, %{chunk_count: n, chunks: [%{id: "c_0", lines: "1-1000", preview: "..."}]}}
end

use Jido.Action,
name: "context_read_chunk",
description: "Read the text content of a specific chunk",
schema: Zoi.object(%{
chunk_id: Zoi.string(),
max_bytes: Zoi.integer() |> Zoi.default(50_000)
})
def run(params, context) do
# Look up chunk boundaries from workspace
# Fetch text from context_ref using byte range
{:ok, %{chunk_id: params.chunk_id, text: chunk_text, truncated: false}}
end

use Jido.Action,
name: "context_search",
description: "Search the context for a substring or regex pattern",
schema: Zoi.object(%{
query: Zoi.string(),
mode: Zoi.enum(["substring", "regex"]) |> Zoi.default("substring"),
limit: Zoi.integer() |> Zoi.default(20),
window_bytes: Zoi.integer() |> Zoi.default(200)
})
def run(params, context) do
# Search context_ref for matches
# Store hits in workspace
# Return hits with surrounding context snippets
{:ok, %{total_matches: n, hits: [%{offset: 47231, chunk_id: "c_47", snippet: "..."}]}}
end

use Jido.Action,
name: "workspace_note",
description: "Record a hypothesis, finding, or plan in the exploration workspace",
schema: Zoi.object(%{
text: Zoi.string(),
kind: Zoi.enum(["hypothesis", "finding", "plan"]) |> Zoi.default("finding")
})
def run(params, context) do
note = %{kind: params.kind, text: params.text, at: DateTime.utc_now()}
WorkspaceStore.update(context.workspace_ref, fn ws ->
Map.update(ws, :notes, [note], &(&1 ++ [note]))
end)
summary = WorkspaceStore.summary(context.workspace_ref)
{:ok, %{recorded: true, workspace_summary: summary}}
end

use Jido.Action,
name: "workspace_summary",
description: "Get a compact summary of exploration progress so far",
schema: Zoi.object(%{
max_chars: Zoi.integer() |> Zoi.default(2000)
})
def run(params, context) do
summary = WorkspaceStore.summary(context.workspace_ref, max_chars: params.max_chars)
{:ok, %{summary: summary}}
end

The BEAM advantage. Fan out sub-LLM calls concurrently under the agent's TaskSupervisor.
use Jido.Action,
name: "llm_subquery_batch",
description: "Run a sub-LLM query across multiple chunks concurrently. Use for map-reduce style analysis.",
schema: Zoi.object(%{
chunk_ids: Zoi.list(Zoi.string()),
prompt: Zoi.string(),
model: Zoi.string() |> Zoi.optional(),
max_concurrency: Zoi.integer() |> Zoi.default(10),
timeout: Zoi.integer() |> Zoi.default(60_000),
max_chunk_bytes: Zoi.integer() |> Zoi.default(50_000)
})
def run(params, context) do
model = params[:model] || context[:recursive_model] || "anthropic:claude-haiku-4-5"
workspace = WorkspaceStore.get(context.workspace_ref)
results =
params.chunk_ids
|> Task.async_stream(
fn chunk_id ->
chunk_text = fetch_chunk_text(chunk_id, workspace, context.context_ref, params.max_chunk_bytes)
prompt = "#{params.prompt}\n\nContext:\n#{chunk_text}"
case ReqLLM.Generation.generate_text(model, prompt, []) do
{:ok, response} -> {:ok, %{chunk_id: chunk_id, answer: response.text}}
{:error, reason} -> {:error, %{chunk_id: chunk_id, error: inspect(reason)}}
end
end,
max_concurrency: params.max_concurrency,
timeout: params.timeout,
on_timeout: :kill_task
)
|> Enum.map(fn
{:ok, result} -> result
{:exit, :timeout} -> {:error, %{error: "timeout"}}
{:exit, reason} -> {:error, %{error: inspect(reason)}}
end)
# Store results in workspace
WorkspaceStore.update(context.workspace_ref, fn ws ->
Map.update(ws, :subquery_results, results, &(&1 ++ results))
end)
successes = Enum.filter(results, &match?({:ok, _}, &1)) |> Enum.map(&elem(&1, 1))
errors = Enum.filter(results, &match?({:error, _}, &1)) |> length()
{:ok, %{completed: length(successes), errors: errors, results: successes}}
end

Two prompt builders, following the pattern from Jido.AI.TRM.Reasoning.
Teaches the LLM the exploration methodology and available tools. No mention of code blocks or REPL — the LLM uses standard tool calling.
def system_prompt(config) do
tools_desc = format_tool_descriptions(config.tools)
"""
You are a data analyst exploring a large context to answer a user's question.
You have access to a workspace that persists across iterations. Use the available
tools to systematically explore the context and build toward an answer.
## Available Tools
#{tools_desc}
## Methodology
1. Start by checking context stats to understand size and structure
2. Create a chunking plan appropriate for the context size
3. Search for relevant patterns, or delegate analysis to sub-LLM queries
4. Record hypotheses and findings in the workspace
5. When confident, provide your final answer directly (no tool calls)
## Guidelines
- Never try to read the entire context at once — chunk and search strategically
- Use llm_subquery_batch for map-reduce style analysis across many chunks
- Record your reasoning with workspace_note so you don't lose track
- When you have enough evidence, answer directly — do not call more tools
"""
end

Injected before each LLM call with current workspace state.
def next_step_prompt(%{query: query, iteration: iteration, workspace_summary: summary}) do
base = case iteration do
1 ->
"""
You have not explored the context yet. Start by examining its structure.
Query: "#{query}"
"""
_ ->
"""
Continue exploring to answer the query: "#{query}"
## Exploration Progress
#{summary}
Decide your next action: search, delegate to sub-LLM, or provide your final answer.
"""
end
%{role: :user, content: base}
end

Thin adapter following the exact pattern of Strategies.ReAct and Strategies.TRM. Uses ReAct.Machine internally.
use Jido.Agent,
name: "my_rlm_agent",
strategy: {
Jido.AI.Strategies.RLM,
model: "anthropic:claude-sonnet-4-20250514",
recursive_model: "anthropic:claude-haiku-4-5",
max_iterations: 15,
context_inline_threshold: 2_000_000, # 2 MB
max_concurrency: 10
}

@action_specs %{
@start => %{
schema: Zoi.object(%{
query: Zoi.string(),
context: Zoi.any() |> Zoi.optional(),
context_ref: Zoi.map() |> Zoi.optional(),
tool_context: Zoi.map() |> Zoi.optional()
}),
doc: "Start RLM context exploration with a query and large context",
name: "rlm.start"
},
@llm_result => %{
schema: Zoi.object(%{call_id: Zoi.string(), result: Zoi.any()}),
doc: "Handle LLM response",
name: "rlm.llm_result"
},
@tool_result => %{
schema: Zoi.object(%{call_id: Zoi.string(), tool_name: Zoi.string(), result: Zoi.any()}),
doc: "Handle tool execution result",
name: "rlm.tool_result"
},
@llm_partial => %{
schema: Zoi.object(%{call_id: Zoi.string(), delta: Zoi.string(), chunk_type: Zoi.atom() |> Zoi.default(:content)}),
doc: "Handle streaming LLM token",
name: "rlm.llm_partial"
}
}

def signal_routes(_ctx) do
[
{"rlm.explore", {:strategy_cmd, @start}},
{"react.llm.response", {:strategy_cmd, @llm_result}},
{"react.tool.result", {:strategy_cmd, @tool_result}},
{"react.llm.delta", {:strategy_cmd, @llm_partial}},
{"react.usage", Jido.Actions.Control.Noop}
]
end

This is the same pattern TRM uses — routing react.llm.response to its own action atoms.
On :rlm_start:
defp process_start(agent, %{query: query} = params) do
config = get_config(agent)
# 1. Store context, get reference
context_ref = store_context(params, config)
# 2. Initialize workspace
workspace_ref = WorkspaceStore.init(request_id, %{query: query, context_ref: context_ref})
# 3. Set run_tool_context (ephemeral, per-request)
tool_context = Map.merge(params[:tool_context] || %{}, %{
context_ref: context_ref,
workspace_ref: workspace_ref,
recursive_model: config.recursive_model
})
agent = set_run_tool_context(agent, tool_context)
# 4. Send to ReAct machine with RLM system prompt
msg = {:start, query, call_id}
env = %{system_prompt: Prompts.system_prompt(config), max_iterations: config.max_iterations}
{machine, directives} = Machine.update(machine, msg, env)
# 5. Lift directives, injecting next_step_prompt
{agent, lift_directives(directives, config, state)}
end

When lifting {:call_llm_stream, id, conversation}, inject the workspace-aware next-step prompt:
defp lift_directives(directives, config, state) do
Enum.flat_map(directives, fn
{:call_llm_stream, id, conversation} ->
# Inject RLM next-step context
workspace_summary = WorkspaceStore.summary(state.workspace_ref)
iteration = state[:iteration] || 1
next_step = Prompts.next_step_prompt(%{
query: state[:query],
iteration: iteration,
workspace_summary: workspace_summary
})
augmented = conversation ++ [next_step]
[Directive.LLMStream.new!(%{
id: id,
model: config.model,
context: convert_to_reqllm_context(augmented),
tools: config.reqllm_tools
})]
{:exec_tool, id, tool_name, arguments} ->
# Same as ReAct — lookup Action, build ToolExec directive
# ...
{:request_error, call_id, reason, message} ->
[Directive.EmitRequestError.new!(%{call_id: call_id, reason: reason, message: message})]
end)
end

On terminal states (:completed or :error), clean up ETS:
# In process_instruction, after Machine.update:
new_state = if machine_state[:status] in [:completed, :error] do
ContextStore.delete(state[:context_ref])
WorkspaceStore.delete(state[:workspace_ref])
Map.delete(machine_state, :run_tool_context)
else
machine_state
end

Following ReActAgent and TRMAgent conventions exactly.
defmodule Jido.AI.RLMAgent do
defmacro __using__(opts) do
# Same structure as ReActAgent:
# - Extract name, tools (auto-include RLM exploration tools), model, etc.
# - Build schema with request tracking fields
# - Wire Jido.AI.Strategies.RLM as strategy
# - Generate explore/3, await/2, explore_sync/3
# - Generate on_before_cmd/on_after_cmd for request tracking
end
end

defmodule MyApp.NeedleHaystackAgent do
use Jido.AI.RLMAgent,
name: "needle_haystack",
description: "Finds information in massive text contexts",
model: "anthropic:claude-sonnet-4-20250514",
recursive_model: "anthropic:claude-haiku-4-5",
max_iterations: 15,
extra_tools: [] # optional additional domain-specific tools
end
# Usage
{:ok, pid} = Jido.start_agent(Jido.default_instance(), MyApp.NeedleHaystackAgent)
{:ok, result} = MyApp.NeedleHaystackAgent.explore_sync(pid,
"Find the magic number hidden in this text",
context: massive_text_binary,
timeout: 300_000
)

| Function | Description |
|---|---|
| explore(pid, query, opts) | Async — returns {:ok, %Request.Handle{}} |
| await(request, opts) | Await specific request result |
| explore_sync(pid, query, opts) | Sync convenience wrapper |
| cancel(pid, opts) | Cancel in-flight request |
Options for explore/3:
- context: — binary, iodata, or %{path: "..."} for file-backed
- context_ref: — pre-stored context reference (advanced)
- tool_context: — additional per-request context merged with base
- timeout: — request timeout
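Continuing the usage example above, a sketch of the async variant; the timeout option passed to await/2 is an assumption about its opts:

```elixir
# Kick off exploration without blocking the caller.
{:ok, request} =
  MyApp.NeedleHaystackAgent.explore(pid, "Find the magic number hidden in this text",
    context: massive_text_binary
  )

# ... other work ...

# Block until this specific request completes.
{:ok, result} = MyApp.NeedleHaystackAgent.await(request, timeout: 300_000)
```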
User: explore("Find the magic number", context: <100K lines>)
│
▼
┌─────────────────────────────────────────────────────────┐
│ Strategy.RLM — :rlm_start │
│ 1. ContextStore.put(context) → context_ref │
│ 2. WorkspaceStore.init(request_id) → workspace_ref │
│ 3. Set run_tool_context = {context_ref, workspace_ref} │
│ 4. Machine.update({:start, query, call_id}, env) │
│ 5. Emit LLMStream + next_step_prompt(iteration: 1) │
└─────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Root LLM (iteration 1) │
│ → tool_call: context_stats({}) │
│ → tool_call: context_chunk({strategy: "lines", │
│ size: 1000}) │
│ │
│ Machine: awaiting_llm → awaiting_tool │
│ Directives: [ToolExec(context_stats), ToolExec(chunk)] │
└─────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Tool Results │
│ context_stats: {size: 4.8MB, lines: ~100K} │
│ context_chunk: {chunks: 100, index stored in workspace} │
│ │
│ Machine: awaiting_tool → awaiting_llm │
│ → Append tool results to Thread │
│ → Emit LLMStream + next_step_prompt(iteration: 2, │
│ workspace: "100 chunks indexed, 0 hits") │
└─────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Root LLM (iteration 2) │
│ → tool_call: context_search({query: "magic number"}) │
│ │
│ Result: {hits: [{chunk_id: "c_47", snippet: "The magic │
│ number is 1298418", offset: 47231}]} │
│ │
│ → Append to Thread + workspace │
│ → Emit LLMStream + next_step_prompt(iteration: 3, │
│ workspace: "1 hit found: 'magic number' in c_47") │
└─────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Root LLM (iteration 3) │
│ → Final answer: "The magic number is 1298418" │
│ (no tool calls — standard ReAct final answer) │
│ │
│ Machine: awaiting_llm → completed │
│ Strategy: cleanup context_ref + workspace_ref │
└─────────────────────────────────────────────────────────┘
| # | Role | Content |
|---|---|---|
| 1 | system | RLM exploration methodology + tool descriptions |
| 2 | user | "You haven't explored yet..." + query |
| 3 | assistant | (tool_calls: context_stats, context_chunk) |
| 4 | tool | context_stats result: {size: 4.8MB, lines: ~100K} |
| 5 | tool | context_chunk result: {chunks: 100, ...} |
| 6 | user | "Continue... Workspace: 100 chunks, 0 hits" + query |
| 7 | assistant | (tool_call: context_search) |
| 8 | tool | context_search result: {hits: [{chunk_id: c_47, ...}]} |
| 9 | user | "Continue... Workspace: 1 hit found" + query |
| 10 | assistant | "The magic number is 1298418" |
The Python version runs llm_query() calls sequentially inside exec(). With SubqueryBatch:
# Root LLM calls one tool:
tool_call: llm_subquery_batch({
chunk_ids: ["c_0", "c_1", ..., "c_99"],
prompt: "Does this chunk contain a magic number? If so, what is it?",
max_concurrency: 10
})
# Action fans out 100 concurrent sub-LLM calls via Task.async_stream
# Returns aggregated results in ~1/10th the time of sequential execution

Each RLM session runs in its own agent process. A failed search action or timed-out sub-LLM call doesn't affect other sessions. The supervisor restarts cleanly.
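A sketch of running the fan-out under the agent's TaskSupervisor so a crash or timeout in one sub-call never takes down the session; the supervisor name and the stand-in subquery function are illustrative:

```elixir
# Assumes {Task.Supervisor, name: MyApp.RLMTaskSupervisor} is started under the
# agent's supervision tree.
chunk_ids = ["c_0", "c_1", "c_2"]

# Stand-in for the per-chunk sub-LLM call.
subquery = fn chunk_id -> {:ok, %{chunk_id: chunk_id, answer: "..."}} end

results =
  Task.Supervisor.async_stream_nolink(
    MyApp.RLMTaskSupervisor,
    chunk_ids,
    subquery,
    max_concurrency: 10,
    timeout: 60_000,
    on_timeout: :kill_task
  )
  |> Enum.map(fn
    # Crashes and timeouts surface as exits here instead of crashing the agent.
    {:ok, result} -> result
    {:exit, reason} -> {:error, inspect(reason)}
  end)
```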
Every operation is a Jido Action with built-in telemetry:
[:jido, :ai, :tool, :execute, :start] # context_search started
[:jido, :ai, :tool, :execute, :stop] # context_search completed (with duration)
[:jido, :ai, :react, :start] # RLM exploration started
[:jido, :ai, :react, :iteration] # iteration N completed
[:jido, :ai, :react, :complete]    # exploration finished

Every tool call is schema-validated via Zoi before execution. Invalid arguments from the LLM produce structured errors (via EmitToolError) that the LLM can self-correct from — no silent failures or runtime exceptions from malformed code.
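A sketch of consuming the events listed above with the standard :telemetry API; the handler id and logging are illustrative:

```elixir
:telemetry.attach_many(
  "rlm-observability",
  [
    [:jido, :ai, :tool, :execute, :stop],
    [:jido, :ai, :react, :iteration],
    [:jido, :ai, :react, :complete]
  ],
  fn event, measurements, metadata, _config ->
    # Log tool durations and iteration progress for each exploration.
    IO.inspect({event, measurements, metadata}, label: "rlm telemetry")
  end,
  nil
)
```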
| Step | Module | Effort | Tests |
|---|---|---|---|
| 1 | Jido.AI.RLM.ContextStore | S | Unit: put/fetch/delete, tier selection, byte range reads |
| 2 | Jido.AI.RLM.WorkspaceStore | S | Unit: init/get/update/summary/delete |
| 3 | Context.Stats | S | Unit: size estimation, encoding detection |
| 4 | Context.Chunk | M | Unit: line/byte chunking, overlap, max_chunks cap |
| 5 | Context.ReadChunk | S | Unit: chunk lookup, truncation, missing chunk error |
| 6 | Context.Search | M | Unit: substring/regex, limit, window, chunk_id mapping |
| 7 | Workspace.Note + GetSummary | S | Unit: append, summarize, truncation |
| 8 | LLM.SubqueryBatch | M | Unit: fan-out, timeout handling, result aggregation (mock ReqLLM) |
| 9 | Jido.AI.RLM.Prompts | S | Unit: prompt generation for various iterations/states |
| 10 | Jido.AI.Strategies.RLM | L | Integration: start flow, directive lifting, cleanup, multi-iteration |
| 11 | Jido.AI.RLMAgent macro | M | Integration: explore/await/explore_sync, request tracking |
| 12 | Example agent + demo script | S | Manual: needle-in-haystack with generated context |
Steps 1–9 are independently testable with no LLM calls. Step 10 is where integration happens.
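As an illustration of how step 1 can be tested with no LLM calls, a unit-test sketch against the ContextStore API above; values and assertions are illustrative:

```elixir
defmodule Jido.AI.RLM.ContextStoreTest do
  use ExUnit.Case, async: true

  alias Jido.AI.RLM.ContextStore

  test "small contexts stay inline and round-trip through fetch_range/3" do
    context = String.duplicate("line\n", 100)

    {:ok, ref} = ContextStore.put(context, "req_1")

    # Under the 2 MB threshold, the context stays inline in the reference.
    assert ref.backend == :inline
    assert {:ok, ^context} = ContextStore.fetch(ref)
    assert {:ok, "line\n"} = ContextStore.fetch_range(ref, 0, 5)
  end
end
```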
- Machine: Reuse ReAct.Machine — no fork
- Context storage: ETS for medium, inline for small
- Termination: Standard ReAct final-answer (no FINAL markers)
- Sub-LLM: Single SubqueryBatch Action with Task.async_stream
- Signal namespace: rlm.explore for input, reuse react.* for internal signals
- File-backed context: For GB-scale data. Requires streaming IO in chunk/search actions.
- Depth > 1 recursion: Sub-LLM spawns its own RLM agent. Trivial on BEAM (spawn child process), but needs prompt engineering for recursive delegation.
- Multi-turn with persistent workspace: Currently workspace is per-request. Supporting explore → follow-up explore with shared workspace requires moving workspace to agent state.
- Custom indexing: Token-aware chunking, BM25 search, vector embeddings. Each is a new Action — the architecture supports them without changes.
- Streaming search results: Stream search hits back to the LLM as they're found, rather than waiting for all results. Would need a new machine state or a streaming tool result pattern.