February 2026
Every agent framework today does the same thing: wrap an LLM in a chat loop, bolt on tools, pray the context window holds. When it doesn't, summarize and hope for the best.
We think there's a better way. Hivemind is an agent orchestration framework built on Elixir/OTP where agents are long-running processes, tools can be forged at runtime, and the context window is a navigable view layer — not a dumping ground.
An agent is a GenServer. It has a mailbox, state, a supervisor that restarts it on crash. While a tool runs for 30 seconds, the agent keeps processing — human messages queue, heartbeats fire, other agents communicate. No blocking. No hand-rolled concurrency. The BEAM VM was built for exactly this.
- Crash isolation — one agent dies, the rest don't notice
- Hot code reload — update agent behavior mid-conversation
- Distribution — agents across machines, same send/2 call
- Supervision trees — automatic restart with state recovery from WAL
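A minimal sketch of the agent-as-GenServer shape described above. Module and field names here are illustrative, not Hivemind's actual API: a slow tool runs in a Task, so the agent's loop keeps draining its mailbox while the tool executes.

```elixir
defmodule AgentSketch do
  # Illustrative agent-as-GenServer: state holds an inbox and tool results.
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(_opts), do: {:ok, %{inbox: [], results: []}}

  # Human messages just queue in state; nothing blocks.
  @impl true
  def handle_cast({:user_message, msg}, state),
    do: {:noreply, %{state | inbox: [msg | state.inbox]}}

  # A slow tool runs in a Task, so the GenServer loop stays responsive.
  @impl true
  def handle_cast({:run_tool, fun}, state) do
    parent = self()
    Task.start(fn -> send(parent, {:tool_done, fun.()}) end)
    {:noreply, state}
  end

  # The tool's result arrives later as an ordinary message.
  @impl true
  def handle_info({:tool_done, result}, state),
    do: {:noreply, %{state | results: [result | state.results]}}

  def snapshot(pid), do: GenServer.call(pid, :snapshot)

  @impl true
  def handle_call(:snapshot, _from, state), do: {:reply, state, state}
end
```

Placed under a one-for-one Supervisor, a crash in one agent restarts only that agent; its siblings never notice.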
This is the big idea.
Every framework treats the context window as storage. History grows, you summarize, information dies. The Recursive Language Models paper (Zhang, Kraska & Khattab, MIT CSAIL) nailed why this fails: compaction is lossy, and agents need dense access to history they can't see anymore.
Their insight: treat the prompt as navigable external state, not as something stuffed into the token window. The model sees metadata and handles, not raw data.
We took that idea and built it for long-running agents. Instead of a flat message list, the LLM sees this every turn:
┌─ CONTEXT HUD ──────────────────────────────────────────┐
│ │
│ INBOX (3 unprocessed) │
│ • User: "How's the deploy going?" │
│ • Tool: exec completed (exit 0) │
│ • Sub-agent: "PR #42 tests passing" │
│ │
│ SITUATION │
│ Working on: deployment pipeline v0.3 │
│ Active sub-agents: 2 │
│ │
│ MEMORY INDEX │
│ │ L2 │ 09:00-10:00 │ architecture review, PR feedback │
│ │ L1 │ 10:00-10:10 │ deploy script debugging │
│ │ L1 │ 10:10-10:20 │ test fixes, CI green │
│ │ L0 │ 10:20-now │ [5 raw events] │
│ │
│ PINS │
│ • Decision: use blue-green deploys (chunk_L1_0942) │
│ │
│ LIVE PANELS │
│ • CI: green │ Error rate: 0.02% │
│ │
│ BUDGET: 180K tokens remaining │
└────────────────────────────────────────────────────────┘

Bounded. Same token cost whether the agent's been running 5 minutes or 5 days.
Behind the HUD sits a multi-resolution time pyramid — raw events roll up into 10-minute summaries (L1), which roll up into hourly summaries (L2), which roll up into daily summaries. Nothing is ever destroyed. Every chunk keeps pointers to its children.
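One way to sketch the pyramid's rollup step. The struct fields and the summarize callback are assumptions, not the actual Hivemind schema: a parent chunk carries its children, so drilling back down to raw events stays possible.

```elixir
# Hypothetical chunk shape for the time pyramid.
defmodule Chunk do
  defstruct level: 0, range: nil, summary: nil, children: []
end

defmodule Pyramid do
  # Roll a list of L(n) chunks up into one L(n+1) chunk. The parent
  # keeps pointers to its children: compaction organizes, it never destroys.
  def roll_up(chunks, summarize) do
    %Chunk{
      level: hd(chunks).level + 1,
      range: {elem(hd(chunks).range, 0), elem(List.last(chunks).range, 1)},
      summary: summarize.(chunks),
      children: chunks
    }
  end
end
```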
When the agent needs history, it navigates:
- ctx_timeline → "Show me chunks from the last 6 hours" → drill into any one
- ctx_search → semantic search over all chunks by topic
- ctx_open → expand any chunk to see its children or raw events
- ctx_pin → "Keep this decision visible for the rest of the session"
The model sees an index. It pulls what it needs. Compaction becomes organization, not destruction.
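A toy version of the handle-resolution idea behind ctx_open. The store layout and chunk IDs are invented for illustration: the model only ever holds IDs and summaries; opening an ID returns either its child chunks or, at L0, the raw events.

```elixir
defmodule CtxStore do
  # Resolve a chunk ID back to content. Interior chunks expand to their
  # children; leaf (L0) chunks expand to raw events.
  def open(store, chunk_id) do
    case Map.fetch(store, chunk_id) do
      {:ok, %{children: ids}} when ids != [] ->
        {:ok, Enum.map(ids, &Map.fetch!(store, &1))}

      {:ok, %{events: events}} ->
        {:ok, events}

      :error ->
        {:error, :unknown_chunk}
    end
  end
end
```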
Agents can write, compile, and load new Elixir tools at runtime. Bytecode-validated, sandboxed, supervised. An agent that keeps regex-ing JIRA tickets can build a jira_extractor tool, register it, and use it forever — or share it with every other agent.
The BEAM's hot code reload makes this natural. No restart. No downtime.
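Runtime forging leans on machinery Elixir already ships: Code.compile_string/1 compiles source and loads the resulting module into the running VM. A stripped-down sketch, with the validation and sandboxing the post mentions elided:

```elixir
defmodule Forge do
  # Illustrative only: real forging would validate the AST and sandbox
  # the module before loading. Code.compile_string/1 is standard Elixir;
  # it returns the compiled {module, bytecode} pairs.
  def forge(source) do
    [{module, _bytecode}] = Code.compile_string(source)
    {:ok, module}
  end
end
```

From here, registering the returned module under a tool name is a map insert; because the BEAM hot-loads bytecode, other agents can call the new tool immediately.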
Sub-agents accumulate wisdom across runs. The 100th "coder" agent starts with learnings from the first 99 — what works, what fails, style preferences. Confidence scores decay when unused, grow when reinforced. Agents get better at their roles over time.
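The decay/reinforce dynamic could be as simple as exponential decay toward zero plus a fixed-fraction pull toward 1.0. The constants and function names below are invented for illustration, not Hivemind's:

```elixir
defmodule Wisdom do
  # Assumed constants: 5% decay per idle tick, 20% pull per reinforcement.
  @decay 0.95
  @boost 0.2

  # Unused learnings fade each tick.
  def decay(conf), do: conf * @decay

  # Reinforced learnings move a fixed fraction of the way toward 1.0,
  # so confidence grows quickly at first and saturates below 1.0.
  def reinforce(conf), do: conf + @boost * (1.0 - conf)
end
```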
The core engine is complete: Agent GenServer, LLM adapters, tool system + forge, WAL persistence, CLI, TUI, and 500+ tests. The Context HUD is in active implementation — 23 issues across 4 phases.
github.com/solofberlin/hivemind
It's early. Come build with us.
The Context HUD design builds on the Recursive Language Models paper by Zhang, Kraska & Khattab (MIT CSAIL) — their framework for treating prompts as navigable external environment state, extended here to long-running stateful agents on the BEAM.