February 2026
Every agent framework today does the same thing: wrap an LLM in a chat loop, bolt on tools, pray the context window holds. When it doesn't, summarize and hope for the best.
We think there's a better way. Hivemind is an agent orchestration framework built on Elixir/OTP where agents are long-running processes, tools can be forged at runtime, and the context window is a navigable view layer — not a dumping ground.
An agent is a GenServer. It has a mailbox, state, a supervisor that restarts it on crash. While a tool runs for 30 seconds, the agent keeps processing — human messages queue, heartbeats fire, other agents communicate. No blocking. No hand-rolled concurrency. The BEAM VM was built for exactly this.
- Crash isolation — one agent dies, the rest don't notice
- Hot code reload — update agent behavior mid-conversation
- Distribution — agents across machines, same send/2 call
- Supervision trees — automatic restart with state recovery from WAL
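A minimal sketch of the agent-as-GenServer shape described above. Module and field names here are illustrative, not Hivemind's actual API: a slow tool runs in a Task, so the agent's loop keeps draining its mailbox while the tool executes.

```elixir
defmodule AgentSketch do
  # Illustrative agent-as-GenServer: state holds an inbox and tool results.
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(_opts), do: {:ok, %{inbox: [], results: []}}

  # Human messages just queue in state; nothing blocks.
  @impl true
  def handle_cast({:user_message, msg}, state),
    do: {:noreply, %{state | inbox: [msg | state.inbox]}}

  # A slow tool runs in a Task, so the GenServer loop stays responsive.
  @impl true
  def handle_cast({:run_tool, fun}, state) do
    parent = self()
    Task.start(fn -> send(parent, {:tool_done, fun.()}) end)
    {:noreply, state}
  end

  # The tool's result arrives later as an ordinary message.
  @impl true
  def handle_info({:tool_done, result}, state),
    do: {:noreply, %{state | results: [result | state.results]}}

  def snapshot(pid), do: GenServer.call(pid, :snapshot)

  @impl true
  def handle_call(:snapshot, _from, state), do: {:reply, state, state}
end
```

Placed under a one-for-one Supervisor, a crash in one agent restarts only that agent; its siblings never notice.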
This is the big idea.
Every framework treats the context window as storage. History grows, you summarize, information dies. The Recursive Language Models paper (Zhang, Kraska & Khattab, MIT CSAIL) nailed why this fails: compaction is lossy, and agents need dense access to history they can't see anymore.
Their insight: treat the prompt as navigable external state, not as something stuffed into the token window. The model sees metadata and handles, not raw data.
We took that idea and built it for long-running agents. Instead of a flat message list, the LLM sees this every turn:
┌─ CONTEXT HUD ──────────────────────────────────────────┐
│ │
│ INBOX (3 unprocessed) │
│ • User: "How's the deploy going?" │
│ • Tool: exec completed (exit 0) │
│ • Sub-agent: "PR #42 tests passing" │
│ │
│ SITUATION │
│ Working on: deployment pipeline v0.3 │
│ Active sub-agents: 2 │
│ │
│ MEMORY INDEX │
│ │ L2 │ 09:00-10:00 │ architecture review, PR feedback │
│ │ L1 │ 10:00-10:10 │ deploy script debugging │
│ │ L1 │ 10:10-10:20 │ test fixes, CI green │
│ │ L0 │ 10:20-now │ [5 raw events] │
│ │
│ PINS │
│ • Decision: use blue-green deploys (chunk_L1_0942) │
│ │
│ LIVE PANELS │
│ • CI: green │ Error rate: 0.02% │
│ │
│ BUDGET: 180K tokens remaining │
└────────────────────────────────────────────────────────┘

Bounded. Same token cost whether the agent's been running 5 minutes or 5 days.
Behind the HUD sits a multi-resolution time pyramid — raw events roll up into 10-minute summaries (L1), which roll up into hourly summaries (L2), which roll up into daily summaries. Nothing is ever destroyed. Every chunk keeps pointers to its children.
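One way to sketch the pyramid's rollup step. The struct fields and the summarize callback are assumptions, not the actual Hivemind schema: a parent chunk carries its children, so drilling back down to raw events stays possible.

```elixir
# Hypothetical chunk shape for the time pyramid.
defmodule Chunk do
  defstruct level: 0, range: nil, summary: nil, children: []
end

defmodule Pyramid do
  # Roll a list of L(n) chunks up into one L(n+1) chunk. The parent
  # keeps pointers to its children: compaction organizes, it never destroys.
  def roll_up(chunks, summarize) do
    %Chunk{
      level: hd(chunks).level + 1,
      range: {elem(hd(chunks).range, 0), elem(List.last(chunks).range, 1)},
      summary: summarize.(chunks),
      children: chunks
    }
  end
end
```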
When the agent needs history, it navigates:
- ctx_timeline → "Show me chunks from the last 6 hours" → drill into any one
- ctx_search → semantic search over all chunks by topic
- ctx_open → expand any chunk to see its children or raw events
- ctx_pin → "Keep this decision visible for the rest of the session"
The model sees an index. It pulls what it needs. Compaction becomes organization, not destruction.
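A toy version of the handle-resolution idea behind ctx_open. The store layout and chunk IDs are invented for illustration: the model only ever holds IDs and summaries; opening an ID returns either its child chunks or, at L0, the raw events.

```elixir
defmodule CtxStore do
  # Resolve a chunk ID back to content. Interior chunks expand to their
  # children; leaf (L0) chunks expand to raw events.
  def open(store, chunk_id) do
    case Map.fetch(store, chunk_id) do
      {:ok, %{children: ids}} when ids != [] ->
        {:ok, Enum.map(ids, &Map.fetch!(store, &1))}

      {:ok, %{events: events}} ->
        {:ok, events}

      :error ->
        {:error, :unknown_chunk}
    end
  end
end
```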
Agents can write, compile, and load new Elixir tools at runtime. Bytecode-validated, sandboxed, supervised. An agent that keeps regex-ing JIRA tickets can build a jira_extractor tool, register it, and use it forever — or share it with every other agent.
The BEAM's hot code reload makes this natural. No restart. No downtime.
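Runtime forging leans on machinery Elixir already ships: Code.compile_string/1 compiles source and loads the resulting module into the running VM. A stripped-down sketch, with the validation and sandboxing the post mentions elided:

```elixir
defmodule Forge do
  # Illustrative only: real forging would validate the AST and sandbox
  # the module before loading. Code.compile_string/1 is standard Elixir;
  # it returns the compiled {module, bytecode} pairs.
  def forge(source) do
    [{module, _bytecode}] = Code.compile_string(source)
    {:ok, module}
  end
end
```

From here, registering the returned module under a tool name is a map insert; because the BEAM hot-loads bytecode, other agents can call the new tool immediately.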
Sub-agents accumulate wisdom across runs. The 100th "coder" agent starts with learnings from the first 99 — what works, what fails, style preferences. Confidence scores decay when unused, grow when reinforced. Agents get better at their roles over time.
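The decay/reinforce dynamic could be as simple as exponential decay toward zero plus a fixed-fraction pull toward 1.0. The constants and function names below are invented for illustration, not Hivemind's:

```elixir
defmodule Wisdom do
  # Assumed constants: 5% decay per idle tick, 20% pull per reinforcement.
  @decay 0.95
  @boost 0.2

  # Unused learnings fade each tick.
  def decay(conf), do: conf * @decay

  # Reinforced learnings move a fixed fraction of the way toward 1.0,
  # so confidence grows quickly at first and saturates below 1.0.
  def reinforce(conf), do: conf + @boost * (1.0 - conf)
end
```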
The core engine is complete: Agent GenServer, LLM adapters, tool system + forge, WAL persistence, CLI, TUI, and 500+ tests. The Context HUD is in active implementation — 23 issues across 4 phases.
github.com/solofberlin/hivemind
It's early. Come build with us.
The Context HUD design builds on the Recursive Language Models paper by Zhang, Kraska & Khattab (MIT CSAIL) — their framework for treating prompts as navigable external environment state, extended here to long-running stateful agents on the BEAM.