Before I show you how this was built, here’s the part that made me stop and reread the transcript.
In the first live test, three agents got into a round-robin discussion about improving an ETF platform. Within a couple of turns, one of them dropped a line that instantly felt like a real teammate in a real design review:
“Bonds tolerate minutes of delay. ETF arb disappears in seconds.”
Another agent immediately shifted the conversation from architecture to usability — not in a generic way, but in a “we’ve lived this pain” way — proposing a traffic-light UI with drill-downs and explicit data quality flags. Then a third agent pulled the whole thing back into execution reality: staged rollout, market-by-market, timezone-aware.
No one was role-playing. No one was guessing wildly. They sounded like people who had shipped systems together — because they were built from the way those people actually talked.
That’s when I knew: this experiment worked.
I built a working multi-agent conversation system that brings the Katana Labs team back to life from archival data. Sixteen AI agents, each constructed from real Slack messages and GitLab documentation, can hold round-robin discussions, answer domain questions, and demonstrate distinct personalities with accurate technical depth. The system went from plan to working slash command in a single session.
This is the experiment that finally worked. After months of forensic analysis, document processing frameworks, and knowledge extraction pipelines, the simplest possible approach — keyword search over a small corpus, no vector database, no embeddings — produced the most convincing and useful result.
The goal: reconstruct institutional memory from two sources.

The Slack archive:
- ~35,000 messages
- 27 channels
- 2019–2025 archive
- 49 users (16 active enough for persona modeling)
Slack gives you:
- Voice
- Humor
- Vocabulary
- Collaboration patterns
- Who challenged whom
- How decisions actually got made
Not what was built — but why.
The GitLab repository:

- 3.6GB repository
- 124 markdown documents
- Architecture
- Trading algorithms
- Database schemas
- IP portfolio
- Investor materials
- ML experiments
GitLab gives you:
- What was built
- How it works
- Business context
- Technical depth
Slack teaches how they talk. GitLab teaches what they know.
Blend the two — and you get people.
The corpus:
- ~274K tokens of Slack
- 564 knowledge chunks
That’s tiny.
Keyword scoring across everything takes milliseconds. No vector store. No indexing layer. No infra.
Complexity removed = signal amplified.
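At this scale, brute-force keyword scoring really is a few lines. A minimal sketch of what such a retriever could look like (function names and scoring details are my assumptions, not the project's actual code):

```python
import re

def score(query: str, text: str) -> int:
    """Hypothetical scorer: count occurrences of query keywords in a chunk."""
    words = set(re.findall(r"[a-z0-9]+", query.lower()))
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return sum(1 for t in tokens if t in words)

def top_chunks(query: str, chunks: list[str], k: int = 5) -> list[str]:
    """Rank every chunk against the query; trivial at ~564 chunks."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]
```

Scanning all 564 chunks per query is linear work over a few hundred kilobytes, which is why it finishes in milliseconds with no index at all.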
Every turn includes:
- Full persona system prompt
- Retrieved Slack examples
- Retrieved knowledge chunks
- Conversation transcript
No session state. No memory layer.
Clean. Predictable. Debuggable.
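Because nothing is carried between turns, each turn's context can be rebuilt from scratch. A sketch of that assembly, with hypothetical section headers:

```python
def build_turn_prompt(persona_prompt: str,
                      slack_examples: list[str],
                      knowledge_chunks: list[str],
                      transcript: list[str]) -> str:
    """Assemble the full context for one agent turn; no hidden session state."""
    parts = [
        persona_prompt,
        "## How you talk (real Slack messages)",
        *slack_examples,
        "## What you know (retrieved documentation)",
        *knowledge_chunks,
        "## Conversation so far",
        *transcript,
    ]
    return "\n\n".join(parts)
```

Since the prompt is a pure function of its inputs, any surprising reply can be debugged by printing exactly what the model saw.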
- Dennis → architecture + ML
- Santiago → investor materials + fixed income
- Androniki → business + product
No cross-contamination.
Agents stay inside their real-world expertise boundaries.
That’s why the discussions feel authentic.
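Scoping can be as simple as a per-agent allowlist checked at retrieval time. The mapping and the `area` tags below are illustrative, not the project's real schema:

```python
# Hypothetical scoping table: which knowledge areas each agent may draw from.
EXPERTISE = {
    "dennis":    {"architecture", "ml"},
    "santiago":  {"investor_materials", "fixed_income"},
    "androniki": {"business", "product"},
}

def allowed_chunks(agent: str, chunks: list[dict]) -> list[dict]:
    """Keep only chunks tagged with an area inside the agent's expertise."""
    areas = EXPERTISE.get(agent, set())
    return [c for c in chunks if c["area"] in areas]
```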
Each persona includes:
- 20–25 verbatim Slack messages
- Collaboration graph
- Style metrics (emoji rate, question rate, tone)
- Topic extraction
The model isn’t told “Dennis is technical.” It sees Dennis being technical.
That difference is everything.
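The style metrics fall straight out of the raw messages. A rough sketch (the regexes and metric definitions are my assumptions):

```python
import re

def style_metrics(messages: list[str]) -> dict:
    """Crude per-user style stats: emoji rate and question rate."""
    n = len(messages) or 1
    # Slack-style :emoji_name: tokens rather than Unicode emoji.
    emoji = sum(1 for m in messages if re.search(r":[a-z_]+:", m))
    questions = sum(1 for m in messages if "?" in m)
    return {"emoji_rate": emoji / n, "question_rate": questions / n}
```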
- Deleted users reconstructed from embedded profiles
- Three-layer bot detection
- Regex mention resolution (`<@U12345>` → `@RealName`)
- Thread reconstruction using `thread_ts`
Result:
- 20 users with content
- 16 viable persona candidates
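The mention resolution and thread reconstruction steps can be sketched like this (the `USERS` mapping is illustrative; in practice it comes from the exported profiles):

```python
import re
from collections import defaultdict

USERS = {"U12345": "RealName"}  # hypothetical: built from exported user profiles

def resolve_mentions(text: str) -> str:
    """Rewrite Slack mention tokens like <@U12345> into @RealName."""
    return re.sub(r"<@(U[A-Z0-9]+)>",
                  lambda m: "@" + USERS.get(m.group(1), m.group(1)), text)

def rebuild_threads(messages: list[dict]) -> dict:
    """Group messages by thread_ts; a top-level message keys on its own ts."""
    threads = defaultdict(list)
    for msg in messages:
        threads[msg.get("thread_ts") or msg["ts"]].append(msg)
    return dict(threads)
```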
First bug discovered:
AGENTS.md and CLAUDE.md were being pulled in as domain knowledge.
These are meta-instructions, not business docs.
Once excluded, the chunk count dropped from 615 → 564.
Small correction. Big difference in response quality.
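The fix itself is a one-line filter applied before chunking; a sketch:

```python
# Meta-instruction files for the agents themselves, not business knowledge.
META_DOCS = {"AGENTS.md", "CLAUDE.md"}

def is_domain_doc(path: str) -> bool:
    """Exclude meta docs at chunking time (the fix behind 615 → 564 chunks)."""
    return path.rsplit("/", 1)[-1] not in META_DOCS
```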
A new skill group:
/katana
Subcommands:
- `/katana ask`
- `/katana discuss`
- `/katana list`
- `/katana rebuild`
Cross-session invocation works.
That’s the real bar: A fresh session can use it without knowing how it’s implemented.
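With `click` (one of the project's two dependencies), the subcommand skeleton might look like this. The command bodies here are placeholders, not the real implementations:

```python
import click

@click.group()
def katana():
    """Dispatcher behind the /katana slash command (illustrative skeleton)."""

@katana.command()
@click.argument("question")
def ask(question):
    """Answer a one-off question; the real version routes it to an agent."""
    click.echo(f"asking: {question}")

@katana.command("list")
def list_agents():
    """Show the available personas."""
    click.echo("16 personas available")

if __name__ == "__main__":
    katana()
```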
Three agents:
- Dennis
- Androniki
- Alexander
Prompt:
“I rebuilt the Katana platform for ETFs. What would you improve?”
What came back wasn’t generic AI filler.
It was structured, domain-specific critique:
- Proposed an `ArbCalculator` abstraction layer
- Separate implementations for replication types
- Beam pipeline reuse
That reflects real Katana infrastructure constraints.
“Bonds tolerate minutes of delay. ETF arbitrage disappears in seconds.”
That’s not surface-level knowledge.
That’s understanding trading mechanics.
Androniki pushed for:
- Traffic-light UI
- Drill-down views
- Data quality flags
- AP activity enrichment
Exactly aligned with her historical Slack behavior.
Alexander kept it grounded in execution, proposing a staged rollout:

- US first
- Europe during US hours
- Asia last, with timezone logic
Pragmatic. Phased. Realistic.
- Agents stayed inside their expertise.
- They built on each other’s ideas.
- They referenced real Katana concepts.
- They didn’t sound like clones.
The personalities held.
What still needs work:

- More disagreement. Real teams argue; these agents are too agreeable.
- Execution speed. 3 agents × 2 rounds ≈ 3m 25s. Acceptable, but not snappy.
- Skill robustness. The skill should default to the absolute venv path, not `uv run`.
Earlier efforts involved:
- Embeddings
- Semantic search engines
- Document processing frameworks
- Multi-agent forensic analysis pipelines
All technically impressive.
None felt alive.
SlackAgents works because:
- The dataset is small — brute force is fine.
- Personality is real, not summarized.
- Knowledge is scoped by role.
- The system is simple enough to reason about.
No abstraction layers. No orchestration frameworks. No magic.
Just careful engineering.
The most valuable artifact in the Katana archive isn’t the code.
It’s the conversations.
Slack captures:
- Why Algolia was chosen
- Why search performance degraded
- What PGGM actually needed
- How bond pair scoring evolved
- Who pushed back on what
GitLab documents decisions. Slack captures decision-making.
That’s institutional memory.
And now it’s queryable.
- 1,346 lines of Python
- 2 dependencies (`anthropic`, `click`)
- 16 working AI agents
- Cached personas
- Cached knowledge base
- Slash command integration
- Cross-session reliability
Total implementation time: ~4 hours.
You don’t need:
- Vector databases
- Retrieval frameworks
- Multi-layer memory systems
- Heavy orchestration
If the corpus is small and well-structured, simplicity wins.
The agents sound real because the data is real.
The knowledge is grounded because it comes from the source.
And the system works because it’s not trying to be clever.
- Introduce structured disagreement in prompts
- Add streaming output for faster perceived latency
- Support per-invocation model selection
- Enable transcript export
But even without those:
The core experiment succeeded.
A team that no longer exists can now:
- Debate architecture
- Critique product strategy
- Explain trading logic
- Answer technical questions
Not because of advanced AI architecture.
Because of clean data, tight scope, and restraint.
Sometimes the right solution isn’t more infrastructure.
It’s less.
#katana #slack-agents #multi-agent #anthropic #persona #knowledge-base #mvp #proof-of-concept