Analysis of "Federation of Agents: A Semantics-Aware Communication Fabric for Large-Scale Agentic AI" (arXiv:2509.20175) and its application to MCP tool discovery.
From the paper, each agent advertises a versioned capability vector (VCV):
VCV = (c, s, r, p, σ, v)

- c ∈ ℝᵈ: dense capability embedding (semantic description)
- s ∈ {0,1}ˡ: Bloom filter of discrete skills
- r ∈ ℝᵏ: resource requirements (latency, cost, throughput)
- p ∈ {0,1}ᵐ: policy compliance flags (regulatory, security)
- σ ∈ ℝᵈ′: specification embedding (behavioral constraints)
- v ∈ ℕ: version counter
α = sim(c_task, c_agent) · 𝟙[p_required ⊆ p_granted] · f(resources) · g(spec_alignment)

The four factors are, in order: the semantic match, the policy gate, the cost penalty, and the behavioral fit.
Key insight: the policy check is a hard gate (a single unmet flag zeroes the entire score), while the semantic match contributes soft, continuous scoring.
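A minimal sketch of this scoring rule in Python (the function name and the assumption that f and g arrive as precomputed penalties in [0, 1] are mine, not the paper's):

```python
import numpy as np

def score_agent(
    task_emb: np.ndarray,      # c_task: task embedding (unit-normalized)
    agent_emb: np.ndarray,     # c_agent: agent capability embedding (unit-normalized)
    required_flags: set[str],  # p_required: policy flags the task demands
    granted_flags: set[str],   # p_granted: policy flags the agent carries
    cost_penalty: float,       # f(resources), assumed precomputed in [0, 1]
    spec_alignment: float,     # g(spec_alignment), assumed precomputed in [0, 1]
) -> float:
    """α = sim · 𝟙[⊆] · f · g: the gate multiplies, so one miss zeroes everything."""
    # Hard policy gate: 𝟙[p_required ⊆ p_granted]
    if not required_flags.issubset(granted_flags):
        return 0.0
    # Soft semantic match: cosine similarity on unit-normalized embeddings
    semantic = float(np.dot(task_emb, agent_emb))
    return semantic * cost_penalty * spec_alignment
```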
| FoA Concept | MCP Tool Equivalent |
|---|---|
| Agent VCV | Tool Capability Profile |
| Capability embedding (c) | Tool description embedding |
| Discrete skills (s) | Capability tags (e.g., "documents:read") |
| Policy flags (p) | UMES entitlements |
| Resource hints (r) | Latency/cost estimates |
| Version (v) | Tool version for cache invalidation |
Adapting VCV for MCP tools:
```python
from pydantic import BaseModel, Field
from typing import Literal


class ToolCapabilityProfile(BaseModel):
    """FoA-inspired tool capability profile for semantic discovery."""

    # Identity
    name: str = Field(description="Tool name")
    version: int = Field(default=1, description="Version counter for cache invalidation")

    # Semantic embedding (c) - computed from description
    description: str = Field(description="Natural language description for embedding")
    embedding: list[float] | None = Field(default=None, description="Pre-computed description embedding")

    # Discrete capabilities (s) - for exact matching and UMES
    capabilities: list[str] = Field(description="Required capabilities (e.g., 'documents:read')")

    # Policy flags (p) - UMES entitlement requirements
    required_entitlements: list[str] = Field(default_factory=list, description="UMES entitlements required")

    # Resource hints (r) - for cost-aware routing
    estimated_latency_ms: int = Field(default=100, description="Expected latency in ms")
    cost_tier: Literal["free", "low", "medium", "high"] = Field(default="low")

    # Behavioral spec (σ) - operational constraints
    idempotent: bool = Field(default=True, description="Safe to retry")
    side_effects: Literal["none", "read", "write", "delete"] = Field(default="read")


# Example profiles
TOOL_PROFILES: dict[str, ToolCapabilityProfile] = {
    "delete_knowledge_pack": ToolCapabilityProfile(
        name="delete_knowledge_pack",
        description="Delete a knowledge pack from HexGraph. Removes all associations and optionally deletes documents.",
        capabilities=["knowledge_packs:delete"],
        required_entitlements=["admin:system"],
        estimated_latency_ms=500,
        cost_tier="medium",
        idempotent=False,
        side_effects="delete",
    ),
    "query": ToolCapabilityProfile(
        name="query",
        description="Execute a HiRAG query against documents with multiple retrieval modes.",
        capabilities=["queries:execute", "documents:read"],
        required_entitlements=["documents:read"],
        estimated_latency_ms=200,
        cost_tier="low",
        idempotent=True,
        side_effects="read",
    ),
    "search_requirements": ToolCapabilityProfile(
        name="search_requirements",
        description="Search regulatory requirements within a framework. Returns verbatim legal text with citations.",
        capabilities=["regulatory:read"],
        required_entitlements=["regulatory:read"],
        estimated_latency_ms=150,
        cost_tier="low",
        idempotent=True,
        side_effects="read",
    ),
}
```

```python
import numpy as np
from sentence_transformers import SentenceTransformer

class ToolRegistry:
    """FoA-inspired semantic tool registry with UMES policy enforcement."""

    def __init__(self):
        self.encoder = SentenceTransformer('nomic-ai/nomic-embed-text-v1.5', trust_remote_code=True)
        self.profiles: dict[str, ToolCapabilityProfile] = {}
        self.embeddings: np.ndarray | None = None
        self.tool_names: list[str] = []

    def register(self, profile: ToolCapabilityProfile):
        """Register a tool, computing its embedding if not supplied."""
        if profile.embedding is None:
            # Normalize so that the dot product below is cosine similarity.
            profile.embedding = self.encoder.encode(
                profile.description, normalize_embeddings=True
            ).tolist()
        self.profiles[profile.name] = profile
        self._rebuild_index()

    def _rebuild_index(self):
        """Rebuild HNSW index (simplified: a plain numpy array for now)."""
        self.tool_names = list(self.profiles.keys())
        self.embeddings = np.array([
            self.profiles[name].embedding for name in self.tool_names
        ])

    def search(
        self,
        query: str,
        user_capabilities: set[str],
        limit: int = 5,
        include_cost: bool = True,
    ) -> list[dict]:
        """
        FoA-style semantic search with policy enforcement:

            α = sim(query, tool) · 𝟙[required ⊆ granted] · cost_penalty
        """
        if self.embeddings is None or not self.tool_names:
            return []
        # Embed query (normalized, so the dot product is cosine similarity)
        query_embedding = self.encoder.encode(query, normalize_embeddings=True)
        # Compute semantic similarities
        similarities = np.dot(self.embeddings, query_embedding)
        results = []
        for i, tool_name in enumerate(self.tool_names):
            profile = self.profiles[tool_name]
            # Policy gate: 𝟙[required ⊆ granted]
            required = set(profile.required_entitlements)
            if not required.issubset(user_capabilities):
                continue  # Hard gate - not entitled
            # Semantic score
            semantic_score = float(similarities[i])
            # Cost penalty (optional)
            cost_penalty = 1.0
            if include_cost:
                cost_multipliers = {"free": 1.0, "low": 0.95, "medium": 0.85, "high": 0.7}
                cost_penalty = cost_multipliers.get(profile.cost_tier, 1.0)
            # Final score
            final_score = semantic_score * cost_penalty
            results.append({
                "name": tool_name,
                "description": profile.description,
                "capabilities": profile.capabilities,
                "score": round(final_score, 4),
                "latency_ms": profile.estimated_latency_ms,
                "side_effects": profile.side_effects,
            })
        # Sort by score descending
        results.sort(key=lambda x: x["score"], reverse=True)
        return results[:limit]
```

```python
# Agent searches for tools
registry = ToolRegistry()
for profile in TOOL_PROFILES.values():
    registry.register(profile)

# User with limited entitlements
user_caps = {"regulatory:read", "documents:read"}

# Search: "I need to find GDPR data retention requirements"
results = registry.search(
    query="find GDPR data retention requirements",
    user_capabilities=user_caps,
    limit=3,
)
# Returns: [search_requirements, query] - ranked by semantic score
# Does NOT return: delete_knowledge_pack (user lacks admin:system)
```

| Aspect | Pack System | FoA-Style Semantic |
|---|---|---|
| Discovery | Know pack names | Natural language query |
| Matching | Exact pack membership | Semantic similarity + policy gate |
| UMES Integration | Pack-level grants | Capability-level gates |
| Scalability | O(packs) | O(log n) with HNSW |
| Context efficiency | Load N tools per pack | Load top-K matches |
| Future-proof | Poor | Excellent |
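The O(log n) row assumes an approximate nearest-neighbor index in place of the full scan above. A minimal sketch with hnswlib (the library choice, dimension, and parameters are assumptions, not part of the current system):

```python
import hnswlib
import numpy as np

# Build an HNSW index over normalized tool embeddings.
dim = 768  # must match the encoder's output size (nomic-embed-text-v1.5 default)
index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=10_000, ef_construction=200, M=16)

# Stand-in for real tool embeddings; ids map back to tool names.
embeddings = np.random.rand(32, dim).astype(np.float32)
index.add_items(embeddings, ids=np.arange(32))

# Query: returns the k nearest tool ids in roughly O(log n) instead of O(n).
query = np.random.rand(dim).astype(np.float32)
labels, distances = index.knn_query(query, k=5)
```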
We implemented a simpler capability-based system instead of full FoA semantic search.

Rationale:
- UMES is the priority - capabilities map directly to entitlements
- Keeps what works - the pack system is unchanged for context management
- No ML dependencies - simpler ops, no embedding service required
- Clear upgrade path - semantic search can be added later if needed
- 32 tools is manageable - semantic search is overkill at the current scale
What We Implemented:
- 12 capabilities: `regulatory:read`, `documents:read/write/delete`, `queries:execute`, `conversations:read/manage`, `collections:manage`, `views:manage`, `admin:read`, `jobs:manage`, `knowledge_packs:delete`
- 4 predefined roles: `viewer`, `analyst`, `editor`, `admin` (hierarchical)
- `filter_tools_by_capabilities()` function for UMES integration (sketched below)
- Backward compatible: no restrictions if UMES is not configured
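A minimal sketch of the capability filter, reusing `ToolCapabilityProfile` from above (only the function name comes from the notes; the role map and signature are illustrative assumptions):

```python
# Hypothetical role hierarchy: each role inherits the capabilities of the one before it.
ROLE_CAPABILITIES: dict[str, set[str]] = {
    "viewer": {"regulatory:read", "documents:read", "conversations:read"},
}
ROLE_CAPABILITIES["analyst"] = ROLE_CAPABILITIES["viewer"] | {"queries:execute"}
ROLE_CAPABILITIES["editor"] = ROLE_CAPABILITIES["analyst"] | {
    "documents:write", "collections:manage", "views:manage", "conversations:manage"
}
ROLE_CAPABILITIES["admin"] = ROLE_CAPABILITIES["editor"] | {
    "documents:delete", "admin:read", "jobs:manage", "knowledge_packs:delete"
}

def filter_tools_by_capabilities(
    profiles: dict[str, ToolCapabilityProfile],
    user_capabilities: set[str] | None,
) -> dict[str, ToolCapabilityProfile]:
    """Return only the tools whose required entitlements the user holds.

    If UMES is not configured (user_capabilities is None), apply no restrictions.
    """
    if user_capabilities is None:
        return profiles  # backward compatible: UMES absent, everything visible
    return {
        name: profile
        for name, profile in profiles.items()
        if set(profile.required_entitlements).issubset(user_capabilities)
    }
```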
References:
- arXiv:2509.20175 - Federation of Agents: A Semantics-Aware Communication Fabric for Large-Scale Agentic AI (https://arxiv.org/abs/2509.20175)
- arXiv HTML version with diagrams: https://arxiv.org/html/2509.20175