
@jmanhype
Created December 29, 2025 00:27

FoA-Style Tool Discovery for MCP Servers

Analysis of "Federation of Agents: A Semantics-Aware Communication Fabric for Large-Scale Agentic AI" (arXiv:2509.20175) and its application to MCP tool discovery.

FoA Key Concepts

Versioned Capability Vector (VCV) Structure

From the paper, each agent advertises a VCV:

VCV = (𝐜, 𝐬, 𝐫, 𝐩, 𝐞, v)

𝐜 ∈ ℝᵈ     = Dense capability embedding (semantic description)
𝐬 ∈ {0,1}ˡ = Bloom filter of discrete skills
𝐫 ∈ ℝᵐ     = Resource requirements (latency, cost, throughput)
𝐩 ∈ {0,1}ᵖ = Policy compliance flags (regulatory, security)
𝐞 ∈ ℝᵈ'    = Specification embedding (behavioral constraints)
v ∈ ℕ      = Version counter
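The tuple above can be sketched as a plain dataclass. The concrete encodings here (a bitmask for the policy flags 𝐩, raw bytes for the Bloom filter 𝐬) are assumptions; the paper specifies only the mathematical shapes:

```python
from dataclasses import dataclass


@dataclass
class VCV:
    """Versioned Capability Vector (sketch of the paper's tuple)."""

    c: list[float]   # dense capability embedding, dim d
    s: bytes         # Bloom filter over discrete skills, l bits
    r: list[float]   # resource requirements (latency, cost, throughput)
    p: int           # policy compliance flags packed as a bitmask
    e: list[float]   # specification embedding, dim d'
    v: int = 1       # version counter, bumped on any field change
```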

Routing Function (Multi-factor Scoring)

α = sim(𝐜_task, 𝐜_agent) · 𝕀[𝐩_required ⊆ 𝐩_granted] · f(resources) · g(spec_alignment)
       ↑                      ↑                            ↑             ↑
   Semantic match          Policy gate               Cost penalty    Behavioral fit

Key insight: Policy check is a hard gate (fails = score 0), while semantic match is soft scoring.
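This scoring rule can be sketched as follows; the resource and spec-alignment factors f and g are collapsed into scalar penalties here, which is a simplification of the paper's formulation:

```python
import numpy as np


def route_score(
    c_task: list[float],
    c_agent: list[float],
    p_required: set[str],
    p_granted: set[str],
    cost_penalty: float = 1.0,      # stands in for f(resources)
    spec_alignment: float = 1.0,    # stands in for g(spec_alignment)
) -> float:
    """alpha = sim * indicator * f * g; a failed policy check zeroes the score."""
    # Hard gate: 𝕀[p_required ⊆ p_granted]
    if not p_required.issubset(p_granted):
        return 0.0
    # Soft scoring: cosine similarity between task and agent embeddings
    t = np.asarray(c_task, dtype=float)
    a = np.asarray(c_agent, dtype=float)
    sim = float(t @ a / (np.linalg.norm(t) * np.linalg.norm(a)))
    return sim * cost_penalty * spec_alignment
```

A perfect semantic match with missing entitlements still scores 0.0, while a mediocre semantic match with full entitlements keeps its (discounted) similarity score.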


Mapping FoA to MCP Tool Discovery

FoA Concept                 MCP Tool Equivalent
Agent VCV                   Tool Capability Profile
Capability embedding (𝐜)    Tool description embedding
Discrete skills (𝐬)         Capability tags (e.g., "documents:read")
Policy flags (𝐩)            UMES entitlements
Resource hints (𝐫)          Latency/cost estimates
Version (v)                 Tool version for cache invalidation

Proposed: Tool Capability Profile (TCP)

Adapting VCV for MCP tools:

from pydantic import BaseModel, Field
from typing import Literal

class ToolCapabilityProfile(BaseModel):
    """FoA-inspired tool capability profile for semantic discovery."""

    # Identity
    name: str = Field(description="Tool name")
    version: int = Field(default=1, description="Version counter for cache invalidation")

    # Semantic embedding (𝐜) - computed from description
    description: str = Field(description="Natural language description for embedding")
    embedding: list[float] | None = Field(default=None, description="Pre-computed description embedding")

    # Discrete capabilities (𝐬) - for exact matching and UMES
    capabilities: list[str] = Field(description="Required capabilities (e.g., 'documents:read')")

    # Policy flags (𝐩) - UMES entitlement requirements
    required_entitlements: list[str] = Field(default_factory=list, description="UMES entitlements required")

    # Resource hints (𝐫) - for cost-aware routing
    estimated_latency_ms: int = Field(default=100, description="Expected latency in ms")
    cost_tier: Literal["free", "low", "medium", "high"] = Field(default="low")

    # Behavioral spec (𝐞) - operational constraints
    idempotent: bool = Field(default=True, description="Safe to retry")
    side_effects: Literal["none", "read", "write", "delete"] = Field(default="read")


# Example profiles
TOOL_PROFILES: dict[str, ToolCapabilityProfile] = {
    "delete_knowledge_pack": ToolCapabilityProfile(
        name="delete_knowledge_pack",
        description="Delete a knowledge pack from HexGraph. Removes all associations and optionally deletes documents.",
        capabilities=["knowledge_packs:delete"],
        required_entitlements=["admin:system"],
        estimated_latency_ms=500,
        cost_tier="medium",
        idempotent=False,
        side_effects="delete",
    ),
    "query": ToolCapabilityProfile(
        name="query",
        description="Execute a HiRAG query against documents with multiple retrieval modes.",
        capabilities=["queries:execute", "documents:read"],
        required_entitlements=["documents:read"],
        estimated_latency_ms=200,
        cost_tier="low",
        idempotent=True,
        side_effects="read",
    ),
    "search_requirements": ToolCapabilityProfile(
        name="search_requirements",
        description="Search regulatory requirements within a framework. Returns verbatim legal text with citations.",
        capabilities=["regulatory:read"],
        required_entitlements=["regulatory:read"],
        estimated_latency_ms=150,
        cost_tier="low",
        idempotent=True,
        side_effects="read",
    ),
}

Semantic Tool Search (FoA-Style)

import numpy as np
from sentence_transformers import SentenceTransformer

class ToolRegistry:
    """FoA-inspired semantic tool registry with UMES policy enforcement."""

    def __init__(self):
        self.encoder = SentenceTransformer('nomic-ai/nomic-embed-text-v1.5', trust_remote_code=True)
        self.profiles: dict[str, ToolCapabilityProfile] = {}
        self.embeddings: np.ndarray | None = None
        self.tool_names: list[str] = []

    def register(self, profile: ToolCapabilityProfile):
        """Register a tool with pre-computed embedding."""
        if profile.embedding is None:
            profile.embedding = self.encoder.encode(profile.description).tolist()
        self.profiles[profile.name] = profile
        self._rebuild_index()

    def _rebuild_index(self):
        """Rebuild HNSW index (simplified: numpy array for now)."""
        self.tool_names = list(self.profiles.keys())
        self.embeddings = np.array([
            self.profiles[name].embedding
            for name in self.tool_names
        ])

    def search(
        self,
        query: str,
        user_capabilities: set[str],
        limit: int = 5,
        include_cost: bool = True,
    ) -> list[dict]:
        """
        FoA-style semantic search with policy enforcement.

        α = sim(query, tool) · 𝕀[required ⊆ granted] · cost_penalty
        """
        # Embed the query and compute cosine similarities; both sides are
        # normalized so the dot product is a true similarity in [-1, 1]
        query_embedding = self.encoder.encode(query)
        query_embedding = query_embedding / np.linalg.norm(query_embedding)
        doc_norms = np.linalg.norm(self.embeddings, axis=1, keepdims=True)
        similarities = (self.embeddings / doc_norms) @ query_embedding

        results = []
        for i, tool_name in enumerate(self.tool_names):
            profile = self.profiles[tool_name]

            # Policy gate: 𝕀[required βŠ† granted]
            required = set(profile.required_entitlements)
            if not required.issubset(user_capabilities):
                continue  # Hard gate - not entitled

            # Semantic score
            semantic_score = float(similarities[i])

            # Cost penalty (optional)
            cost_penalty = 1.0
            if include_cost:
                cost_multipliers = {"free": 1.0, "low": 0.95, "medium": 0.85, "high": 0.7}
                cost_penalty = cost_multipliers.get(profile.cost_tier, 1.0)

            # Final score
            final_score = semantic_score * cost_penalty

            results.append({
                "name": tool_name,
                "description": profile.description,
                "capabilities": profile.capabilities,
                "score": round(final_score, 4),
                "latency_ms": profile.estimated_latency_ms,
                "side_effects": profile.side_effects,
            })

        # Sort by score descending
        results.sort(key=lambda x: x["score"], reverse=True)
        return results[:limit]

Usage Example

# Agent searches for tools
registry = ToolRegistry()
for profile in TOOL_PROFILES.values():
    registry.register(profile)

# User with limited entitlements
user_caps = {"regulatory:read", "documents:read"}

# Search: "I need to find GDPR data retention requirements"
results = registry.search(
    query="find GDPR data retention requirements",
    user_capabilities=user_caps,
    limit=3,
)
# Returns: [search_requirements, query], ranked by score
# Does NOT return: delete_knowledge_pack (user lacks admin:system)

Architecture Comparison

Aspect               Pack System              FoA-Style Semantic
Discovery            Know pack names          Natural language query
Matching             Exact pack membership    Semantic similarity + policy gate
UMES Integration     Pack-level grants        Capability-level gates
Scalability          O(packs)                 O(log n) with HNSW
Context efficiency   Load N tools per pack    Load top-K matches
Future-proof         Poor                     Excellent

Recommendation: Capability-Based (Implemented)

We implemented a simpler capability-based system instead of full FoA semantic search:

Rationale:

  1. UMES is the priority - Capabilities map directly to entitlements
  2. Keeps what works - Pack system unchanged for context management
  3. No ML dependencies - Simpler ops, no embedding service required
  4. Clear upgrade path - Can add semantic search later if needed
  5. 32 tools is manageable - Semantic search overkill for current scale

What We Implemented:

  • 12 capabilities: regulatory:read, documents:read/write/delete, queries:execute, conversations:read/manage, collections:manage, views:manage, admin:read, jobs:manage, knowledge_packs:delete
  • 4 predefined roles: viewer, analyst, editor, admin (hierarchical)
  • filter_tools_by_capabilities() function for UMES integration
  • Backward compatible: no restrictions if UMES not configured

Sources

  • arXiv:2509.20175 - Federation of Agents: A Semantics-Aware Communication Fabric for Large-Scale Agentic AI
  • arXiv HTML - Full paper with diagrams