This document defines a decentralized agent system using:
- A2A (Agent-to-Agent) Communication
- UPER-S Methodology (Understand, Plan, Execute, Review, Secure)
- GOAP (Goal-Oriented Action Planning) for decision making
- Hierarchical Self-Learning Architecture with feedback loops
- Candle ML Framework for neural network-based learning
- Testdata-Builder Pattern for deterministic testing
The architecture enables autonomous coordination between agents through a shared SQLite knowledge space, structured task synchronization, and continuous learning from user interactions.
- Understand: Gather context from project space, agent state, current task queue, and learned patterns.
- Plan: Determine optimal next action using GOAP with learned priorities and check for task conflicts.
- Execute: Claim and run the selected task using appropriate ML models.
- Review: Verify output, update knowledge, and adjust learned patterns.
- Secure: Apply cleanup, enforce invariants, and ensure data integrity.
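For illustration, a minimal sketch of one agent tick walking through the five UPER-S phases; the helper methods (`gather_context`, `plan_next_action`, `execute`, `review`, `secure`) are hypothetical placeholders rather than a defined API:

```python
# Hypothetical sketch of a single UPER-S agent tick (helper methods are placeholders).
class UperSAgent:
    def tick(self):
        # Understand: project space, agent state, task queue, learned patterns
        context = self.gather_context()
        # Plan: GOAP selection with learned priorities and conflict checks
        action = self.plan_next_action(context)
        if action is None:
            return  # nothing optimal to do this cycle
        # Execute: claim the task and run it with the chosen ML model
        result = self.execute(action)
        # Review: verify output, update knowledge and learned patterns
        self.review(action, result)
        # Secure: cleanup, invariants, data integrity
        self.secure(action, result)
```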
GOAP agents select and sequence actions based on goals, current state, effects, and learned preferences.
- Agents have goals with dynamic priority weights learned from outcomes
- Each goal has preconditions (what must be true) and effects (what it achieves)
- Actions are selected using both static rules and learned patterns
- No centralized controller
- All agents read from and write to a shared SQLite knowledge base
- Each agent must check task state and learned priorities before acting
- Agents wait, reschedule, or preempt based on priority rules
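The sketch below (illustrative field names, not a fixed schema) shows one way goals with preconditions, effects, and learned priority weights could be represented and selected:

```python
from dataclasses import dataclass

@dataclass
class Goal:
    name: str
    preconditions: dict          # what must be true, e.g. {"build_active": False}
    effects: dict                # what it achieves, e.g. {"artifacts_ready": True}
    base_priority: float = 50.0
    learned_weight: float = 1.0  # adjusted from observed outcomes

    def priority(self) -> float:
        return self.base_priority * self.learned_weight

def select_goal(goals, world_state):
    """Pick the highest-priority goal whose preconditions hold in the world state."""
    ready = [g for g in goals
             if all(world_state.get(k) == v for k, v in g.preconditions.items())]
    return max(ready, key=lambda g: g.priority(), default=None)
```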
| State | Action | Agent Behavior |
|---|---|---|
| IDLE | Check conflicts + priorities | Claim if optimal |
| WAITING | Monitor heartbeat + learn patterns | Resume when optimal |
| RUNNING | Maintain heartbeat, execute task | Exclusive execution |
| STALE | Cleanup takes over | Other agent can reclaim |
| COMPLETED | Release and log for learning | Update context and goals |
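A minimal sketch, assuming the shared `tasks` table has `state`, `owner`, and `heartbeat` columns, of how an agent could claim an IDLE task atomically and keep its heartbeat fresh so the Cleaner Agent does not mark it STALE:

```python
import sqlite3, time

def try_claim_task(db_path, task_id, agent_id):
    """Atomically move an IDLE task to RUNNING for this agent (assumed columns)."""
    con = sqlite3.connect(db_path)
    with con:
        cur = con.execute(
            "UPDATE tasks SET state = 'RUNNING', owner = ?, heartbeat = ? "
            "WHERE id = ? AND state = 'IDLE'",
            (agent_id, time.time(), task_id),
        )
    con.close()
    return cur.rowcount == 1  # False means another agent claimed it first

def heartbeat(db_path, task_id, agent_id):
    """Refresh the heartbeat so the task is not reclaimed as STALE."""
    con = sqlite3.connect(db_path)
    with con:
        con.execute(
            "UPDATE tasks SET heartbeat = ? WHERE id = ? AND owner = ?",
            (time.time(), task_id, agent_id),
        )
    con.close()
```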
```python
# Priority-based Task Management
class TaskPriority:
    CRITICAL = 100   # Deployment failures, security issues
    HIGH = 75        # Test failures, build breaks
    MEDIUM = 50      # Regular builds, deployments
    LOW = 25         # Analysis, optimizations
    BACKGROUND = 10  # Logging, metrics, cleanup

class TaskClaim:
    def can_preempt(self, current_task, new_task):
        priority_diff = new_task.priority - current_task.priority
        min_priority_gap = self.learned_preemption_threshold
        return priority_diff >= min_priority_gap
```
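For example, a CRITICAL task would preempt a running MEDIUM task only if the gap exceeds the learned threshold; the task objects below are illustrative stand-ins:

```python
from types import SimpleNamespace

# Illustrative use of the preemption rule above.
claim = TaskClaim()
claim.learned_preemption_threshold = 30                     # learned, not hard-coded

running = SimpleNamespace(priority=TaskPriority.MEDIUM)     # 50, e.g. a regular build
incoming = SimpleNamespace(priority=TaskPriority.CRITICAL)  # 100, e.g. a security issue

if claim.can_preempt(running, incoming):
    # 100 - 50 = 50 >= 30, so the build is paused and the critical task claimed
    ...
```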
subgraph "Architecture Understanding"
ARCHDET[Architecture Detector<br/>MVC, Microservice, Layered]
MODULAR[Module Analyzer<br/>group related files]
LAYER[Layer Detector<br/>UI, Business, Data]
BOUNDARY[Boundary Detector<br/>module interfaces]
end
subgraph "Knowledge Graph"
MODULES[(Module Graph<br/>dependencies, relationships)]
PATTERNS[(Pattern Library<br/>learned architectures)]
CONTEXT[(Context Map<br/>entity associations)]
USAGE[(Usage Patterns<br/>query→result success)]
end
ARCHDET --> PATTERNS
MODULAR --> MODULES
LAYER --> MODULES
BOUNDARY --> MODULES
```

```rust
// Using Candle framework for neural learning
use candle_core::{Device, Tensor, DType};
use candle_nn::{Module, Optimizer, VarBuilder};

struct SelfLearningEngine {
    // Neural network for pattern recognition
    pattern_net: candle_nn::Sequential,
    // Context association model
    context_net: candle_nn::Sequential,
    optimizer: candle_nn::AdamW,
    feedback_buffer: Vec<LearningSample>,
}

impl SelfLearningEngine {
    fn process_feedback(&mut self, sample: &LearningSample) -> Result<()> {
        // Convert feedback to tensors
        let features = self.encode_feedback(sample)?;
        let targets = self.compute_targets(sample)?;
        // Forward pass
        let predictions = self.pattern_net.forward(&features)?;
        // Compute loss and backward pass
        let loss = self.loss_fn(&predictions, &targets)?;
        self.optimizer.backward_step(&loss)?;
        // Update knowledge graph
        self.update_knowledge_graph(sample, &predictions)?;
        Ok(())
    }

    fn predict_relevance(&self, query: &Query, context: &Context) -> Result<f32> {
        let features = self.encode_query_context(query, context)?;
        let output = self.pattern_net.forward(&features)?;
        Ok(output.get(0)?.to_scalar::<f32>()?)
    }
}
```

```sql
-- Core Tables with Learning Extensions
CREATE TABLE architecture_modules (
id INTEGER PRIMARY KEY,
module_name TEXT,
module_type TEXT, -- 'feature', 'layer', 'utility'
layer_classification TEXT, -- 'presentation', 'business', 'persistence'
dependencies JSON, -- Module relationships
confidence_score REAL,
last_updated TIMESTAMP
);
CREATE TABLE learned_patterns (
id INTEGER PRIMARY KEY,
pattern_type TEXT, -- 'architectural', 'usage', 'query'
pattern_data JSON,
confidence REAL,
usage_count INTEGER,
success_rate REAL,
last_verified TIMESTAMP
);
CREATE TABLE feedback_loops (
id INTEGER PRIMARY KEY,
query_hash TEXT,
result_clicked TEXT,
dwell_time_ms INTEGER,
success_score REAL, -- -1.0 to 1.0
context_features JSON,
learned_insight TEXT,
created_at TIMESTAMP
);
CREATE TABLE model_performance (
model_name TEXT,
task_type TEXT,
architecture_pattern TEXT,
success_rate REAL,
avg_latency_ms INTEGER,
context_understanding_score REAL,
last_updated TIMESTAMP
);
```
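As a usage sketch against the schema above, an agent could record a user interaction into `feedback_loops` for the Learning Agent to consume; the dwell-time-to-score heuristic is an assumption, not part of the design:

```python
import hashlib, json, sqlite3, time

def record_feedback(db_path, query, clicked_result, dwell_time_ms, context_features):
    """Insert one interaction into feedback_loops (success heuristic is illustrative)."""
    # Assumption: longer dwell time maps to a higher success score in [-1.0, 1.0].
    success_score = max(-1.0, min(1.0, dwell_time_ms / 10_000 - 0.5))
    con = sqlite3.connect(db_path)
    with con:
        con.execute(
            "INSERT INTO feedback_loops "
            "(query_hash, result_clicked, dwell_time_ms, success_score, "
            " context_features, created_at) VALUES (?, ?, ?, ?, ?, ?)",
            (
                hashlib.sha256(query.encode()).hexdigest(),
                clicked_result,
                dwell_time_ms,
                success_score,
                json.dumps(context_features),
                time.strftime("%Y-%m-%d %H:%M:%S"),
            ),
        )
    con.close()
```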
| Agent Type | Goal | Preconditions | Effects | Learning Component |
|---|---|---|---|---|
| Builder Agent | Build | No active build, dependencies ready | Build artifacts created | Learns build patterns and optimizations |
| Tester Agent | Test | Build successful, test env ready | Test results recorded | Learns test priorities and flakiness patterns |
| Deployer Agent | Deploy | Tests passed, env available | Deployment executed | Learns deployment strategies and rollback triggers |
| Analyzer Agent | AnalyzeArchitecture | Code changes detected | Architecture graph updated | Hierarchical pattern detection |
| Learning Agent | ProcessFeedback | New user interactions | Knowledge graphs updated | Neural network training |
| Router Agent | RouteModel | LLM request received | Optimal model selected | Learns model performance per context |
| Cleaner Agent | Cleanup | Stale task detected | Resources reclaimed | Learns optimal retention policies |
```python
class LearningGOAPAgent:
    def plan_with_learning(self, goal, world_state):
        # Phase 1: Architecture-aware planning
        architecture_context = self.analyzer.get_architecture_context(goal)
        relevant_modules = self.scope_to_modules(goal, architecture_context)

        # Phase 2: Learned priority adjustment
        learned_priorities = self.learning_engine.get_goal_priorities(
            goal, world_state, architecture_context
        )

        # Phase 3: Multi-strategy action planning
        candidates = self.generate_action_candidates(relevant_modules)
        ranked_actions = self.rank_with_learned_patterns(candidates)

        # Phase 4: Model selection based on architecture
        optimal_model = self.router.select_model_for_context(
            goal, architecture_context
        )
        return ranked_actions, optimal_model

    def execute_with_feedback(self, action, model):
        result = super().execute(action, model)

        # Record execution for learning
        feedback = ExecutionFeedback(
            action=action,
            model_used=model,
            architecture_context=self.current_architecture_context,
            outcome=result.success,
            performance_metrics=result.metrics
        )
        self.learning_engine.record_feedback(feedback)
        return result
```
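A hedged sketch of the resulting planning/execution cycle; construction of `goal`, `world_state`, and the agent's collaborators is assumed and omitted:

```python
# Illustrative planning/execution cycle (agent wiring and goal objects assumed).
agent = LearningGOAPAgent()
ranked_actions, model = agent.plan_with_learning(goal, world_state)

for action in ranked_actions:
    result = agent.execute_with_feedback(action, model)
    if not result.success:
        # Failed executions still produce feedback, so priorities and
        # model routing improve on the next planning pass.
        break
```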
```yaml
# Enhanced Model Routing with Architecture Patterns
model_routing:
  planning:
    complex_architecture: "gpt-4"
    simple_monolith: "claude-3"
    microservices: "local-llm+graph-context"
  code_generation:
    frontend: "claude-3"
    backend: "gpt-4"
    infrastructure: "local-llm"
  analysis:
    architectural: "gpt-4+graph-analysis"
    code_quality: "claude-3"
    performance: "local-llm"
  fallback_strategy: "architecture_aware_fallback"
```
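A minimal sketch of how the Router Agent could resolve a model from this configuration, assuming it is stored in a file such as `model_routing.yaml` and read with PyYAML:

```python
import yaml  # PyYAML

def select_model(config_path, task_type, context_key):
    """Look up a model in the routing table; return the fallback strategy name if no match."""
    with open(config_path) as f:
        routing = yaml.safe_load(f)["model_routing"]
    table = routing.get(task_type, {})
    return table.get(context_key, routing["fallback_strategy"])

# e.g. select_model("model_routing.yaml", "planning", "microservices")
# -> "local-llm+graph-context"
```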
```python
class ArchitectureDetectionAgent:
    def detect_architecture_patterns(self, codebase):
        # Level 1: Directory and file structure analysis
        module_structure = self.analyze_module_structure(codebase)

        # Level 2: Dependency graph analysis
        dependency_graph = self.build_dependency_graph(codebase)

        # Level 3: Architectural pattern matching
        patterns = self.match_known_patterns(module_structure, dependency_graph)

        # Level 4: Boundary detection
        boundaries = self.detect_module_boundaries(patterns, dependency_graph)

        # Update knowledge graph
        self.update_architecture_knowledge(patterns, boundaries)

        return ArchitectureContext(
            patterns=patterns,
            boundaries=boundaries,
            modules=module_structure,
            confidence_scores=self.calculate_confidence(patterns)
        )
```

- Enhanced SQLite schema with architecture tables
- Basic hierarchical reasoning agents
- Architecture detection pipeline
- Candle framework integration
- Neural network models for pattern recognition
- Feedback collection and processing system
- Knowledge graph population and querying
- Basic self-learning loops
- Multi-level architecture understanding
- Context-aware model routing
- Advanced GOAP with learned priorities
- Performance optimization and scaling
- Advanced monitoring with architecture metrics
- Security hardening for ML components
- Load testing and optimization
- Documentation and deployment automation
```mermaid
graph TD
subgraph "Enhanced SQLite Knowledge Space"
AgentsTable[agents]
TasksTable[tasks + learned_priority]
ArchModules[architecture_modules]
Patterns[learned_patterns]
Feedback[feedback_loops]
ModelPerf[model_performance]
end
subgraph "Core Agent Layer"
Builder[Builder Agent]
Tester[Tester Agent]
Deployer[Deployer Agent]
Analyzer[Analyzer Agent<br/>Architecture Detection]
Learner[Learning Agent<br/>Candle ML Engine]
Router[Router Agent]
Cleaner[Cleaner Agent]
end
subgraph "Hierarchical Reasoning"
ArchDetect[Architecture Detector]
ModuleAnalyzer[Module Analyzer]
BoundaryDetect[Boundary Detector]
PatternMatcher[Pattern Matcher]
end
ArchDetect --> ArchModules
ModuleAnalyzer --> ArchModules
BoundaryDetect --> ArchModules
PatternMatcher --> Patterns
Analyzer --> ArchDetect
Analyzer --> ModuleAnalyzer
Analyzer --> BoundaryDetect
Analyzer --> PatternMatcher
Learner --> Feedback
Learner --> Patterns
Learner --> ModelPerf
Router --> ModelPerf
Router --> Patterns
Builder --> TasksTable
Tester --> TasksTable
Deployer --> TasksTable
Analyzer --> ArchModules
Learner --> Feedback
Router --> ModelPerf
classDef knowledge fill:#fff3e0,stroke:#e65100
classDef agent fill:#e3f2fd,stroke:#1565c0
classDef reasoning fill:#f3e5f5,stroke:#6a1b9a
class AgentsTable,TasksTable,ArchModules,Patterns,Feedback,ModelPerf knowledge
class Builder,Tester,Deployer,Analyzer,Learner,Router,Cleaner agent
class ArchDetect,ModuleAnalyzer,BoundaryDetect,PatternMatcher reasoning
```

```rust
// Candle-based learning implementation
impl LearningAgent {
    pub fn continuous_learning_loop(&mut self) -> Result<()> {
        loop {
            // Collect new feedback
            let new_samples = self.feedback_collector.collect_recent()?;
            if !new_samples.is_empty() {
                // Batch process for efficiency
                let batch = self.create_training_batch(new_samples)?;
                // Train pattern recognition network
                self.pattern_trainer.train_batch(&batch)?;
                // Update context association model
                self.context_trainer.update_associations(&batch)?;
                // Adjust architecture pattern confidence
                self.architecture_analyzer.refine_patterns(&batch)?;
                // Update model performance metrics
                self.router.update_model_performance(&batch)?;
            }
            // Sleep with exponential backoff based on feedback volume
            self.adaptive_sleep();
        }
    }

    fn create_training_batch(&self, samples: Vec<FeedbackSample>) -> Result<TrainingBatch> {
        let features: Vec<Tensor> = samples.iter()
            .map(|s| self.encode_feedback_features(s))
            .collect::<Result<_>>()?;
        let targets: Vec<Tensor> = samples.iter()
            .map(|s| self.encode_learning_targets(s))
            .collect::<Result<_>>()?;
        Ok(TrainingBatch { features, targets, samples })
    }
}
```

```sql
-- Learning progress monitoring
CREATE TABLE learning_metrics (
week INTEGER,
query_type TEXT,
architecture_pattern TEXT,
initial_accuracy REAL,
current_accuracy REAL,
learning_rate REAL,
sample_count INTEGER,
measured_at TIMESTAMP
);
```
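The table above can be queried to track whether learning is converging; a small sketch that reports the latest accuracy gain per query type and architecture pattern:

```python
import sqlite3

def convergence_report(db_path):
    """Latest accuracy gain per (query_type, architecture_pattern) from learning_metrics."""
    con = sqlite3.connect(db_path)
    rows = con.execute(
        "SELECT query_type, architecture_pattern, week, "
        "       current_accuracy - initial_accuracy AS gain "
        "FROM learning_metrics m "
        "WHERE week = (SELECT MAX(week) FROM learning_metrics "
        "              WHERE query_type = m.query_type "
        "                AND architecture_pattern = m.architecture_pattern)"
    ).fetchall()
    con.close()
    return rows  # [(query_type, pattern, week, gain), ...]
```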
- Candle Framework Mastery - Efficient tensor operations and model management
- Architecture Pattern Library - Comprehensive known-pattern database
- Feedback Pipeline Reliability - Robust collection and processing
- Knowledge Graph Performance - Efficient querying of complex relationships
- Architecture Detection Accuracy: Target >90% pattern recognition
- Learning Convergence: <4 weeks to 75% accuracy from cold start
- Query Response Time: <2s for architecture-aware searches
- Model Routing Accuracy: >85% optimal model selection
- Cold Start Problem: Seed with common architecture patterns
- Overfitting: Regular validation against diverse codebases
- Performance Degradation: Continuous monitoring and rollback capability
- Knowledge Graph Bloat: Automated pruning and importance scoring
- Cross-Project Learning - Transfer learned patterns between similar projects
- Predictive Architecture - Suggest architecture improvements based on patterns
- Real-time Collaboration - Multiple agents working on large-scale refactoring
- Advanced Neural Architectures - Transformer-based code understanding
- Federated Learning - Privacy-preserving learning across organizations
✅ True Hierarchical Reasoning - Multi-level architecture understanding
✅ Continuous Self-Learning - Candle-based neural learning from feedback
✅ Architecture-Aware Coordination - Context-sensitive task execution
✅ Adaptive Model Routing - Optimal LLM selection based on architectural context
✅ Production-Ready Foundation - Testing, monitoring, and security built-in
✅ Scalable Knowledge Graph - Efficient relationship storage and querying
This evolved architecture represents a significant advancement from basic multi-agent systems to an intelligent, self-improving collective that deeply understands software architecture and continuously enhances its performance through learned experience.