PAI Observability Requirements: Summary

This document outlines a three-tier observability system for PAI (Personal AI) agents, emphasising local-first event capture with optional cloud integration.

Core Architecture

The system captures events to JSONL files locally, making them queryable immediately via Unix tools (tail, grep, jq) without requiring infrastructure. Optional components include a collector daemon and observability stack (VictoriaMetrics + Grafana) for historical analysis and alerting.
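The same queries that tail, grep and jq answer on the command line can also be run programmatically. The following is a minimal sketch rather than part of the specified tooling; the `PaiEvent` shape and the helper names `parseJsonl` and `countByType` are assumptions made for illustration.

```typescript
// Hypothetical event shape: only event_type and ts are assumed here.
interface PaiEvent {
  event_type: string;
  ts: number;
  [key: string]: unknown;
}

// Parse JSONL text, skipping blank and malformed lines
// (for example a last line that is still being written).
function parseJsonl(text: string): PaiEvent[] {
  const events: PaiEvent[] = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue;
    try {
      const parsed = JSON.parse(line);
      if (parsed && typeof parsed.event_type === "string") {
        events.push(parsed as PaiEvent);
      }
    } catch {
      // Not valid JSON: ignore the line and keep going.
    }
  }
  return events;
}

// Count events per type, similar in spirit to
// `jq -r .event_type file.jsonl | sort | uniq -c`.
function countByType(events: PaiEvent[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const e of events) {
    counts[e.event_type] = (counts[e.event_type] ?? 0) + 1;
  }
  return counts;
}
```

To use it, read a daily file with `readFileSync(path, "utf8")` and pass the text to `parseJsonl`. Skipping unparseable lines keeps the query usable while a hook is appending to the same file.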

Key Design Decisions

File-based capture: Events append to daily JSONL files rather than using HTTP, prioritising simplicity and offline resilience.

Distributed push model: The collector pushes events outbound to the observability stack; nothing reaches inward into the PAI environment, maintaining security.

Progressive enhancement: Users start with Stage 0 (CLI-only), advance to Stage 1 (local containerised stack via OrbStack), then to Stage 2+ (central or cloud backends) as needs grow.
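To illustrate the outbound-only direction of the push model, here is a hedged TypeScript sketch. The collector described in this document is launchd + curl, so this is not that implementation. The names `takeCompleteLines`, `pushPending` and `httpSender` are hypothetical, the ingest URL is left as a parameter because no endpoint is specified here, and `fetch` is assumed to be the global available in Node 18+.

```typescript
// Split buffered text into complete lines plus a trailing partial line,
// so a half-written event is never forwarded.
function takeCompleteLines(buffer: string): { lines: string[]; rest: string } {
  const lastNewline = buffer.lastIndexOf("\n");
  if (lastNewline === -1) return { lines: [], rest: buffer };
  const lines = buffer
    .slice(0, lastNewline)
    .split("\n")
    .filter((l) => l.trim() !== "");
  return { lines, rest: buffer.slice(lastNewline + 1) };
}

// A sender resolves to true when the batch was accepted.
type Sender = (body: string) => Promise<boolean>;

// Push all complete lines outbound. On failure the pending text is returned
// unchanged, so the same batch is retried on the next run.
async function pushPending(pending: string, send: Sender): Promise<string> {
  const { lines, rest } = takeCompleteLines(pending);
  if (lines.length === 0) return pending;
  const ok = await send(lines.join("\n") + "\n");
  return ok ? rest : pending;
}

// Example sender: a plain HTTP POST to a caller-supplied URL (placeholder).
function httpSender(url: string): Sender {
  return async (body) => {
    try {
      const res = await fetch(url, { method: "POST", body });
      return res.ok;
    } catch {
      return false; // Network error: keep the batch for the next attempt.
    }
  };
}
```

The collector only initiates requests and never listens on a port, which matches the principle that nothing reaches inward. Because events stay in the local file until a push succeeds, a backend outage delays forwarding without losing data.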

Skill Invocation Logging (Updated)

Recommended approach: Use a PreToolUse hook on the Skill tool rather than pattern matching at UserPromptSubmit. The hook fires only when a skill is actually invoked, so every logged event corresponds to a real invocation.

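// hooks/SkillInvocation.hook.ts
// PreToolUse matcher: "Skill"
// Assumes the hook runner exposes TOOL_INPUT (JSON) and SESSION_ID as environment variables.

import { appendFileSync, mkdirSync } from "fs";

// Parse defensively: a malformed payload should never block the tool call.
let toolInput: { skill?: string; args?: unknown } = {};
try {
  toolInput = JSON.parse(process.env.TOOL_INPUT || "{}") ?? {};
} catch {
  process.exit(0);
}

const sessionId = process.env.SESSION_ID || "unknown";
const skillName = toolInput.skill;
const paiDir = process.env.PAI_DIR;

// Nothing to log if no skill name was passed or PAI_DIR is unset.
if (!skillName || !paiDir) process.exit(0);

const event = {
  event_type: "skill.invoked",
  ts: Date.now(), // Unix epoch milliseconds
  session_id: sessionId,
  skill: skillName,
  args: toolInput.args ?? null,
};

// Daily file named by UTC date (YYYY-MM-DD); create the directory on first use.
const today = new Date().toISOString().split("T")[0];
const eventsDir = `${paiDir}/MEMORY/Events`;
mkdirSync(eventsDir, { recursive: true });
appendFileSync(`${eventsDir}/${today}.jsonl`, JSON.stringify(event) + "\n");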

Why PreToolUse over UserPromptSubmit pattern matching:

Aspect	UserPromptSubmit (Pattern)	PreToolUse (Skill)
Accuracy	Speculative (pattern matched)	Definitive (skill IS firing)
Skill name	Inferred from pattern	Exact from TOOL_INPUT
False positives	Yes (pattern matches, skill doesn't fire)	No
False negatives	Yes (synonym used, pattern misses)	No

This approach logs what actually happens rather than what pattern matching predicts will happen.

Infrastructure Choices

OrbStack vs Docker Desktop: OrbStack consumes ~300MB RAM versus ~2GB for Docker Desktop, making it the recommended container runtime.

VictoriaMetrics stack: Selected over Grafana LGTM for superior compression (15x better than Loki) and lower resource footprint (~350-450MB versus ~800MB-1.2GB).

Collector approach: Begin with native launchd + curl (zero dependencies), upgrade to Vector only if visibility into the collector itself becomes necessary.

Implementation Phases

Phase 0 establishes local JSONL capture and CLI querying. Phase 1 adds background collection with self-monitoring via watchdog. Phase 2 deploys the containerised observability stack. Phase 3 implements Grafana alerting for autonomous agent scenarios.
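The Phase 1 watchdog is not specified in detail here. One possible shape, shown purely as an assumption, is a check against a heartbeat record that the collector updates after each push. The `Heartbeat` fields, the `checkCollector` name and the default thresholds below are all hypothetical.

```typescript
// Hypothetical heartbeat record written by the collector after each successful push.
interface Heartbeat {
  lastPushTs: number; // Unix epoch milliseconds of the last successful push
  pendingLines: number; // events read locally but not yet forwarded
}

type Health = "ok" | "stalled" | "backlogged";

// Classify collector health. The default thresholds are illustrative only.
function checkCollector(
  hb: Heartbeat,
  nowMs: number,
  maxSilenceMs: number = 5 * 60 * 1000,
  maxPending: number = 1000,
): Health {
  if (nowMs - hb.lastPushTs > maxSilenceMs) return "stalled";
  if (hb.pendingLines > maxPending) return "backlogged";
  return "ok";
}
```

A watchdog along these lines could run on a timer and append its own event to the daily JSONL file when the result is not "ok", so collector failures show up in the same place as everything else.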

Security Principles

No connection ever reaches inward into the PAI environment; credentials and sensitive fields are scrubbed before capture; and all data remains local until it is explicitly forwarded outbound.
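As a rough sketch of scrubbing before capture, the function below redacts values whose key names look sensitive, recursing through nested objects and arrays. The key pattern and the "[REDACTED]" placeholder are assumptions, not a list defined by this document, and a real deployment would probably also need value-based checks.

```typescript
// Key names treated as sensitive. This pattern is an illustrative assumption.
const SENSITIVE_KEY = /(password|passwd|secret|token|api[_-]?key|authorization)/i;

// Return a copy of the value with sensitive fields replaced by a placeholder.
// The input is not mutated.
function scrub(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(scrub);
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = SENSITIVE_KEY.test(key) ? "[REDACTED]" : scrub(inner);
    }
    return out;
  }
  return value; // Primitives pass through unchanged.
}
```

Calling `scrub(event)` immediately before `JSON.stringify` would ensure that secrets never reach the JSONL file, so nothing forwarded later can contain them.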

Related Discussion

pai-skill-enforcer#1 — Architectural discussion on deterministic matching for enforcement vs discovery
Claude Code Skills Best Practices — Reference for skill development patterns

mellanon/2026-01-22-PAI-Observability-Requirements.md
