@r33drichards
Created December 17, 2025 17:23

CUA SDK Telemetry - Testing & Deployment Guide

Testing Before Deployment

1. Test CUA SDK Locally

# From an existing clone, check out the core branch
cd /home/alpine/cua
git checkout feat/otel-sentry-core

# Install core with telemetry dependencies
cd libs/python/core
pip install -e ".[telemetry]"

# Test agent instrumentation
git checkout feat/otel-sentry-agent
cd ../agent
pip install -e ".[telemetry]"

# Test computer instrumentation
git checkout feat/otel-sentry-computer
cd ../computer
pip install -e ".[telemetry]"

2. Quick Local Test Script

# test_telemetry.py
import os

# Point to local collector or production endpoint
os.environ["CUA_OTEL_ENDPOINT"] = "https://otel.cua.ai"  # or "http://localhost:4318" for local testing

from core.telemetry import record_operation, record_error, is_otel_enabled
import time

print(f"OTEL enabled: {is_otel_enabled()}")

# Test recording metrics
start = time.perf_counter()  # monotonic clock, safe for measuring elapsed time
time.sleep(0.1)  # Simulate operation
record_operation(
    operation="test.operation",
    duration_seconds=time.perf_counter() - start,
    status="success"
)

record_error(error_type="TestError", operation="test.operation")
print("Metrics recorded successfully")
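The time-then-report pattern in the script above can be wrapped in a context manager. The sketch below is a hypothetical helper, not part of the CUA SDK: the name timed_operation is invented, the "error" status value is an assumption (the script only shows "success"), and the recorder callables are passed in so the block runs without core.telemetry installed. It relies only on the keyword arguments shown in test_telemetry.py.

```python
import time
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def timed_operation(
    operation: str,
    record_operation: Callable[..., None],
    record_error: Callable[..., None],
) -> Iterator[None]:
    """Time the wrapped block and report it through the given recorders.

    Hypothetical helper: the recorders are expected to accept the same
    keyword arguments used in test_telemetry.py.
    """
    start = time.perf_counter()
    status = "success"
    try:
        yield
    except Exception as exc:
        status = "error"  # assumed status value; not confirmed by the SDK
        record_error(error_type=type(exc).__name__, operation=operation)
        raise  # re-raise so the caller still sees the failure
    finally:
        # Runs on both the success and the failure path.
        record_operation(
            operation=operation,
            duration_seconds=time.perf_counter() - start,
            status=status,
        )
```

With the real SDK this would presumably be used as `with timed_operation("test.operation", record_operation, record_error): ...`, passing the functions imported from core.telemetry.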

3. Test with Local OTEL Collector (Optional)

# Run a local OTEL collector for testing
docker run -p 4318:4318 -p 4317:4317 \
  otel/opentelemetry-collector-contrib:latest

# Then set your endpoint
export CUA_OTEL_ENDPOINT="http://localhost:4318"
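To confirm the local collector accepts data before involving the SDK, one option is to post a single hand-built data point over OTLP/HTTP with JSON encoding. This is a minimal sketch using only the standard library; the metric name smoke_test_total, the service name, and both function names are placeholders chosen for illustration, and it assumes the collector's OTLP HTTP receiver is listening on port 4318 as in the docker command above.

```python
import json
import time
import urllib.request


def build_counter_payload(metric_name: str, value: int, service_name: str) -> dict:
    """Build a minimal OTLP/JSON body containing one cumulative counter point."""
    now_ns = str(time.time_ns())  # OTLP/JSON encodes 64-bit integers as strings
    return {
        "resourceMetrics": [{
            "resource": {"attributes": [
                {"key": "service.name", "value": {"stringValue": service_name}},
            ]},
            "scopeMetrics": [{
                "scope": {"name": "manual-smoke-test"},
                "metrics": [{
                    "name": metric_name,
                    "sum": {
                        "aggregationTemporality": 2,  # 2 = cumulative
                        "isMonotonic": True,
                        "dataPoints": [{
                            "asInt": str(value),
                            "startTimeUnixNano": now_ns,
                            "timeUnixNano": now_ns,
                        }],
                    },
                }],
            }],
        }]
    }


def post_metrics(endpoint: str, payload: dict, timeout: float = 5.0) -> int:
    """POST the payload to <endpoint>/v1/metrics and return the HTTP status."""
    request = urllib.request.Request(
        endpoint.rstrip("/") + "/v1/metrics",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `post_metrics("http://localhost:4318", build_counter_payload("smoke_test_total", 1, "smoke-test"))` should return a 2xx status if the receiver is up; a connection error suggests the port mapping or receiver configuration needs attention.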

4. Verify Metrics in Grafana

After running test operations, check:

  • Grafana: https://grafana.cua.ai
  • Query: cua_sdk_operations_total or cua_sdk_operation_duration_seconds_bucket
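The raw metric names above are usually wrapped in a rate or quantile expression before they are useful on a panel. The sketch below builds two such PromQL expressions and a Prometheus HTTP API query URL. The function names, the 5m window, and the 0.95 quantile are illustrative choices, not values taken from the dashboard in this guide.

```python
from urllib.parse import urlencode


def operations_rate(window: str = "5m") -> str:
    """PromQL for overall operations per second, averaged over the window."""
    return f"sum(rate(cua_sdk_operations_total[{window}]))"


def latency_quantile(quantile: float = 0.95, window: str = "5m") -> str:
    """PromQL estimating a latency quantile from the histogram buckets."""
    return (
        f"histogram_quantile({quantile}, "
        f"sum by (le) (rate(cua_sdk_operation_duration_seconds_bucket[{window}])))"
    )


def prometheus_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"
```

The expressions can be pasted into Grafana's query editor as they are, or fetched directly from Prometheus with the generated URL.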

Deployment

Part A: Deploy CUA SDK (PyPI)

Merge Order (dependencies matter):

1. feat/otel-sentry-core  → merge first
2. feat/otel-sentry-agent → merge after core
3. feat/otel-sentry-computer → merge after core
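The merge order above is a small dependency graph: agent and computer each depend on core, but not on each other. As an illustration only, the same constraint can be written down and checked with the standard library's graphlib (Python 3.9+). The merge_order function name is hypothetical.

```python
from graphlib import TopologicalSorter

# Branch -> branches that must be merged before it (taken from the list above).
MERGE_DEPENDENCIES = {
    "feat/otel-sentry-core": set(),
    "feat/otel-sentry-agent": {"feat/otel-sentry-core"},
    "feat/otel-sentry-computer": {"feat/otel-sentry-core"},
}


def merge_order(dependencies: dict) -> list:
    """Return one valid merge order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(dependencies).static_order())
```

Core always comes first; the relative order of agent and computer is not constrained, so either may be merged second.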

Publish to PyPI (after merging):

# Option 1: Manual workflow dispatch (GitHub Actions UI)
# Go to Actions → "Publish Core Package" → Run workflow → Enter version

# Option 2: Git tag trigger
git tag core-v0.1.10
git push origin core-v0.1.10

# Then agent and computer
git tag agent-v<version>
git push origin agent-v<version>

git tag computer-v<version>
git push origin computer-v<version>
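The tags above follow a `<package>-v<version>` pattern. A small helper like the one below can guard against typos before pushing a tag that triggers a publish. It is a sketch: the release_tag name is hypothetical, and the three-part numeric version check is an assumption based on the core-v0.1.10 example, not a rule stated by the publish workflows.

```python
import re

PACKAGES = ("core", "agent", "computer")
_VERSION_RE = re.compile(r"\d+\.\d+\.\d+")  # assumed MAJOR.MINOR.PATCH form


def release_tag(package: str, version: str) -> str:
    """Return the git tag for a package release, e.g. 'core-v0.1.10'."""
    if package not in PACKAGES:
        raise ValueError(f"unknown package: {package!r}")
    if not _VERSION_RE.fullmatch(version):
        raise ValueError(f"version must look like 0.1.10, got {version!r}")
    return f"{package}-v{version}"
```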

Part B: Deploy Cloud Dashboard (NixOS)

Option 1: Automatic (after merge to main)

# Merge the PR
gh pr merge 562 --merge

# The NixOS instance auto-rebuilds from main via timer
# Or trigger manually:
ssh -i ~/.ssh/rw.pem root@35.92.213.109 "systemctl start nixos-rebuild.service"

Option 2: Test Branch First

# Deploy from branch before merging
ssh -i ~/.ssh/rw.pem root@35.92.213.109 \
  "rebuild --flake 'github:trycua/cloud/feat/cua-sdk-dashboard?dir=nixos/alertmanager#alertmanager'"

Verify deployment:

# Check services are running
ssh -i ~/.ssh/rw.pem root@35.92.213.109 \
  "systemctl status prometheus alertmanager grafana --no-pager"

# Verify alert rules loaded
ssh -i ~/.ssh/rw.pem root@35.92.213.109 \
  "curl -s http://localhost:9090/api/v1/rules | jq '.data.groups[].rules[].name' | grep -i cua"

# Verify dashboard exists
ssh -i ~/.ssh/rw.pem root@35.92.213.109 \
  "curl -s http://localhost:3000/api/search | jq '.[].title' | grep -i cua"
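When jq is not available on the machine running the checks, the same filtering can be done in Python on the saved JSON responses. The sketch below mirrors the two jq and grep pipelines above; the function names are hypothetical, and it assumes only the response fields those pipelines already rely on (data.groups[].rules[].name and a list of objects with a title).

```python
def cua_rule_names(rules_response: dict) -> list:
    """Rule names containing 'cua' from a Prometheus /api/v1/rules response."""
    names = []
    for group in rules_response.get("data", {}).get("groups", []):
        for rule in group.get("rules", []):
            name = rule.get("name", "")
            if "cua" in name.lower():  # case-insensitive, like grep -i
                names.append(name)
    return names


def cua_dashboard_titles(search_response: list) -> list:
    """Dashboard titles containing 'cua' from a Grafana /api/search response."""
    return [
        item["title"]
        for item in search_response
        if "cua" in item.get("title", "").lower()
    ]
```

Either function can be fed the output of the curl commands above after parsing it with json.load; an empty result suggests the rules or dashboard were not loaded.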

Post-Deployment Verification Checklist

Check               Command/URL
Dashboard visible   https://grafana.cua.ai → Search "CUA SDK"
Alert rules active  https://grafana.cua.ai/alerting/list
Metrics flowing     Query cua_sdk_operations_total in Grafana
Sentry receiving    Check Sentry project for test errors
OTEL endpoint       curl -X POST https://otel.cua.ai/v1/metrics

Disable Telemetry (if needed)

For users who want to opt out:

export CUA_TELEMETRY_DISABLED=true
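Application code or tests sometimes need to know whether the opt-out is set, for example to skip telemetry assertions. The sketch below shows one way to read the flag. How the SDK itself parses CUA_TELEMETRY_DISABLED is not documented here, so the accepted values other than "true" are an assumption, and telemetry_disabled is a hypothetical name.

```python
import os
from typing import Mapping, Optional

# Assumed set of truthy spellings; only "true" appears in this guide.
_TRUTHY = {"1", "true", "yes", "on"}


def telemetry_disabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if CUA_TELEMETRY_DISABLED is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get("CUA_TELEMETRY_DISABLED", "").strip().lower() in _TRUTHY
```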

Pull Requests

CUA Repository:

  • PR #661 - feat/otel-sentry-core - Core OTEL/Sentry modules
  • PR #662 - feat/otel-sentry-agent - Agent callback instrumentation
  • PR #663 - feat/otel-sentry-computer - Computer interface instrumentation

Cloud Repository:

  • PR #562 - feat/cua-sdk-dashboard - Grafana dashboard + alert rules

Metrics Implemented (Four Golden Signals)

Signal      Metric                              Type
Latency     cua_sdk_operation_duration_seconds  Histogram
Traffic     cua_sdk_operations_total            Counter
Errors      cua_sdk_errors_total                Counter
Saturation  cua_sdk_concurrent_operations       Gauge
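Of the four signals, the saturation gauge is the only one that goes down as well as up: it rises when an operation starts and falls when it ends, including on failure. The sketch below illustrates that bookkeeping in plain Python. ConcurrencyGauge is a hypothetical stand-in for explanation, not the SDK's implementation of cua_sdk_concurrent_operations.

```python
import threading
from contextlib import contextmanager
from typing import Iterator


class ConcurrencyGauge:
    """Thread-safe count of operations currently in flight."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._value = 0

    @property
    def value(self) -> int:
        with self._lock:
            return self._value

    @contextmanager
    def track(self) -> Iterator[None]:
        """Increment on entry and always decrement on exit."""
        with self._lock:
            self._value += 1
        try:
            yield
        finally:
            # finally guarantees the decrement even if the operation raises,
            # so failures cannot leave the gauge stuck at a high value.
            with self._lock:
                self._value -= 1
```

Wrapping each operation in `with gauge.track():` keeps the reading accurate; a gauge that only ever climbs usually points to a missing decrement on an error path.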