@arubis
Created December 20, 2025 03:04
Fixes for migrate-jaeger-to-otel-operator task (UUID: 14a08734-50a8-4768-9d5f-bbe507a432d5)

migrate-jaeger-to-otel-operator Task Fixes

Task UUID: 14a08734-50a8-4768-9d5f-bbe507a432d5

These patches fix the V4 task to work with the current nebula-devops:latest base image.

Problem

The original V4 task failed test-solution because:

  1. The base image now includes a Helm-deployed Jaeger V2, which conflicts with the "legacy" Jaeger the task tries to create
  2. The solution didn't restart Bleater pods after changing Istio's tracing config, so traces never flowed through the new OTEL Collector
  3. The jaeger-query service, which the grader needs in order to verify traces through the Jaeger Query API, was missing
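
The first and third conditions above can be checked before the task runs. What follows is a minimal sketch, not part of the task itself: it assumes `helm` and `kubectl` are on the PATH and reuses the release name `jaeger` and the `observability` namespace from setup.sh. The function names are hypothetical.

```shell
# Pre-flight sketch: report two of the conditions that made the original task fail.
# Release name "jaeger" and namespace "observability" come from setup.sh;
# the function names are hypothetical helpers, not part of the task.

# Succeeds if a Helm release named exactly "jaeger" exists in the namespace.
has_helm_jaeger() {
  helm list -n observability -q 2>/dev/null | grep -qx 'jaeger'
}

# Succeeds if the jaeger-query Service exists.
has_jaeger_query_service() {
  kubectl get service jaeger-query -n observability >/dev/null 2>&1
}

# Prints one line per problem found; prints nothing when both checks pass.
preflight() {
  if has_helm_jaeger; then
    echo "conflict: Helm-managed Jaeger release present"
  fi
  if ! has_jaeger_query_service; then
    echo "missing: jaeger-query service"
  fi
}
```

Calling `preflight` against the unpatched base image would be expected to print both lines; after the patched setup.sh has run, it should print nothing.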

Summary of Changes

| File | Change | Reason |
|------|--------|--------|
| setup.sh | Add Helm uninstall + kubectl delete for existing Jaeger | Base image has Helm-deployed Jaeger V2 that conflicts |
| setup.sh | Rename deployment to jaeger, add OTLP/query ports | Jaeger V2 needs these ports exposed for OTEL Collector to forward traces |
| setup.sh | Add jaeger-query service | Grader checks Jaeger Query API on port 16686 |
| setup.sh | Remove COLLECTOR_ZIPKIN_HOST_PORT env var | Not needed for Jaeger V2 defaults |
| solution.sh | Change logging exporter to debug | logging is deprecated in newer OTEL Collector versions |
| solution.sh | Add istiod rollout status wait | Ensure Istio is ready before restarting Bleater |
| solution.sh | Add Bleater deployment rollout restart + wait | Pods need restart to pick up new Istio tracing config |
| solution.sh | Add traffic generation at end | Ensures traces exist before grader runs |
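
The traffic-generation step in solution.sh sends a single request. If one request proves too sparse, a loop along the following lines is one possible variation. This is only a sketch: the request count and pause are illustrative assumptions, the function name is hypothetical, and the exec target and URL are the ones solution.sh already uses.

```shell
# Sketch: send several requests through the gateway's sidecar so that spans
# exist before grading. Defaults (10 requests, 1 second apart) are assumptions.
generate_traffic() {
  count="${1:-10}"
  pause="${2:-1}"
  i=0
  ok=0
  while [ "$i" -lt "$count" ]; do
    # "ok" counts requests where kubectl/curl exited successfully; curl -s
    # does not fail on HTTP error statuses, so this is not a health check.
    if kubectl exec -n bleater deployment/bleater-api-gateway -c istio-proxy -- \
        curl -s -o /dev/null http://bleater-api-gateway/health; then
      ok=$((ok + 1))
    fi
    i=$((i + 1))
    sleep "$pause"
  done
  echo "sent=$count ok=$ok"
}
```

For example, `generate_traffic 20 1` would attempt twenty requests one second apart and print a one-line summary.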

How to Apply

cd tasks/migrate-jaeger-to-otel-operator
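# Name the target file explicitly: the patch headers carry absolute paths that -p1 does not resolve from this directory
patch setup.sh < setup.sh.patch
patch solution.sh < solution.sh.patch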
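
To avoid leaving a script half-patched when a hunk fails, a dry run can precede the real apply. The helper below is a minimal sketch assuming GNU patch (for `--dry-run`); the function name `apply_checked` is hypothetical.

```shell
# Sketch: apply a patch only if a dry run succeeds, so a rejected hunk
# leaves the target file untouched. Assumes GNU patch.
apply_checked() {
  target="$1"
  patchfile="$2"
  if patch --dry-run "$target" < "$patchfile" >/dev/null 2>&1; then
    patch "$target" < "$patchfile"
  else
    echo "patch does not apply cleanly: $patchfile" >&2
    return 1
  fi
}
```

Usage would look like `apply_checked setup.sh setup.sh.patch && apply_checked solution.sh solution.sh.patch`, run from the task directory.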

Test Result After Fixes

Final Score: 1.0
SUCCESS: Solution achieved full score!

Feedback:
  - Cert Manager is running
  - OTEL Operator is running
  - OpenTelemetryCollector CR exists and Pods are running
  - Verified OTLP Ingest on default-collector (Operator Managed)
  - Istio ConfigMap correctly identifies OpenTelemetry provider
  - Verified traces for all 7 services.
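
The last feedback line can be spot-checked by hand against the Jaeger Query API on port 16686. The sketch below is not the grader's method. It assumes the query endpoint `/api/services` returns a JSON body listing service names, and it does a plain text match rather than a JSON parse. The exact service names Jaeger reports depend on the Istio configuration, so the name in the usage comment is only a guess.

```shell
# Sketch: succeed if the quoted service name appears in the JSON body of
# GET /api/services read from stdin. A fixed-string match, not a JSON parser.
service_has_traces() {
  grep -qF "\"$1\""
}

# Usage (assumes a port-forward to the jaeger-query service is running):
#   kubectl port-forward -n observability svc/jaeger-query 16686:16686 &
#   curl -s http://localhost:16686/api/services | service_has_traces bleater-api-gateway
```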
--- /tmp/task-compare/14a08734-50a8-4768-9d5f-bbe507a432d5/setup.sh 2025-12-19 19:54:48.755097068 -0700
+++ /home/dylan/dev/Nebula/tasks/migrate-jaeger-to-otel-operator/setup.sh 2025-12-19 19:31:56.114344189 -0700
@@ -167,11 +167,19 @@
# Note: Re-using logic from observability-deployment setup for consistency
sudo k3s kubectl create namespace observability --dry-run=client -o yaml | sudo k3s kubectl apply -f -
+# Remove existing Jaeger V2 (from base image) to establish clean legacy state
+echo "[setup] Removing existing Jaeger to establish legacy state..."
+helm uninstall jaeger -n observability 2>/dev/null || true
+kubectl delete deployment jaeger -n observability --ignore-not-found 2>/dev/null
+kubectl delete service jaeger-collector jaeger-query -n observability --ignore-not-found 2>/dev/null
+# Wait briefly for resources to be cleaned up
+sleep 3
+
cat <<EOF | sudo k3s kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
- name: jaeger-collector
+ name: jaeger
namespace: observability
spec:
replicas: 1
@@ -190,9 +198,12 @@
ports:
- containerPort: 9411
name: zipkin
- env:
- - name: COLLECTOR_ZIPKIN_HOST_PORT
- value: ":9411"
+ - containerPort: 4317
+ name: otlp-grpc
+ - containerPort: 4318
+ name: otlp-http
+ - containerPort: 16686
+ name: query
---
apiVersion: v1
kind: Service
@@ -206,6 +217,19 @@
- name: zipkin
port: 9411
targetPort: 9411
+---
+apiVersion: v1
+kind: Service
+metadata:
+ name: jaeger-query
+ namespace: observability
+spec:
+ selector:
+ app: jaeger
+ ports:
+ - name: query
+ port: 16686
+ targetPort: 16686
EOF
echo "[setup] ✓ Legacy Jaeger deployed."
--- /tmp/task-compare/14a08734-50a8-4768-9d5f-bbe507a432d5/solution.sh 2025-12-19 19:54:48.755136286 -0700
+++ /home/dylan/dev/Nebula/tasks/migrate-jaeger-to-otel-operator/solution.sh 2025-12-19 19:18:44.840974285 -0700
@@ -32,10 +32,11 @@
fi
# Create Collector Instance
+# Patch Jaeger service to expose OTLP port (Jaeger V2 listens on 4317 by default)
echo "[solution] Patching Jaeger Service for OTLP..."
kubectl patch service jaeger-collector -n observability --type='json' -p='[{"op": "add", "path": "/spec/ports/-", "value": {"name": "otlp-grpc", "port": 4317, "targetPort": 4317}}]'
-echo "[solution] creating OpenTelemetryCollector..."
+echo "[solution] Creating OpenTelemetryCollector..."
cat <<EOF | kubectl apply -f -
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
@@ -55,8 +56,8 @@
processors:
batch:
exporters:
- logging:
- loglevel: debug
+ debug:
+ verbosity: detailed
otlp:
endpoint: "jaeger-collector.observability.svc.cluster.local:4317"
tls:
@@ -66,7 +67,7 @@
traces:
receivers: [otlp]
processors: [batch]
- exporters: [logging, otlp]
+ exporters: [debug, otlp]
EOF
echo "[solution] Waiting for Collector..."
@@ -83,5 +84,19 @@
}
}'
kubectl rollout restart deployment/istiod -n istio-system
+echo "[solution] Waiting for Istiod to be ready..."
+kubectl rollout status deployment/istiod -n istio-system --timeout=120s
+
+# Restart Bleater pods to pick up new Istio sidecar configuration
+echo "[solution] Restarting Bleater deployments to apply new tracing config..."
+kubectl rollout restart deployment -n bleater
+echo "[solution] Waiting for Bleater deployments to be ready..."
+kubectl rollout status deployment -n bleater --timeout=300s
+
+# Generate some traffic to ensure traces flow
+echo "[solution] Generating traffic to verify traces..."
+sleep 5
+kubectl exec -n bleater deployment/bleater-api-gateway -c istio-proxy -- curl -s http://bleater-api-gateway/health || true
+sleep 3
echo "[solution] Done."