To run the OpenTelemetry demo with Kieker, follow these steps:
- Start the Kieker listener:
git clone -b GH-13-GetOERFromProto git@github.com:kieker-monitoring/kieker.git
cd kieker/
cd tools/otel-transformer
../../gradlew assemble
cd build/distributions
unzip otel-transformer-2.0.3-SNAPSHOT.zip
cd otel-transformer-2.0.3-SNAPSHOT/
bin/otel-transformer -lp 9000 -standard OPENTELEMETRY
(Leave this process running and stop it with Ctrl + C after running the demo.)
- Prepare the OpenTelemetry demo:
git clone https://github.com/open-telemetry/opentelemetry-demo.git
- Edit the repo (go to `opentelemetry-demo` first):
- Edit `src/otel-collector/otelcol-config-extras.yml`: Add (replacing `$MYIP` with your IP):
exporters:
  otlp/example:
    endpoint: http://$MYIP:9000
    tls:
      insecure: true
service:
  pipelines:
    traces:
      exporters: [spanmetrics, otlp/example]
- Edit `docker-compose.yml`: Add `- _JAVA_OPTIONS=-Dotel.instrumentation.methods.include=oteldemo.AdService[main,getInstance,getAdsByCategory,getRandomAds,createAdsMap,start,stop,blockUntilShutdown];oteldemo.AdService.AdServiceImpl[getAds];oteldemo.problempattern.CPULoad[getInstance,execute,spawnLoadWorkers,stopWorkers]` to the AdService
- Run the demo (in `opentelemetry-demo`)
- Run `make start`
- Do anything you'd like to do with the interface
- Run `docker compose down -v`
- Visualize the results within Kieker (replacing `$KIEKERPATH` with the path the listener showed you):
cd tools/trace-analysis
../../gradlew assemble
cd build/distributions
unzip trace-analysis-2.0.3-SNAPSHOT.zip
cd trace-analysis-2.0.3-SNAPSHOT/
mkdir graphs
bin/trace-analysis \
-i $KIEKERPATH \
-o graphs \
--plot-Deployment-Component-Dependency-Graph responseTimes-ms \
--plot-Deployment-Operation-Dependency-Graph responseTimes-ms \
--plot-Aggregated-Deployment-Call-Tree \
--plot-Aggregated-Assembly-Call-Tree
cd graphs
for file in *.dot; do dot -Tpng "$file" -o "$file.png"; done
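If no data arrives at the listener, one thing worth ruling out is that the address configured as `$MYIP:9000` is not reachable at all. The following is a minimal sketch of such a check, not part of Kieker or the demo. The function name `is_port_open` is hypothetical, and it only tests that a TCP connection can be opened, not that OTLP data is accepted.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (refused, unreachable, timeout)
        # when nothing accepts connections on that address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # "127.0.0.1" is a placeholder; use the address you put in for $MYIP.
    print(is_port_open("127.0.0.1", 9000))
```

Run it on the host while `bin/otel-transformer` is still running. A result of `False` suggests checking the listener process or a firewall before looking at the collector configuration.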
After looking deeper into this, I see a conceptual problem.
Kieker assumes that we have operation call traces, which means that the calls are synchronous: once an operation call has finished, we cannot go back to it. OpenTelemetry, on the other hand, allows potentially concurrent calls, which in some cases cannot be put in a serial order.
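To make the distinction concrete, here is a minimal sketch that checks whether the children of each span could be put in a serial order. It is an illustration only and uses neither Kieker nor OpenTelemetry APIs: the `Span` class, the function name, and the timestamps are all hypothetical. It only detects overlapping siblings, which is one way a trace can fail to be synchronous.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical, simplified span: only the fields needed for the check."""
    span_id: str
    parent_id: Optional[str]
    name: str
    start: int  # start timestamp, e.g. in nanoseconds
    end: int    # end timestamp, same unit as start


def find_concurrent_siblings(spans):
    """Return name pairs of sibling spans whose time intervals overlap.

    An empty result means the children of every parent can be serialized;
    any returned pair rules out a purely synchronous call trace.
    """
    children = defaultdict(list)
    for span in spans:
        children[span.parent_id].append(span)

    overlaps = []
    for siblings in children.values():
        siblings.sort(key=lambda s: s.start)
        for i, first in enumerate(siblings):
            for second in siblings[i + 1:]:
                if second.start >= first.end:
                    break  # sorted by start: no later sibling overlaps `first`
                overlaps.append((first.name, second.name))
    return overlaps


# Made-up timestamps, loosely modelled on parallel GetProduct calls.
trace = [
    Span("1", None, "ListProducts", 0, 100),
    Span("2", "1", "GetProduct-A", 10, 40),
    Span("3", "1", "GetProduct-B", 20, 50),
    Span("4", "1", "GetProduct-C", 60, 70),
]
print(find_concurrent_siblings(trace))  # [('GetProduct-A', 'GetProduct-B')]
```

A check like this could flag traces that need special handling before any conversion is attempted. It does not decide how such traces should be represented.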
One example occurs in trace `8339157054149764712`:
`ListProducts` does something and, in parallel, many `GetProduct` calls are made. Later, `ListProducts` comes back with another call that was made afterwards. Since we do not know in advance when this kind of additional data will arrive, it is hard to create Kieker traces from this.
The situation looks like this:
I see two options to handle this:
`MessageTrace` from them, potentially allowing most of the visualizations we had before. Potentially, we could build on `AbstractTraceEvent`. This would be more sustainable, but I do not know whether we need such a comprehensive rewrite (it would require rewriting significant parts of the analysis pipeline, as far as I can see).
@shinhyungyang Let's discuss during the Kieker meeting.