@aronchick
Created December 22, 2025 00:21
Testing examples for Bento Python Processor (PR #621)

Testing the WASM Python Processor

Example configs and testing guide for the Python processor (PR #621).

Prerequisites

Requirement   Version   Check Command
Go            1.24+     go version
wget          any       which wget

macOS: brew install wget if needed.

Quick Start

# From repo root:
cd /path/to/bento

# 1. Download the Python WASM runtime (~50MB)
bash internal/impl/python/scripts/install.sh

# 2. Run an example
go run -tags "x_wasm" ./cmd/bento/main.go -c internal/impl/python/examples/hello.yaml

Expected: "Hello, World!"

First run: ~30-60 seconds (Go compilation). Subsequent runs: ~1 second.

Examples

All examples use the generate input, so they run without piping any data to stdin.

File              Description
hello.yaml        Simple greeting
test1.yaml        JSON transformation
test2.yaml        Array processing (sum, count, double)
test3.yaml        Message filtering (active passes)
test3b.yaml       Message filtering (inactive → null)
test4.yaml        Standard library imports (math, json, re)
test5.yaml        E-commerce order transformation
test6.yaml        Error handling (KeyError demo)
perf.yaml         Cold start test (10 messages)
throughput.yaml   Throughput test (1000 messages)
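
To sanity-check what the examples should produce, the script bodies can also be run as ordinary Python outside Bento. The sketch below assumes the processor hands the script the parsed message as plain Python lists and dicts. The wrappers run_test2 and run_test3 are hypothetical names: the function argument stands in for this and the return value stands in for root.

```python
# Stand-ins for the processor: the argument plays the role of `this`,
# the return value plays the role of `root`. (Assumption: messages reach
# the script as plain Python lists/dicts.)

def run_test2(this):
    # Script body from test2.yaml
    return {
        "sum": sum(this),
        "count": len(this),
        "doubled": [x * 2 for x in this],
    }


def run_test3(this):
    # Script body shared by test3.yaml and test3b.yaml
    if this.get("status") == "active":
        return this
    return None  # the guide treats a None root as "filter this message out"


print(run_test2([1, 2, 3, 4, 5]))
print(run_test3({"status": "active", "id": 1}))
print(run_test3({"status": "inactive", "id": 2}))
```

Comparing these values with the stdout of test2.yaml, test3.yaml, and test3b.yaml gives a quick check that the processor behaves like plain Python for these scripts.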

Run All Examples

cd /path/to/bento

for f in internal/impl/python/examples/test{1..6}.yaml; do
  echo "=== $(basename "$f") ==="
  go run -tags "x_wasm" ./cmd/bento/main.go -c "$f" 2>&1 | head -5
done

Performance Tests

time go run -tags "x_wasm" ./cmd/bento/main.go -c internal/impl/python/examples/perf.yaml
time go run -tags "x_wasm" ./cmd/bento/main.go -c internal/impl/python/examples/throughput.yaml
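
Because the first go run includes Go compilation, a single timing mixes build time with processor time. One way to separate them is to run the same command twice and compare. The helper below is a minimal sketch: time_command is a hypothetical name, and the command shown is a placeholder so the snippet runs anywhere. Substitute the go run command from above as a list of arguments.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    proc = subprocess.run(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return proc.returncode, time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder command; replace with e.g.
    # ["go", "run", "-tags", "x_wasm", "./cmd/bento/main.go", "-c", "<config path>"]
    cmd = [sys.executable, "-c", "pass"]
    for attempt in range(2):
        code, seconds = time_command(cmd)
        print(f"run {attempt + 1}: exit={code} elapsed={seconds:.2f}s")
```

The second measurement is the more useful one for comparing perf.yaml against throughput.yaml, since the build is already cached by then.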

Build Tag

The -tags "x_wasm" flag is required. Without it, the processor panics.

# Fails (no build tag):
go run ./cmd/bento/main.go -c internal/impl/python/examples/hello.yaml

# Works:
go run -tags "x_wasm" ./cmd/bento/main.go -c internal/impl/python/examples/hello.yaml

Unit Tests

go test -tags "x_wasm" -v ./internal/impl/python/...

Troubleshooting

"no WASM runtime found"

Run bash internal/impl/python/scripts/install.sh first.

Errors not visible

Set logger.level: debug in config to see Python stderr.
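
The error that test6.yaml triggers is an ordinary Python KeyError, so it can be reproduced outside Bento. The sketch below uses two hypothetical helpers, strict and tolerant, with the argument standing in for this. How the processor reports the exception is not shown here, only the Python behaviour behind it.

```python
def strict(this):
    # Same lookup as test6.yaml: raises KeyError when the field is absent
    return {"value": this["required_field"]}


def tolerant(this, default=None):
    # Hypothetical alternative: fall back to a default instead of raising
    return {"value": this.get("required_field", default)}


message = {"wrong_field": 123}

try:
    strict(message)
except KeyError as err:
    print(f"KeyError: {err}")

print(tolerant(message))
```

Using dict.get with a default is one option when a missing field should not fail the message. The strict lookup is the better choice when a missing field is a real error.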

Command hangs

The config uses the stdin input, but nothing is being piped in. Switch to the generate input or pipe data into the command.

Python vs Bloblang

Use Python when:

  • You need complex string manipulation (regex, f-strings)
  • You need mathematical operations beyond basic arithmetic
  • Your team knows Python better than Bloblang

Use Bloblang when:

  • The work is simple field mapping
  • The processor sits on a performance-critical path
  • You need zero cold-start overhead
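
As an illustration of the string manipulation case, here is a minimal sketch in plain Python that combines a regex with an f-string. The function summarize is hypothetical, its argument stands in for this, and the "ORD-123" shape is borrowed from the order example in this gist. Inside the processor, re would presumably be listed under imports, as test4.yaml does.

```python
import re


def summarize(this):
    # Pull the numeric part out of an ID shaped like "ORD-123"
    match = re.fullmatch(r"ORD-(\d+)", this["order_id"])
    number = int(match.group(1)) if match else None
    return {
        "order_number": number,
        "label": f"Order #{number}" if number is not None else "Unknown order",
    }


print(summarize({"order_id": "ORD-123"}))
print(summarize({"order_id": "bad-id"}))
```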

# hello.yaml
input:
  generate:
    count: 1
    mapping: 'root = {"name": "World"}'
pipeline:
  processors:
    - python:
        script: |
          root = f"Hello, {this['name']}!"
output:
  stdout: {}
logger:
  level: error

# perf.yaml
input:
  generate:
    count: 10
    interval: ""
    mapping: 'root = {"n": count()}'
pipeline:
  processors:
    - python:
        script: |
          root = this["n"] * 2
output:
  drop: {}
logger:
  level: error

# test1.yaml
input:
  generate:
    count: 1
    mapping: 'root = {"name": "Bento"}'
pipeline:
  processors:
    - python:
        script: |
          root = {
              "greeting": f"Hello, {this['name']}!",
              "original": this
          }
output:
  stdout: {}
logger:
  level: error

# test2.yaml
input:
  generate:
    count: 1
    mapping: 'root = [1, 2, 3, 4, 5]'
pipeline:
  processors:
    - python:
        script: |
          root = {
              "sum": sum(this),
              "count": len(this),
              "doubled": [x * 2 for x in this]
          }
output:
  stdout: {}
logger:
  level: error

# test3.yaml
input:
  generate:
    count: 1
    mapping: 'root = {"status": "active", "id": 1}'
pipeline:
  processors:
    - python:
        script: |
          # Only keep messages with status "active"
          if this.get("status") == "active":
              root = this
          else:
              root = None  # Filters out the message
output:
  stdout: {}
logger:
  level: error

# test3b.yaml
input:
  generate:
    count: 1
    mapping: 'root = {"status": "inactive", "id": 2}'
pipeline:
  processors:
    - python:
        script: |
          # Only keep messages with status "active"
          if this.get("status") == "active":
              root = this
          else:
              root = None  # Filters out the message
output:
  stdout: {}
logger:
  level: error

# test4.yaml
input:
  generate:
    count: 1
    mapping: 'root = {"value": 3.7, "json_string": "{\"nested\": true}"}'
pipeline:
  processors:
    - python:
        imports: [math, json, re]
        script: |
          root = {
              "ceil": math.ceil(this["value"]),
              "floor": math.floor(this["value"]),
              "sqrt": math.sqrt(abs(this["value"])),
              "parsed_json": json.loads(this.get("json_string", "{}"))
          }
output:
  stdout: {}
logger:
  level: error

# test5.yaml
input:
  generate:
    count: 1
    mapping: |
      root = {
        "order_id": "ORD-123",
        "items": [
          {"name": "Widget", "price": 9.99, "quantity": 2},
          {"name": "Gadget", "price": 24.99, "quantity": 1, "discount": 5}
        ]
      }
pipeline:
  processors:
    - python:
        script: |
          # E-commerce order summary
          items = this.get("items", [])
          root = {
              "order_id": this["order_id"],
              "item_count": len(items),
              "total_price": sum(item["price"] * item["quantity"] for item in items),
              "item_names": [item["name"] for item in items],
              "has_discount": any(item.get("discount", 0) > 0 for item in items)
          }
output:
  stdout: {}
logger:
  level: error

# test6.yaml
input:
  generate:
    count: 1
    mapping: 'root = {"wrong_field": 123}'
pipeline:
  processors:
    - python:
        script: |
          # This will fail if "required_field" is missing
          value = this["required_field"]
          root = {"value": value}
output:
  stdout: {}
logger:
  level: debug

# throughput.yaml
input:
  generate:
    count: 1000
    interval: ""
    mapping: 'root = {"value": random_int(min: 1, max: 100)}'
pipeline:
  processors:
    - python:
        script: |
          root = {"doubled": this["value"] * 2}
output:
  drop: {}
logger:
  level: error