Session Date: 2026-02-13
Start/End: 20:20 - 21:49 GMT+7
Duration: ~90 min
Focus: Web server v3 upgrade attempt, fallback to v2, R2 firmware CDN, GNSS/GPS integration
Type: Feature + Infrastructure
Session Summary
Session #2 today (session #10 in the FloodBoy Oracle project). Continued from the PPPoS hybrid handoff (fa2f955). Attempted web_server v3 upgrade, discovered it crashes PPPoS on ESP32-C3, fell back to v2 with local: true. Built the R2 firmware CDN pipeline (bucket creation + worker deploy + upload). Integrated GNSS/GPS via modem CMUX with auto-enable on boot. Seven compile-flash-test cycles in 90 minutes.
compile -> md5 -> manifest.json -> PUT /fw/floodboy/sim7600e-c3/* -> OTA pull
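The md5 and manifest.json steps of the pipeline above might be sketched roughly as follows. This is an illustration only: the manifest field names (name, version, builds, chipFamily, ota.path, ota.md5) are an assumed layout, and the version string and file name are placeholders, since the session notes do not show the real manifest.

```python
import hashlib
import json


def build_manifest(firmware: bytes, version: str, ota_path: str) -> dict:
    """Build a manifest dict for one OTA firmware image.

    The field layout is an assumed shape, not the project's confirmed schema.
    """
    md5_hex = hashlib.md5(firmware).hexdigest()  # checksum the device can verify
    return {
        "name": "floodboy",  # assumed label
        "version": version,
        "builds": [
            {
                "chipFamily": "ESP32-C3",
                "ota": {"path": ota_path, "md5": md5_hex},
            }
        ],
    }


if __name__ == "__main__":
    image = b"\x00" * 16  # placeholder bytes standing in for the compiled .bin
    manifest = build_manifest(image, "2026.2.13", "firmware.ota.bin")
    print(json.dumps(manifest, indent=2))
```

The resulting JSON and the firmware binary would then be uploaded with the PUT step; that upload is not shown here because the worker's API is not described in these notes.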
Architecture Decisions
v2 over v3: v3 local assets (~77KB compressed) block main loop 300ms+ during decompression, starving PPPoS UART. v2 is lighter and safe. v3 is fine for WiFi-only devices.
R2 bucket for firmware CDN: Created the floodboy-fw bucket on Cloudflare R2 and bound it to the dustboy-health worker. PUT requires an API key; GET is public with a 60s cache.
GNSS via CMUX virtual UART: SIM7600E handles GPS internally. CMUX multiplexes NMEA data alongside PPPoS on the same physical UART. No extra GPIO pins needed.
120s HTTP timeout: Default timeout too short for 1.2MB over cellular PPPoS (~10KB/s). 120s gives comfortable margin.
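The timeout decision can be sanity-checked with a quick back-of-the-envelope calculation. The sketch below takes the session's approximate figures at face value and treats "MB" and "KB" as decimal units, which is an assumption; the function name is illustrative.

```python
def transfer_seconds(size_bytes: int, rate_bytes_per_s: float) -> float:
    """Time to move size_bytes at a steady rate, ignoring stalls and retries."""
    if rate_bytes_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_bytes / rate_bytes_per_s


FIRMWARE_BYTES = 1_200_000  # ~1.2MB, read as decimal megabytes (assumption)
LINK_RATE = 10_000          # ~10KB/s, read as decimal kilobytes (assumption)

estimate = transfer_seconds(FIRMWARE_BYTES, LINK_RATE)
print(f"estimated transfer: {estimate:.0f} s")  # prints "estimated transfer: 120 s"
```

By these rough figures the transfer alone takes about 120 s, so the actual headroom depends on the real throughput and on whether the timeout applies to the whole transfer or to individual network operations, which these notes do not specify.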
AI Diary
This session was a masterclass in "try, fail, adapt." I started with the plan to upgrade to web_server v3 with its nice HA-styled UI and entity grouping. First obstacle: the ESPHome docs had changed the entity sorting syntax and I used the old form (web_server_sorting_group instead of nested web_server: { sorting_group_id: }). Fixed that, but then hit a fundamental architectural wall — v3 loads its UI from a CDN, which doesn't work on an isolated WiFi AP. Added local: true to embed the assets, but that was the real killer: decompressing 77KB of web assets on the ESP32-C3 blocked the main loop long enough to corrupt the PPPoS UART stream. The modem couldn't re-enter PPP mode and spiraled into a crash loop with UART garbage.
The fallback to v2 was humbling but correct. Sometimes the simpler solution is the right one for constrained hardware. Then came an unexpected bonus: the R2 firmware CDN pipeline fell into place naturally — create bucket, deploy worker, upload files, OTA works. And the GPS integration through modem CMUX was surprisingly clean — no extra pins, no extra UART, just a virtual channel through the same multiplexed connection. Seven compile-flash-test cycles in 90 minutes. Each failure taught something concrete, each fix was verified on real hardware within minutes. The tight feedback loop between code, compile, flash, and observe is what makes embedded development both frustrating and deeply satisfying.
What Went Well
Fast iteration: 7 compile-flash cycles in 90 minutes
R2 CDN pipeline worked first try after bucket creation
GNSS integration through CMUX was clean — no hardware changes needed
MQTT telemetry continued flowing through all changes (except during the v3 crash)
Oracle learning captured immediately when the v3 crash was discovered
What Could Improve
Should have checked v3 asset size before attempting local embedding
Could have tested v3 on a WiFi-only device first to isolate the CDN vs local issue
The millis() ambiguity with TinyGPSPlus was a surprise — need a "known conflicts" list for GPS libraries
Blockers & Resolutions
| Blocker | Resolution | Time Lost |
| --- | --- | --- |
| v3 wrong syntax | Context7 docs lookup | ~3 min |
| v3 CDN blank page | Added local: true | ~2 min |
| v3 local PPPoS crash | Fell back to v2 | ~10 min |
| R2 bucket not found | wrangler r2 bucket create | ~2 min |
| OTA timeout -28679 | timeout: 120s | ~5 min |
| millis() ambiguity | esphome::millis() | ~3 min |
Honest Feedback
Three friction points from this session:
1. ESPHome's web_server v3 is a trap for constrained devices. The docs don't mention that local: true embeds ~77KB of compressed assets that block the main loop during decompression. On an ESP32-C3 with PPPoS, this is fatal. The error (pppos_input_tcpip failed with -1) gives no indication that the web server is the culprit. I only figured it out because the logger warning said "web_server took 12777ms for an operation." Without that breadcrumb, I would have spent much longer debugging.
2. The compile-flash cycle is still too manual. SCP to white.local, then SSH + esptool — it works but it's 4 commands every time. The /esphome-dev skill exists but I didn't use it because the tmux workflow felt like overkill for quick iterations. A one-liner flash alias would save 30 seconds per cycle x 7 = 3.5 minutes.
3. R2 bucket creation should be in the infrastructure-as-code. I had to discover the bucket didn't exist by trying to deploy the worker and seeing it fail. The wrangler.toml references floodboy-fw but doesn't create it. A pre-deploy check or Terraform would prevent this surprise.
Lessons Learned
v3 web_server + local: true is incompatible with PPPoS on ESP32-C3 — asset decompression blocks UART for 300ms+, killing PPPoS. Use v2.
R2 bucket must exist before wrangler deploy — worker fails with "bucket not found" if missing.
TinyGPSPlus defines its own millis() — use esphome::millis() to disambiguate.
GNSS on SIM7600E uses AT+CGPS (not AT+CGNSPWR like SIM7080) — model-specific commands.
OTA over cellular needs 120s+ timeout — default is too short for 1.2MB over PPPoS.
ESPHome's entity sorting syntax for web_server v3 changed: use web_server: { sorting_group_id: X }, not web_server_sorting_group: X.
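On the GNSS side, the NMEA sentences that arrive over the CMUX channel each carry a standard XOR checksum, which is a cheap way to spot corruption on a shared UART. The sketch below is a generic NMEA 0183 checksum check, not the project's code; the sample sentence is a commonly cited illustrative GGA line, not data captured in this session.

```python
from functools import reduce


def nmea_checksum_ok(sentence: str) -> bool:
    """Check the XOR checksum of one NMEA 0183 sentence like '$GPGGA,...*47'."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    # The checksum covers every character between '$' and '*', exclusive.
    body, _, given = sentence[1:].partition("*")
    computed = reduce(lambda acc, ch: acc ^ ord(ch), body, 0)
    try:
        return computed == int(given[:2], 16)
    except ValueError:
        return False


SAMPLE = "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47"
print(nmea_checksum_ok(SAMPLE))                            # True
print(nmea_checksum_ok(SAMPLE.replace("545.4", "545.5")))  # False
```

A parser such as TinyGPSPlus performs this kind of validation internally, so the function is mainly useful for inspecting raw logs.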