Date: February 8, 2026
Runtime: Bun 1.3.9 (darwin arm64)
Method: Custom stress test suite using Bun's built-in --cpu-prof-md and --heap-prof-md profiling tools
We profiled the SSE (Server-Sent Events) pipeline end-to-end across 5 dimensions to identify bottlenecks and validate the architecture under load. The in-process pipeline is exceptionally well-optimized — capable of sustaining 1.9M events/sec with zero memory leaks. We identified 3 concrete improvements in the I/O layer that would improve reconnection latency and reduce redundant work during fan-out.
```
Publish path:
  eventsService.publish()
    → PostgreSQL storage + Redis PUBLISH
    → Fan-out: publishToUserChannel() → Promise.all(users.map(redis.publish))

Client connection:
  → getTopicObservable(topic)              # 1 Redis subscription per topic
  → formatSSEMessage()                     # Format ONCE before share()
  → share({ resetOnRefCountZero: true })   # Fan to N subscribers
  → merge(eventStream$, heartbeat$)
  → createObservableSSEResponse()          # ReadableStream for HTTP
```
Key architectural wins already in place (sketched below):

- Format-before-share: `formatSSEMessage` runs once per event, not once per subscriber
- Shared heartbeat: a single `interval(2000)` timer serves all connections (not N timers)
- Automatic cleanup: `share({ resetOnRefCountZero: true })` prevents observable leaks
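A minimal sketch of how these three pieces fit together, using RxJS 7. The stream and formatter names below are illustrative stand-ins, not the real module's exports:

```ts
import { Subject, interval, map, merge, share } from "rxjs";

// Illustrative stand-in for the per-topic Redis message stream.
const redisMessages$ = new Subject<{ id: string; data: unknown }>();

// Illustrative formatter; the real one lives in the SSE utilities.
const formatSSEMessage = (event: { id: string; data: unknown }): string =>
  `id: ${event.id}\ndata: ${JSON.stringify(event.data)}\n\n`;

// Format-before-share: serialize once per event, upstream of share(), so every
// subscriber receives the already-built frame instead of re-formatting it.
const eventStream$ = redisMessages$.pipe(
  map(formatSSEMessage),
  share({ resetOnRefCountZero: true }), // tear down the upstream subscription when the last client leaves
);

// Shared heartbeat: one interval(2000) timer for all connections, not N timers.
const heartbeat$ = interval(2000).pipe(
  map(() => ":heartbeat\n\n"),
  share({ resetOnRefCountZero: true }),
);

// Each HTTP connection subscribes to this merged stream via a ReadableStream.
export const sseFrames$ = merge(eventStream$, heartbeat$);
```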
This benchmark measures event delivery through the RxJS observable pipeline using a mock Redis client, isolating the in-process cost.
| Subscribers | Events/sec (per sub) | Total Deliveries/sec | Latency/event |
|---|---|---|---|
| 1 | 1,894,388 | 1,894,388 | 0.53 us |
| 10 | 1,767,696 | 17,676,964 | 0.57 us |
| 100 | 963,121 | 96,312,113 | 1.04 us |
| 1,000 | 63,629 | 63,629,422 | 15.72 us |
Finding: Per-subscriber throughput holds steady up to 100 subscribers, confirming the format-before-share optimization works. At 1,000 subscribers, per-event cost rises to ~16us — still far below the ~1ms Redis RTT that dominates in production.
This benchmark simulates the `publishToUserChannel` pattern: one event published to N user channels.
| Users | Avg Time/Event | Events/sec |
|---|---|---|
| 1 | 0.002 ms | 518,360 |
| 10 | 0.003 ms | 420,389 |
| 50 | 0.007 ms | 147,956 |
| 100 | 0.013 ms | 77,029 |
| 500 | 0.057 ms | 17,620 |
| 1,000 | 0.103 ms | 9,744 |
Finding: Fan-out scales linearly with user count. At 500 users, each event takes ~57us in-process. With real Redis (~0.5ms RTT per PUBLISH), the network round trips become the dominant cost; batching the publishes into a Redis pipeline would cut that to a single round trip.
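A hedged sketch of what pipelined fan-out could look like, assuming an ioredis-style client; the channel naming is illustrative and the actual client wiring in the codebase may differ:

```ts
import Redis from "ioredis";

const redisPublisher = new Redis();

// Batch all per-user PUBLISH commands into one pipeline so the fan-out costs a
// single network round trip instead of one RTT per user.
async function fanOutToUsers(userIds: string[], message: string): Promise<void> {
  const pipeline = redisPublisher.pipeline();
  for (const userId of userIds) {
    pipeline.publish(`user:${userId}:events`, message); // illustrative channel name
  }
  await pipeline.exec();
}
```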
This benchmark measures `SSEConnectionRegistry` register/unregister throughput (5 Redis commands per register, 4 per unregister).
| Operation | Throughput |
|---|---|
| Sequential register | 349,508 ops/sec |
| Sequential unregister | 461,823 ops/sec |
| Concurrent register (100) | 373,948 ops/sec |
| Churn cycle (register + unregister) | 288,781 cycles/sec |
Finding: Connection management is not a bottleneck. Even with 5 pipelined Redis commands per registration, throughput exceeds 280K ops/sec.
This benchmark tests the RxJS observable lifecycle for memory leaks under sustained load.
| Scenario | Heap Growth |
|---|---|
| 500 subscribe/unsubscribe cycles | 0.50 MB |
| 100 subscribers x 10,000 events | 0.06 MB |
Finding: No memory leaks. share({ resetOnRefCountZero: true }) correctly cleans up Redis subscriptions when the last subscriber disconnects. Heap growth is negligible even after 500 full lifecycle cycles.
This benchmark measures `JSON.stringify` plus SSE framing cost at different payload sizes.
| Payload Size | Events/sec (10K batch) | Avg/event |
|---|---|---|
| Small (100B) | 4,574,128 | 0.22 us |
| Medium (1KB) | 2,688,082 | 0.37 us |
| Large (10KB) | 580,663 | 1.72 us |
Finding: JSON.stringify is the cost, not SSE framing. At 10KB payloads, throughput drops ~8x. For large payloads, consider pre-serializing at the publish site to avoid redundant stringify calls.
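As a rough illustration of the pre-serialization idea, the framing step can accept an already-serialized string so a large payload is stringified exactly once at the publish site. The function and field names here are illustrative, not the actual `formatSSEMessage` signature:

```ts
// Frame an SSE message from a payload that was serialized once upstream.
function frameSSE(id: string, eventName: string, serializedData: string): string {
  return `id: ${id}\nevent: ${eventName}\ndata: ${serializedData}\n\n`;
}

// At the publish site: stringify the large payload once...
const serialized = JSON.stringify({ kind: "report", rows: new Array(1000).fill(0) });

// ...and reuse the same string for storage, Redis PUBLISH, and SSE framing.
const frame = frameSSE("42", "report.updated", serialized);
```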
File: apps/api/src/redis/events.service.ts:722-727
```ts
// Current — no LIMIT clause
const events = await db
  .select()
  .from(schema.events)
  .where(and(...conditions))
  .orderBy(schema.events.pk)
```

If a client reconnects with a stale `lastEventId`, this query returns every event since that ID with no upper bound. A client offline for hours could trigger a query returning thousands of rows, causing latency spikes on reconnect.
Fix: Add `.limit(100)` (or a configurable cap). Clients that miss more than 100 events should full-refresh via the `/hydrate` endpoint anyway.
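A sketch of the capped query, reusing the identifiers from the snippet above; `REPLAY_LIMIT` is a hypothetical constant standing in for whatever configuration mechanism the team prefers:

```ts
// Hypothetical cap; 100 matches the suggested default above.
const REPLAY_LIMIT = 100;

const events = await db
  .select()
  .from(schema.events)
  .where(and(...conditions))
  .orderBy(schema.events.pk)
  .limit(REPLAY_LIMIT); // bound the replay; clients further behind should re-hydrate
```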
File: apps/api/src/redis/events.service.ts:480-491
```ts
// Current — stringify happens inside the loop (once per user)
usersWithAccess.flatMap((userId: string) => {
  const message = JSON.stringify(eventData) // ← repeated N times
  return [redisPublisher.publish(userChannel, message), ...]
})
```

`JSON.stringify(eventData)` is called once per user inside the `flatMap`. The payload is identical for all users. Hoisting it above the loop eliminates N-1 redundant serializations.

Fix: Move `const message = JSON.stringify(eventData)` above the `flatMap`.
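A sketch of the hoisted version, reusing the identifiers from the snippet above (the derivation of `userChannel` and the elided array entries are unchanged and not shown here):

```ts
// Fixed — stringify once, publish to every user channel with the same string.
const message = JSON.stringify(eventData)
const publishPromises = usersWithAccess.flatMap((userId: string) => {
  // userChannel is derived from userId exactly as in the original (elided above)
  return [redisPublisher.publish(userChannel, message)]
})
await Promise.all(publishPromises)
```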
File: apps/api/src/utils/sse.manager.ts:259-264
```ts
// Current — sequential database queries
for (const topicItem of topics) {
  const recentEvents = await eventsService.getRecentEvents(topicItem, lastEventId)
  ...
}
```

When a client subscribes to multiple topics, the replay queries execute sequentially. These are independent database queries that could run in parallel.

Fix: Use `Promise.all` to parallelize the replay queries.
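A sketch of the parallel version, reusing `eventsService.getRecentEvents` from the snippet above; the per-topic handling inside the loop stays the same:

```ts
// Fixed — issue all replay queries at once, then process results in topic order.
const replayResults = await Promise.all(
  topics.map((topicItem) => eventsService.getRecentEvents(topicItem, lastEventId)),
)
for (const recentEvents of replayResults) {
  // ...same per-topic handling as the sequential version
}
```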
Bun 1.3.7+ includes built-in profiling that outputs human-readable markdown:
```bash
# CPU profile — hot functions, call tree, file breakdown
bun --cpu-prof-md apps/api/src/utils/sse.stress-test.ts

# Heap profile — top types by retained size, largest objects
bun --heap-prof-md apps/api/src/utils/sse.stress-test.ts

# Programmatic profiling (node:inspector/promises API)
# Outputs Chrome DevTools-compatible .cpuprofile files
```

The stress test file runs as both a bun:test suite (CI assertions) and a standalone script (profiling). Key discovery: `import.meta.main` is true in both `bun test` and `bun run`, so we use a try/catch around `describe()`, which throws synchronously outside the test runner, to detect the mode.
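A minimal sketch of that mode-detection trick, assuming (as noted above) that `describe()` throws when called outside the test runner; the suite body and the `runStressSuite` helper are illustrative:

```ts
import { describe, test, expect } from "bun:test";

// Illustrative stand-in for the actual benchmark entry point.
async function runStressSuite(): Promise<{ eventsPerSec: number }> {
  return { eventsPerSec: 1_900_000 };
}

let underTestRunner = true;
try {
  // Under `bun test` this registers the suite; under `bun run` the test runner
  // is not active and describe() throws synchronously, so we fall through.
  describe("SSE stress tests", () => {
    test("pipeline sustains target throughput", async () => {
      const { eventsPerSec } = await runStressSuite();
      expect(eventsPerSec).toBeGreaterThan(1_000_000);
    });
  });
} catch {
  underTestRunner = false;
}

if (!underTestRunner) {
  // Profiling mode: run the same workload as a plain script so
  // --cpu-prof-md / --heap-prof-md can capture it.
  await runStressSuite();
}
```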
```bash
# CI mode (assertions)
cd apps/api && bun test ./src/utils/sse.stress-test.ts

# Profiling mode (markdown output)
bun --cpu-prof-md apps/api/src/utils/sse.stress-test.ts
```

The SSE pipeline architecture is sound. The format-before-share pattern, the shared heartbeat, and automatic cleanup via RxJS `share()` are all working correctly and performing well. The three improvements identified are in the I/O layer: an unbounded database query, a redundant serialization in the hot path, and serial queries that could run in parallel. None are critical, but the replay query limit (#1) is worth prioritizing as a safety measure against reconnection storms.