Analysis: Durable Streams as a LiveStore sync provider - challenges and limitations

Durable Streams Sync Provider - Research

Overview

This document analyzes the feasibility of creating a LiveStore sync provider targeting Durable Streams, an open HTTP-based protocol for real-time sync developed by Electric.

Durable Streams Architecture

Core Concepts

  • Append-only streams: Ordered, replayable data logs addressed by URL
  • Offset-based resumption: Opaque, monotonic offsets for client-side progress tracking
  • HTTP-native: Works with standard HTTP semantics, CDN-cacheable for historical reads
  • Two framing modes:
    • Byte stream (default): Raw concatenated bytes, app handles framing
    • JSON mode: Server guarantees message boundaries via contentType: "application/json"

API Operations

| Operation | Endpoint | Purpose |
| --- | --- | --- |
| Create | PUT /stream/{path} | Initialize stream with content type |
| Append | POST /stream/{path} | Add data to stream |
| Read | GET /stream/{path}?offset=X | Historical catch-up |
| Live poll | GET /stream/{path}?offset=X&live=long-poll | Wait for new data |
| Live SSE | GET /stream/{path}?offset=X&live=sse | Server-Sent Events |
| Delete | DELETE /stream/{path} | Remove stream |
| Metadata | HEAD /stream/{path} | Query stream info |
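
A minimal sketch of the create and append operations using fetch. The base URL is a placeholder, and exactly how the content type is declared on create is an assumption; only the methods and paths come from the table above.

// Hypothetical base URL; methods and paths follow the table above.
const BASE = 'https://streams.example.com'

// Create a JSON-framed stream. How the content type is declared on create is
// assumed here (via the Content-Type header).
const createStream = (path: string) =>
  fetch(`${BASE}/stream/${path}`, {
    method: 'PUT',
    headers: { 'Content-Type': 'application/json' },
  })

// Append one JSON message to an existing stream.
const appendMessage = (path: string, message: unknown) =>
  fetch(`${BASE}/stream/${path}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(message),
  })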

Key Response Headers

  • Stream-Next-Offset: Opaque token for resumption (exactly-once delivery)
  • Cache-Control: CDN caching for historical reads
  • ETag: Request collapse for identical offsets
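
As a sketch of how Stream-Next-Offset could drive catch-up reads, the loop below re-requests the stream from the last returned offset until it stops advancing. The response body shape (a JSON array of messages in JSON mode) is an assumption.

// Same placeholder base URL as in the sketch above.
const BASE = 'https://streams.example.com'

// Catch up from `startOffset`, following Stream-Next-Offset between requests.
const readFrom = async (path: string, startOffset?: string) => {
  const messages: unknown[] = []
  let offset = startOffset
  for (;;) {
    const url = new URL(`${BASE}/stream/${path}`)
    if (offset !== undefined) url.searchParams.set('offset', offset)
    const res = await fetch(url)
    const next = res.headers.get('Stream-Next-Offset')
    messages.push(...(await res.json())) // assumed: JSON mode returns an array of messages
    if (next === null || next === offset) break // no further progress -> caught up
    offset = next
  }
  return { messages, offset }
}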

LiveStore SyncBackend Contract

interface SyncBackend<TSyncMetadata> {
  connect: Effect.Effect<void, IsOfflineError | UnknownError, Scope.Scope>

  pull: (
    cursor: Option<{ eventSequenceNumber: Global.Type; metadata: Option<TSyncMetadata> }>,
    options?: { live?: boolean }
  ) => Stream.Stream<PullResItem<TSyncMetadata>, IsOfflineError | InvalidPullError>

  push: (
    batch: ReadonlyArray<LiveStoreEvent.Global.Encoded>
  ) => Effect.Effect<void, IsOfflineError | InvalidPushError>

  ping: Effect.Effect<void, IsOfflineError | UnknownError | TimeoutException>
  isConnected: SubscriptionRef<boolean>
  metadata: { name: string; description: string }
  supports: { pullPageInfoKnown: boolean; pullLive: boolean }
}

LiveStore Event Structure

interface LiveStoreEvent.Global.Encoded {
  name: string                    // Event type
  args: any                       // Payload
  seqNum: Global.Type             // Global sequence number (0, 1, 2, ...)
  parentSeqNum: Global.Type       // Parent event reference
  clientId: string                // Originating client
  sessionId: string               // Session identifier
}
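
For concreteness, an illustrative encoded event with invented values:

// Illustrative only - every value below is invented for the example.
const exampleEvent = {
  name: 'todoCreated',
  args: { id: 'todo-1', text: 'Write sync provider' },
  seqNum: 42,
  parentSeqNum: 41,
  clientId: 'client-abc',
  sessionId: 'session-xyz',
}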

Compatibility Analysis

Good Matches

  1. Append-only model: Both systems are fundamentally append-only logs
  2. Offset-based resumption: Durable Streams' offset model maps well to LiveStore's cursor concept
  3. Live streaming: SSE support maps directly to LiveStore's live pull mode
  4. HTTP transport: Simple, well-understood transport layer
  5. CDN friendliness: Historical reads can leverage CDN caching

Challenges & Mismatches

1. Sequence Number Decoupling (Medium Complexity)

Issue: Durable Streams uses opaque offsets; LiveStore expects integer sequence numbers.

Solution: As with the S2 sync provider, maintain independent sequence tracking:

  • Store LiveStore seqNum inside each event payload (JSON body)
  • Use Durable Streams offset for cursor positioning via SyncMetadata
  • Never assume a 1:1 correspondence between offsets and sequence numbers

type SyncMetadata = { offset: string }  // Durable Streams offset
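
A sketch of this mapping on the pull side, assuming each appended message is one encoded LiveStore event and the chunk's Stream-Next-Offset is attached as cursor metadata:

// Assumed shape: one Durable Streams message per encoded LiveStore event.
type EncodedEvent = {
  name: string
  args: unknown
  seqNum: number
  parentSeqNum: number
  clientId: string
  sessionId: string
}

type SyncMetadata = { offset: string } // Durable Streams offset, as above

// Pair each decoded event with the offset at which the containing chunk ends,
// so the next pull can resume from metadata.offset instead of from seqNum.
const toPullItems = (events: EncodedEvent[], nextOffset: string) =>
  events.map((event) => ({ event, metadata: { offset: nextOffset } as SyncMetadata }))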

2. No Server-Side Sequence Assignment (High Impact)

Issue: LiveStore expects the sync backend to assign/validate sequence numbers. Durable Streams just appends data - it doesn't know about LiveStore's event schema.

Implications:

  • Need a proxy/adapter layer (like S2's "API proxy") to:
    • Validate incoming events against expected sequence
    • Reject pushes that would create gaps or duplicates
    • Track the current head sequence number server-side

Without a proxy: pure client-to-Durable-Streams sync would lose sequence validation, potentially allowing:

  • Duplicate events (same seqNum pushed twice)
  • Out-of-order events
  • Gap creation if client crashes mid-batch

3. Multi-Writer Coordination (Critical)

Issue: Durable Streams documentation doesn't address multi-writer scenarios. It's fundamentally a single-writer log.

Problem for LiveStore:

  • Multiple clients pushing events simultaneously
  • parentSeqNum of first event in batch must match backend head
  • Without coordination, concurrent pushes will conflict

Solutions (each has tradeoffs):

| Approach | Pros | Cons |
| --- | --- | --- |
| Single-writer per stream | Simple, no conflicts | Limits concurrency, single point of failure |
| Proxy with optimistic concurrency | Preserves multi-writer | Requires proxy layer, adds latency |
| Client-side retry on conflict | Works with vanilla DS | Retry storms under load |
| Fencing token per writer | Prevents conflicts | Adds complexity, still needs proxy |

Recommended: Proxy layer with optimistic concurrency control:

Client → Proxy → validates seqNum → Durable Streams
                 ↓ on conflict
              return ServerAheadError
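
A rough sketch of that push path. The head-tracking store, the append helper, and the error type are stand-ins, not a real API:

// Hypothetical head store and append helper standing in for real storage and
// the Durable Streams append call.
interface HeadStore {
  get: (streamPath: string) => Promise<number> // current head seqNum
  compareAndSet: (streamPath: string, expected: number, next: number) => Promise<boolean>
}

class ServerAheadError extends Error {}

const handlePush = async (
  streamPath: string,
  batch: { seqNum: number; parentSeqNum: number }[],
  heads: HeadStore,
  appendToStream: (path: string, events: unknown[]) => Promise<void>,
) => {
  const head = await heads.get(streamPath)
  if (batch[0].parentSeqNum !== head) throw new ServerAheadError()

  // Reserve the new head; if a concurrent writer won the race, surface the
  // conflict so the client can pull, rebase, and retry. (A real proxy would
  // also handle failures between the reservation and the append.)
  const newHead = batch[batch.length - 1].seqNum
  const reserved = await heads.compareAndSet(streamPath, head, newHead)
  if (!reserved) throw new ServerAheadError()

  await appendToStream(streamPath, batch)
}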

4. Batch Push Validation (Medium)

Issue: LiveStore requires batch validation (1-100 events, ascending seqNums, and the first event's parentSeqNum matching the backend head).

Challenge: Durable Streams just appends raw data - no validation layer.

Solution: Proxy must implement:

// Proxy validation logic (sketch): enforces the constraints listed above.
type PushEvent = { seqNum: number; parentSeqNum: number }

const validateBatch = (batch: PushEvent[], currentHead: number) => {
  // Batches must contain 1-100 events
  if (batch.length < 1 || batch.length > 100) {
    throw new InvalidPushError('Batch must contain 1-100 events')
  }
  // The first event must build on the current backend head
  if (batch[0].parentSeqNum !== currentHead) {
    throw new ServerAheadError()
  }
  // Sequence numbers must be contiguous and ascending
  for (let i = 1; i < batch.length; i++) {
    if (batch[i].seqNum !== batch[i - 1].seqNum + 1) {
      throw new InvalidPushError('Non-contiguous sequence')
    }
  }
}

5. Offset Opacity (Low Impact)

Issue: Offsets are opaque strings, not integers, so the provider cannot calculate a "remaining count" for pullPageInfoKnown.

Impact: supports.pullPageInfoKnown = false - the same as the current S2 provider.
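
In the provider's supports declaration this would look like the following (pullLive: true assumes the SSE or long-poll live mode is wired up):

// Page info cannot be derived from opaque offsets; live pulls come from SSE/long-poll.
const supports = { pullPageInfoKnown: false, pullLive: true }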

6. No Backend ID Concept (Medium)

Issue: LiveStore uses backendId for detecting backend resets/migrations.

Solution Options:

  • Use stream path as implicit ID
  • Embed a UUID in the first event as a "stream header" (sketched below)
  • Use HEAD request metadata to derive ID
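
A sketch of the stream-header option flagged above; the header record shape and the convention of writing it as the first message are assumptions:

// Hypothetical header record written as the very first message of a new stream.
type StreamHeader = { kind: 'header'; backendId: string; createdAt: string }

const makeStreamHeader = (): StreamHeader => ({
  kind: 'header',
  backendId: crypto.randomUUID(),
  createdAt: new Date().toISOString(),
})

// On connect, inspect the first message: a matching header yields the backendId,
// and a changed backendId signals a backend reset or migration.
const backendIdFromFirstMessage = (first: unknown): string | undefined =>
  typeof first === 'object' && first !== null && (first as StreamHeader).kind === 'header'
    ? (first as StreamHeader).backendId
    : undefined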

7. Content-Type Constraints (Low)

Issue: Must use contentType: "application/json" for proper message framing.

Impact: Minimal - LiveStore events are JSON anyway.

Architecture Options

Option A: Vanilla Client (Limited)

Client ─────HTTP/SSE────→ Durable Streams

Pros: Simple, no infrastructure
Cons: No sequence validation, single-writer only, no conflict detection

Suitable for: Single-user apps, demo/prototype scenarios

Option B: Proxy Layer (Recommended)

Client ─────HTTP────→ API Proxy ─────→ Durable Streams
                         │
                         └─ Validates sequences
                         └─ Handles conflicts
                         └─ Tracks head position

Pros: Full LiveStore semantics, multi-writer support
Cons: Requires deploying/hosting proxy, adds latency

Implementation: Similar architecture to @livestore/sync-s2

Option C: Hybrid (Advanced)

Reads:  Client ──SSE──→ Durable Streams (direct)
Writes: Client ──HTTP──→ Proxy ──→ Durable Streams

Pros: Optimized reads via CDN, validated writes
Cons: Complexity, potential consistency window
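
A minimal sketch of the direct-read half, consuming the SSE live mode with fetch and a deliberately simplified parser (it assumes one data: line per SSE event and ignores other SSE fields):

// Same placeholder base URL as in the earlier sketches.
const BASE = 'https://streams.example.com'

// Follow a stream live via SSE, invoking onMessage for each data payload.
const followLive = async (path: string, offset: string, onMessage: (data: string) => void) => {
  const res = await fetch(`${BASE}/stream/${path}?offset=${encodeURIComponent(offset)}&live=sse`)
  const reader = res.body!.getReader()
  const decoder = new TextDecoder()
  let buffer = ''
  for (;;) {
    const { value, done } = await reader.read()
    if (done) break
    buffer += decoder.decode(value, { stream: true })
    // SSE events are separated by a blank line.
    let boundary = buffer.indexOf('\n\n')
    while (boundary !== -1) {
      const rawEvent = buffer.slice(0, boundary)
      buffer = buffer.slice(boundary + 2)
      for (const line of rawEvent.split('\n')) {
        if (line.startsWith('data:')) onMessage(line.slice(5).trim())
      }
      boundary = buffer.indexOf('\n\n')
    }
  }
}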

Implementation Estimate

With Proxy Layer (Option B)

| Component | Effort | Notes |
| --- | --- | --- |
| Client sync provider | 2-3 days | Similar to S2 provider |
| API proxy server | 3-5 days | Sequence validation, conflict handling |
| E2E tests | 2-3 days | Including multi-writer scenarios |
| Documentation | 1 day | Usage guide, deployment |

Total: ~2 weeks

Without Proxy (Option A)

| Component | Effort | Notes |
| --- | --- | --- |
| Client sync provider | 2-3 days | Direct DS client |
| Tests | 1-2 days | Single-writer only |

Total: ~1 week, but with significant limitations

Recommendations

If targeting production multi-user apps:

  1. Implement Option B with a proxy layer
  2. Proxy can be deployed as:
    • Cloudflare Worker
    • Edge function (Vercel/Netlify)
    • Traditional server
  3. Use the S2 sync provider as a reference implementation

If targeting single-user or prototype apps:

  1. Option A is viable
  2. Document limitations clearly
  3. Consider upgrade path to Option B

Questions to Clarify

Before implementation:

  1. Who hosts the proxy? Self-hosted vs. managed service?
  2. Durable Streams hosting? Self-hosted vs. Electric's hosted offering?
  3. Authentication model? How do clients authenticate to DS/proxy?
  4. Single vs multi-writer? Is multi-writer support required?
  5. Existing Electric customers? Are they migrating from Electric's Postgres sync?
