@kevinmichaelchen
Created February 5, 2026 14:47
Blob Sync Engine: A Unified Sync Layer for Structured Data and Media - RFC/Proposal

Blob Sync Engine: A Unified Sync Layer for Structured Data and Media

A proposal for an open-source sync engine that treats blobs (images, audio, video) as first-class citizens alongside structured data.

The Problem

Modern local-first applications need to sync two fundamentally different types of data:

| Type | Examples | Characteristics |
| --- | --- | --- |
| Structured Data | User records, metadata, relationships | Small, frequent updates, conflict-prone |
| Blobs | Photos, audio, video, documents | Large, immutable, bandwidth-intensive |

The ecosystem has mature solutions for structured data sync (Electric SQL, PowerSync, Replicache, LiveStore), but blob handling remains fragmented:

  • Manual presigned URL flows
  • Roll-your-own offline queues
  • No unified caching strategy
  • No resumable uploads out of the box
  • Cache quota management is an afterthought

Developers end up building custom solutions that don't compose well with their data sync layer.

The Vision

A single abstraction that handles both structured data and blobs with:

  • Pluggable backends - Swap storage providers without app changes
  • Offline-first - Queue mutations and uploads when disconnected
  • Smart caching - LRU eviction, quota management, prefetching
  • Resumable uploads - Large files survive network interruptions
  • Reactive subscriptions - Subscribe to data and blob URLs together

Proposed API

Configuration

import { createSyncEngine } from 'blob-sync-engine'

const sync = createSyncEngine({
  // Structured data backend
  data: {
    provider: 'electric-sql', // or 'powersync', 'replicache', 'livestore'
    config: {
      url: 'https://api.electric-sql.com',
      // provider-specific options
    },
  },

  // Blob storage backend
  blobs: {
    provider: 'cloudflare-r2', // or 's3', 'supabase-storage', 'gcs'
    config: {
      accountId: '...',
      bucket: 'media',
      // Presigned URL endpoint (your server)
      presignEndpoint: '/api/media/presign',
    },
  },

  // Client-side cache configuration
  cache: {
    backend: 'opfs', // or 'indexeddb', 'hybrid'
    maxSize: '500MB',
    eviction: 'lru', // or 'fifo', 'manual'
    persist: true, // Request persistent storage to avoid eviction
    prefetch: {
      enabled: true,
      strategy: 'visible', // Prefetch blobs for visible items
    },
  },

  // Offline behavior
  offline: {
    queueMutations: true,
    queueUploads: true,
    resumableUploads: true, // Use tus protocol (needs a tus-capable upload endpoint)
    maxQueueSize: '1GB',
    conflictResolution: 'last-write-wins', // or 'manual', 'server-wins'
    retryStrategy: {
      maxAttempts: 5,
      backoff: 'exponential',
    },
  },

  // Schema definitions (similar to Electric/Drizzle)
  schema: {
    photos: {
      id: 'uuid',
      title: 'string',
      takenAt: 'datetime',
      familyId: 'uuid',
      // Blob references are first-class
      image: { type: 'blob', mimeTypes: ['image/*'] },
      audio: { type: 'blob', mimeTypes: ['audio/*'], optional: true },
    },
  },
})
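
The configuration expresses limits as strings such as '500MB' and '1GB', so the engine would need to turn them into byte counts. A minimal sketch of that step follows. The helper name parseSize is hypothetical, and decimal units (1 MB = 1,000,000 bytes) are an assumption, chosen because the cache stats example in this proposal labels 245_000_000 bytes as 245 MB.

```typescript
// Hypothetical helper, not part of any existing library.
// Assumption: decimal units, so 'MB' means 1_000_000 bytes.
const UNIT_BYTES: Record<string, number> = {
  B: 1,
  KB: 1_000,
  MB: 1_000_000,
  GB: 1_000_000_000,
  TB: 1_000_000_000_000,
}

function parseSize(input: string | number): number {
  // Plain numbers are treated as a byte count.
  if (typeof input === 'number') {
    if (!Number.isFinite(input) || input < 0) {
      throw new RangeError(`Invalid size: ${input}`)
    }
    return Math.floor(input)
  }

  // Accepts forms like '500MB', '1.5 GB', or '512kb'.
  const match = /^(\d+(?:\.\d+)?)\s*(B|KB|MB|GB|TB)$/i.exec(input.trim())
  if (!match) {
    throw new SyntaxError(`Unrecognized size string: "${input}"`)
  }

  const value = Number(match[1])
  const unit = String(match[2]).toUpperCase()
  return Math.round(value * (UNIT_BYTES[unit] ?? 1))
}
```

Rejecting unknown strings at configuration time, rather than silently falling back to a default, keeps a typo like '500M' from turning into an unbounded cache.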

Mutations with Blobs

// Insert with blob - works offline
const photo = await sync.photos.insert({
  title: 'Beach Day',
  takenAt: new Date(),
  familyId: '...',
  image: imageFile, // File or Blob
  audio: audioFile, // Optional
})

// The engine:
// 1. Immediately stores blob in OPFS cache
// 2. Queues upload to R2 (or uploads immediately if online)
// 3. Creates structured data record with blob reference
// 4. Syncs metadata via Electric SQL
// 5. Returns optimistic result immediately

// Update blob
await sync.photos.update(photo.id, {
  audio: newAudioFile, // Replace audio
})

// Delete (handles both data and blob cleanup)
await sync.photos.delete(photo.id)
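
The five steps listed in the insert example can be sketched with in-memory stand-ins. Everything here is hypothetical: LocalEngineSketch, BlobRef, and QueuedOp are illustrative names, the Maps stand in for OPFS and the persisted queue, and the method is synchronous only to keep the sketch short (the proposed insert is awaited).

```typescript
// Hypothetical types; none of these names come from an existing library.
type BlobRef = { blobId: string; size: number; status: 'pending' | 'uploaded' }
type LocalRecord = {
  id: string
  fields: Record<string, unknown>
  blobs: Record<string, BlobRef>
}
type QueuedOp =
  | { type: 'upload'; blobId: string }
  | { type: 'insert'; table: string; recordId: string }

class LocalEngineSketch {
  readonly blobCache = new Map<string, Uint8Array>() // stand-in for OPFS
  readonly records = new Map<string, LocalRecord>() // stand-in for the local store
  readonly queue: QueuedOp[] = [] // stand-in for the persisted queue
  private nextId = 1

  insert(
    table: string,
    fields: Record<string, unknown>,
    blobs: Record<string, Uint8Array>,
  ): LocalRecord {
    const recordId = `local-${this.nextId++}`
    const refs: Record<string, BlobRef> = {}

    for (const column of Object.keys(blobs)) {
      const bytes = blobs[column]
      if (!bytes) continue
      const blobId = `${recordId}/${column}`
      this.blobCache.set(blobId, bytes) // 1. store the blob locally first
      this.queue.push({ type: 'upload', blobId }) // 2. queue the upload
      refs[column] = { blobId, size: bytes.byteLength, status: 'pending' }
    }

    // 3. create the structured record that points at the blob references
    const record: LocalRecord = { id: recordId, fields, blobs: refs }
    this.records.set(recordId, record)

    // 4. queue the metadata sync after the uploads it depends on
    this.queue.push({ type: 'insert', table, recordId })

    // 5. return the optimistic result without waiting for the network
    return record
  }
}
```

Queueing uploads ahead of the insert is one possible ordering; it means the server never sees a record whose blob has not been sent, at the cost of delaying metadata sync behind large files.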

Reactive Subscriptions

// Subscribe to data + blob URLs together
function PhotoGrid({ familyId }) {
  const { data, blobUrls, isLoading, error } = useSync(
    sync.photos.where({ familyId }),
    {
      // Automatically resolve blob URLs
      includeBlobs: ['image', 'audio'],
      // Keep URLs fresh (re-sign before expiry)
      refreshUrls: true,
    }
  )

  // Guard the states in which data and blobUrls are not yet usable
  if (isLoading) return <p>Loading…</p>
  if (error) return <p>Failed to load photos.</p>

  return (
    <div>
      {data.map((photo) => (
        <img
          key={photo.id}
          src={blobUrls[photo.id]?.image} // Cached copy or a freshly signed URL
          alt={photo.title}
        />
      ))}
    </div>
  )
}

Cache Management

// Check cache status
const stats = await sync.cache.stats()
// {
//   used: 245_000_000,      // 245 MB
//   available: 255_000_000, // 255 MB remaining
//   items: 1247,
//   pending: 3,             // Uploads queued
// }

// Manual cache operations
await sync.cache.prefetch(photoIds) // Pre-warm cache
await sync.cache.evict(photoIds) // Remove specific items
await sync.cache.clear() // Clear all cached blobs

// Set priority ('high' items are exempt from automatic eviction)
await sync.cache.setPriority(photoId, 'high')

Offline Queue

// Check pending operations
const queue = await sync.offline.pending()
// [
//   { type: 'upload', id: '...', size: 4_500_000, progress: 0.45 },
//   { type: 'insert', table: 'photos', data: {...} },
// ]

// Manual sync trigger
await sync.offline.flush()

// Listen to sync events
sync.on('online', () => console.log('Back online, syncing...'))
sync.on('uploadProgress', ({ id, progress }) => updateUI(id, progress))
sync.on('uploadComplete', ({ id, url }) => console.log('Uploaded:', url))
sync.on('syncError', ({ operation, error }) => handleError(error))
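
The listeners above imply a small set of events with distinct payloads. A typed emitter is one way to keep those payloads honest at compile time. This is a sketch only: SyncEvents and SyncEmitter are hypothetical names, and the payload shapes are inferred from the destructuring in the listeners rather than taken from a specification.

```typescript
// Hypothetical payload shapes, inferred from the listeners in this section.
type SyncEvents = {
  online: undefined
  uploadProgress: { id: string; progress: number }
  uploadComplete: { id: string; url: string }
  syncError: { operation: string; error: Error }
}

type AnyListener = (payload: unknown) => void

class SyncEmitter {
  private readonly listeners = new Map<string, Set<AnyListener>>()

  // Registers a listener and returns a function that removes it again.
  on<K extends keyof SyncEvents>(
    event: K,
    listener: (payload: SyncEvents[K]) => void,
  ): () => void {
    const set = this.listeners.get(event) ?? new Set<AnyListener>()
    set.add(listener as AnyListener)
    this.listeners.set(event, set)
    return () => {
      set.delete(listener as AnyListener)
    }
  }

  // Calls every listener registered for the event with the typed payload.
  emit<K extends keyof SyncEvents>(event: K, payload: SyncEvents[K]): void {
    const set = this.listeners.get(event)
    if (!set) return
    for (const listener of set) listener(payload)
  }
}
```

Returning an unsubscribe function from on is a design choice that fits React effects, where the cleanup callback can simply be the returned function.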

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Application                               │
├─────────────────────────────────────────────────────────────────┤
│                     Blob Sync Engine                             │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │   Schema    │  │  Mutation   │  │     Subscription        │  │
│  │  Registry   │  │   Queue     │  │       Manager           │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                    Cache Manager                             ││
│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌─────────────┐ ││
│  │  │   OPFS   │  │ IndexedDB│  │  Quota   │  │    LRU      │ ││
│  │  │ Adapter  │  │ Adapter  │  │ Monitor  │  │  Evictor    │ ││
│  │  └──────────┘  └──────────┘  └──────────┘  └─────────────┘ ││
│  └─────────────────────────────────────────────────────────────┘│
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                    Backend Adapters                          ││
│  │  ┌──────────────────────┐  ┌──────────────────────────────┐ ││
│  │  │    Data Providers    │  │      Blob Providers          │ ││
│  │  │  • Electric SQL      │  │  • Cloudflare R2             │ ││
│  │  │  • PowerSync         │  │  • AWS S3                    │ ││
│  │  │  • Replicache        │  │  • Supabase Storage          │ ││
│  │  │  • LiveStore         │  │  • Google Cloud Storage      │ ││
│  │  └──────────────────────┘  └──────────────────────────────┘ ││
│  └─────────────────────────────────────────────────────────────┘│
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                    Upload Manager                            ││
│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌─────────────┐ ││
│  │  │   tus    │  │ Chunked  │  │  Retry   │  │  Progress   │ ││
│  │  │ Protocol │  │ Uploads  │  │  Logic   │  │  Tracking   │ ││
│  │  └──────────┘  └──────────┘  └──────────┘  └─────────────┘ ││
│  └─────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘

Key Components

1. Cache Manager

Handles client-side blob storage with quota awareness:

  • OPFS Adapter: Primary storage using Origin Private File System
  • IndexedDB Adapter: Fallback for browsers without OPFS
  • Quota Monitor: Tracks usage via navigator.storage.estimate() and requests persistence via navigator.storage.persist()
  • LRU Evictor: Automatically removes least-recently-used items when near quota
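
To make the LRU Evictor concrete, here is a minimal sketch of size-aware LRU bookkeeping. It tracks only blob ids and sizes; the bytes themselves would live in OPFS or IndexedDB. LruBlobIndex is a hypothetical name, and relying on Map insertion order for recency is one simple approach among several.

```typescript
// Hypothetical class: tracks ids and sizes only, not the blob bytes.
class LruBlobIndex {
  // A Map iterates in insertion order, so the first key is least recently used.
  private readonly sizes = new Map<string, number>()
  private readonly maxBytes: number
  private usedBytes = 0

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes
  }

  get used(): number {
    return this.usedBytes
  }

  // Marks an item as recently used by moving it to the end of the Map.
  touch(id: string): boolean {
    const size = this.sizes.get(id)
    if (size === undefined) return false
    this.sizes.delete(id)
    this.sizes.set(id, size)
    return true
  }

  // Records a blob and returns the ids evicted to stay under maxBytes.
  put(id: string, size: number): string[] {
    const previous = this.sizes.get(id)
    if (previous !== undefined) {
      this.usedBytes -= previous
      this.sizes.delete(id)
    }
    this.sizes.set(id, size)
    this.usedBytes += size

    const evicted: string[] = []
    for (const [candidate, candidateSize] of this.sizes) {
      if (this.usedBytes <= this.maxBytes) break
      if (candidate === id) continue // never evict the item just written
      this.sizes.delete(candidate)
      this.usedBytes -= candidateSize
      evicted.push(candidate)
    }
    return evicted
  }
}
```

A caller would delete the returned ids from the storage adapter. Note that a single blob larger than maxBytes stays in the index in this sketch; a real implementation would probably reject it up front.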

2. Upload Manager

Handles reliable blob uploads:

  • tus Protocol: Resumable uploads that survive network interruptions
  • Chunked Uploads: Split large files for better reliability
  • Retry Logic: Exponential backoff with jitter
  • Progress Tracking: Real-time upload progress events
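
Two pieces of the Upload Manager lend themselves to a short sketch: planning chunk ranges that can resume from a known offset, and computing jittered retry delays. In tus, the client learns how many bytes the server already holds from the Upload-Offset header and continues from there; the sketch only models that offset as a number. The names planChunks, backoffDelay, and RetryStrategy, and the baseDelayMs/maxDelayMs fields, are hypothetical additions to the maxAttempts setting shown in the configuration.

```typescript
interface ChunkRange {
  start: number // inclusive byte offset
  end: number // exclusive byte offset
}

// Splits the remaining bytes into fixed-size ranges, starting at resumeOffset.
function planChunks(
  totalBytes: number,
  chunkBytes: number,
  resumeOffset = 0,
): ChunkRange[] {
  if (chunkBytes <= 0) throw new RangeError('chunkBytes must be positive')
  const ranges: ChunkRange[] = []
  for (let start = resumeOffset; start < totalBytes; start += chunkBytes) {
    ranges.push({ start, end: Math.min(start + chunkBytes, totalBytes) })
  }
  return ranges
}

// Hypothetical shape; only maxAttempts appears in the proposed config.
interface RetryStrategy {
  maxAttempts: number
  baseDelayMs: number
  maxDelayMs: number
}

// "Full jitter": a random delay between 0 and the capped exponential ceiling.
function backoffDelay(
  attempt: number, // 0 for the first retry
  strategy: RetryStrategy,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(
    strategy.maxDelayMs,
    strategy.baseDelayMs * 2 ** attempt,
  )
  return Math.floor(random() * ceiling)
}

// Delays between attempts: maxAttempts tries means maxAttempts - 1 waits.
function retrySchedule(
  strategy: RetryStrategy,
  random: () => number = Math.random,
): number[] {
  const delays: number[] = []
  for (let attempt = 0; attempt < strategy.maxAttempts - 1; attempt++) {
    delays.push(backoffDelay(attempt, strategy, random))
  }
  return delays
}
```

Passing the random source as a parameter keeps the delay calculation deterministic in tests, while production code can rely on the Math.random default.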

3. Mutation Queue

Persists pending operations for offline support:

  • IndexedDB Persistence: Survives page refresh/close
  • Operation Ordering: Maintains causality (create before update)
  • Conflict Detection: Identifies server/client divergence
  • Background Sync: Uses the Service Worker Background Sync API when available (currently Chromium-based browsers only), falling back to flushing when the app is next opened
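
One way to maintain causality in the queue is to coalesce operations per record as they are enqueued, so the server never receives an update for a record it has not seen. The sketch below is an assumption about how that could work, not a settled design: the Mutation type and enqueue function are hypothetical, and the queue is a plain array rather than an IndexedDB-backed store.

```typescript
// Hypothetical mutation shapes for the sketch.
type Mutation =
  | { kind: 'insert'; table: string; id: string; data: Record<string, unknown> }
  | { kind: 'update'; table: string; id: string; data: Record<string, unknown> }
  | { kind: 'delete'; table: string; id: string }

// Returns a new queue with `next` applied; the input array is not mutated.
function enqueue(queue: Mutation[], next: Mutation): Mutation[] {
  const sameRecord = (m: Mutation) => m.table === next.table && m.id === next.id
  const hasPendingInsert = queue.some(
    (m) => sameRecord(m) && m.kind === 'insert',
  )

  // An update to a record that has not been sent yet folds into its insert.
  if (next.kind === 'update' && hasPendingInsert) {
    const patch = next.data
    return queue.map((m) =>
      sameRecord(m) && m.kind === 'insert'
        ? { ...m, data: { ...m.data, ...patch } }
        : m,
    )
  }

  // A delete drops earlier operations on the record. If the insert never
  // left the device, nothing needs to reach the server at all.
  if (next.kind === 'delete') {
    const rest = queue.filter((m) => !sameRecord(m))
    return hasPendingInsert ? rest : [...rest, next]
  }

  // Everything else is appended, which preserves FIFO order.
  return [...queue, next]
}
```

Coalescing also shrinks the queue, which matters when maxQueueSize is finite and a user edits the same record repeatedly while offline.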

4. Subscription Manager

Reactive data + blob URL management:

  • Unified Subscriptions: Subscribe to data and blob URLs together
  • URL Refresh: Automatically re-sign presigned URLs before expiry
  • Optimistic Updates: Immediately reflect local changes
  • Prefetching: Pre-warm cache for visible/likely-needed items
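
URL Refresh comes down to a timing decision: re-sign each presigned URL some safety margin before it expires. A minimal sketch of that decision follows. SignedUrl, needsRefresh, and nextRefreshDelay are hypothetical names, and the 60-second default margin is an arbitrary assumption.

```typescript
// Hypothetical shape: a presigned URL plus its expiry as epoch milliseconds.
interface SignedUrl {
  url: string
  expiresAt: number
}

// True when the URL expires within the safety margin (or already has).
function needsRefresh(
  signed: SignedUrl,
  now: number,
  safetyMarginMs = 60_000,
): boolean {
  return signed.expiresAt - now <= safetyMarginMs
}

// Milliseconds until the earliest URL enters its safety margin, or null
// when there is nothing to refresh. Suitable as a timer delay.
function nextRefreshDelay(
  urls: SignedUrl[],
  now: number,
  safetyMarginMs = 60_000,
): number | null {
  if (urls.length === 0) return null
  let earliest = Infinity
  for (const signed of urls) {
    earliest = Math.min(earliest, signed.expiresAt)
  }
  return Math.max(0, earliest - safetyMarginMs - now)
}
```

Scheduling a single timer for the earliest expiry, then re-signing every URL for which needsRefresh is true, avoids keeping one timer per blob in a large subscription.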

Storage Quota Handling

Browser storage limits are a real constraint:

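| Browser | Default Quota (approximate, changes between releases) |
| --- | --- |
| Chrome | Up to ~60% of total disk size per origin |
| Firefox | Best-effort: the smaller of 10% of total disk size or 10 GiB per site; up to 50% of total disk size once persistence is granted |
| Safari | ~1 GB before prompting the user in older versions; roughly 60% of total disk size for browser apps since Safari 17 |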

The engine handles this via:

// Automatic quota management
cache: {
  maxSize: '500MB', // Stay under browser limits
  eviction: 'lru',
  warningThreshold: 0.8, // Emit warning at 80% full
  criticalThreshold: 0.95, // Aggressive eviction at 95%

  // Priority system
  priorities: {
    high: 'never-evict', // User-pinned items
    normal: 'lru', // Standard eviction
    low: 'eager-evict', // Thumbnails, previews
  },
}

// Events for app-level handling
sync.on('cacheWarning', ({ used, available }) => {
  showToast('Storage nearly full. Some items may be removed.')
})

sync.on('cacheEviction', ({ items }) => {
  console.log(`Evicted ${items.length} items to free space`)
})
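
The thresholds and priority tiers above can be read as two small decisions: which level the cache is at, and which items to evict first. The sketch below shows one interpretation. cacheLevel, pickEvictions, and CachedItem are hypothetical names; evicting 'low' before 'normal' and never touching 'high' follows the priorities block, while breaking ties by last access time is an assumption.

```typescript
type Priority = 'high' | 'normal' | 'low'
type CacheLevel = 'ok' | 'warning' | 'critical'

// Hypothetical bookkeeping entry for one cached blob.
interface CachedItem {
  id: string
  size: number
  priority: Priority
  lastAccessed: number // epoch ms
}

// Maps usage onto the warningThreshold / criticalThreshold settings.
function cacheLevel(
  used: number,
  maxSize: number,
  warningThreshold = 0.8,
  criticalThreshold = 0.95,
): CacheLevel {
  const ratio = used / maxSize
  if (ratio >= criticalThreshold) return 'critical'
  if (ratio >= warningThreshold) return 'warning'
  return 'ok'
}

// Chooses items to evict until at least bytesToFree would be reclaimed.
// 'high' items are skipped entirely; 'low' items go before 'normal' ones.
function pickEvictions(items: CachedItem[], bytesToFree: number): CachedItem[] {
  const rank: Record<Priority, number> = { low: 0, normal: 1, high: 2 }
  const candidates = items
    .filter((item) => item.priority !== 'high')
    .sort(
      (a, b) =>
        rank[a.priority] - rank[b.priority] || a.lastAccessed - b.lastAccessed,
    )

  const chosen: CachedItem[] = []
  let freed = 0
  for (const item of candidates) {
    if (freed >= bytesToFree) break
    chosen.push(item)
    freed += item.size
  }
  return chosen
}
```

If pinned items alone exceed the target, pickEvictions returns fewer bytes than requested, which is the point at which an app would surface the cacheWarning event to the user.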

Prior Art & Dependencies

Building on proven technologies:

| Component | Technology |
| --- | --- |
| Resumable uploads | tus protocol |
| Structured data sync | Electric SQL, PowerSync |
| OPFS access | Native navigator.storage.getDirectory() |
| IndexedDB wrapper | idb |
| Background sync | Workbox |
| Schema validation | Effect Schema, Zod |

Use Cases

  1. Family photo apps (like Ohana) - Photos, audio memories, metadata
  2. Note-taking apps - Text, embedded images, file attachments
  3. Design tools - Projects, assets, version history
  4. Music apps - Playlists, audio files, album art
  5. Document management - PDFs, Office files, annotations
  6. Social apps - Posts, images, videos, comments

Open Questions

  1. Conflict resolution for blobs - Last-write-wins? Version history?
  2. Compression - Compress blobs before caching/upload?
  3. Thumbnails - Generate and cache thumbnails separately?
  4. Encryption - Client-side encryption before upload?
  5. Sharing - How do blob references work across users/families?
  6. Server component - How much server-side logic is needed?

Next Steps

  1. Prototype - Build minimal viable version with OPFS + R2
  2. Benchmarks - Test with realistic media loads
  3. Provider adapters - Start with R2, add S3/Supabase
  4. React bindings - useSync hook with blob URL resolution
  5. Documentation - API reference, guides, examples

This is a proposal/RFC. Feedback welcome.

Inspired by building Ohana, a family photo archive app where this problem became very real.
