@chochinlu
Created December 21, 2025 12:30
Fix Web Speech API cold start issue - first utterance clipping

Web Speech API Cold Start Fix

Fix the issue where the beginning of speech gets clipped when using speechSynthesis

Problem

When using the Web Speech API (window.speechSynthesis), the first few syllables of the first utterance often get cut off. This is a common issue, especially in Chromium-based browsers.

Root Causes

  1. Async voice loading: Chrome loads the voice list asynchronously, so the first call to getVoices() can return an empty array.
  2. Lazy engine initialization: Browsers typically initialize the speech synthesis engine only when it is first used.
  3. User activation requirement: Since Chrome 71, speechSynthesis.speak() is blocked until the user has interacted with the page.
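
To see the first root cause in practice, a small check like the one below can be run at different points in the page lifecycle. It is only a sketch: getVoiceState is a hypothetical helper that is not part of the gist, and the synth parameter exists so the function can be exercised outside a browser. Note that an empty list can also mean that no voices are installed, so 'loading' is a best guess.

```javascript
// Hypothetical helper (not part of the gist): classify the voice list state.
// `synth` defaults to the browser's speechSynthesis object when one exists.
function getVoiceState(synth = globalThis.speechSynthesis) {
  if (!synth || typeof synth.getVoices !== 'function') return 'unsupported';
  // An empty list usually means the voices have not been loaded yet.
  return synth.getVoices().length > 0 ? 'ready' : 'loading';
}
```

Calling getVoiceState() right at script start and again inside a voiceschanged handler should show whether the browser fills the list late.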

Solution

Warm up the speech engine when the page loads by:

  1. Waiting for the voice list to fully load (using the voiceschanged event)
  2. Playing a silent utterance to trigger complete engine initialization

Code

speech.service.js

/**
 * Warmup speech engine - call once on page load
 * Fixes first utterance clipping issue
 *
 * Chrome loads voices asynchronously, so we need to:
 * 1. Wait for voiceschanged event to ensure voice list is ready
 * 2. Play a silent utterance to trigger full engine initialization
 */
export const warmupSpeechEngine = () => {
  if (!('speechSynthesis' in window)) return;

  // Wait for voice list to load
  const loadVoices = () => {
    return new Promise((resolve) => {
      let voices = speechSynthesis.getVoices();
      if (voices.length > 0) {
        resolve(voices);
        return;
      }

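      // Chrome fills the list later and fires voiceschanged when it is ready
      const handleVoicesChanged = () => {
        voices = speechSynthesis.getVoices();
        if (voices.length > 0) {
          clearTimeout(timeoutId);
          speechSynthesis.removeEventListener('voiceschanged', handleVoicesChanged);
          resolve(voices);
        }
      };

      // Keep listening until the list is non-empty (the event can fire while it is still empty)
      speechSynthesis.addEventListener('voiceschanged', handleVoicesChanged);

      // Give up after 2 seconds so the promise always settles
      const timeoutId = setTimeout(() => {
        speechSynthesis.removeEventListener('voiceschanged', handleVoicesChanged);
        resolve(speechSynthesis.getVoices());
      }, 2000);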
    });
  };

  // After voices loaded, play silent utterance to complete warmup
  loadVoices().then(() => {
    const warmup = new SpeechSynthesisUtterance('');
    warmup.volume = 0;
    warmup.rate = 10;
    speechSynthesis.speak(warmup);
  });
};

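/**
 * Play text pronunciation
 * Cancels any speech in progress, then speaks `text` in `lang`.
 * Resolves when playback ends; rejects if text is empty, the API is
 * unsupported, or the utterance reports an error.
 */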
export const playPronunciation = (text, lang = 'en-US') => {
  return new Promise((resolve, reject) => {
    if (!text) {
      reject(new Error('No text to play'));
      return;
    }

    if (!('speechSynthesis' in window)) {
      reject(new Error('Speech synthesis not supported'));
      return;
    }

    // Cancel any current playback
    window.speechSynthesis.cancel();

    const utterance = new SpeechSynthesisUtterance(text);
    utterance.lang = lang;
    utterance.rate = 0.9;
    utterance.pitch = 1;
    utterance.volume = 1;

    utterance.onend = () => resolve();
    utterance.onerror = (e) => reject(e);

    window.speechSynthesis.speak(utterance);
  });
};
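
Because of the user activation requirement listed under Root Causes, a warmup that runs at page load may be blocked in Chrome before the user has interacted with the page. One possible workaround is to also run the warmup on the first user gesture. The sketch below assumes that click and keydown count as gestures, which can vary by browser; warmupOnFirstGesture is a hypothetical helper, not part of the gist.

```javascript
// Hypothetical helper: run a warmup callback on the first user gesture.
// `target` is any EventTarget (for example `document`); `warmup` is the
// function to run, such as warmupSpeechEngine from speech.service.js.
// The default event list is an assumption about what counts as a gesture.
function warmupOnFirstGesture(target, warmup, events = ['click', 'keydown']) {
  let done = false;
  const removeAll = () =>
    events.forEach((type) => target.removeEventListener(type, handler));
  const handler = () => {
    if (done) return;
    done = true;
    removeAll(); // make sure the warmup runs at most once
    warmup();
  };
  events.forEach((type) => target.addEventListener(type, handler));
  // Return a cancel function for teardown (e.g. when a component unmounts).
  return removeAll;
}
```

In a page this could be wired up as warmupOnFirstGesture(document, warmupSpeechEngine).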

Usage with React

import { useEffect } from 'react';
import { warmupSpeechEngine } from './speech.service';

function App() {
  // Warmup speech engine on page load
  useEffect(() => {
    warmupSpeechEngine();
  }, []);

  return null; // Replace with your app content
}
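
In development, React's Strict Mode re-runs effects on mount, so an effect with an empty dependency array can call the warmup twice. If that matters, a small guard keeps the "call once" contract. This is a generic sketch; once is a hypothetical helper, not part of the gist or of React.

```javascript
// Hypothetical helper: wrap a function so that only the first call runs it.
// Later calls return the result of the first call without running fn again.
function once(fn) {
  let called = false;
  let result;
  return (...args) => {
    if (!called) {
      called = true;
      result = fn(...args);
    }
    return result;
  };
}
```

For example, define const warmupOnce = once(warmupSpeechEngine) at module level and call warmupOnce() inside the effect.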

Usage with Vanilla JavaScript

// Call on page load
document.addEventListener('DOMContentLoaded', () => {
  warmupSpeechEngine();
});

// Or, instead of the listener above, call immediately if the script is at the end of body
warmupSpeechEngine();
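
The two options above are alternatives. If the script's position in the page is not known in advance, both cases can be covered by checking document.readyState. A minimal sketch, where runWhenDomReady is a hypothetical helper and the doc parameter is passed in only so the function can be tested without a browser:

```javascript
// Hypothetical helper: run `callback` once the DOM is ready.
// `doc` is expected to behave like `document` (readyState + addEventListener).
function runWhenDomReady(doc, callback) {
  if (doc.readyState === 'loading') {
    // Still parsing: wait for DOMContentLoaded, and only once.
    doc.addEventListener('DOMContentLoaded', callback, { once: true });
  } else {
    // 'interactive' or 'complete': the event has already fired.
    callback();
  }
}
```

In a page this would be called as runWhenDomReady(document, warmupSpeechEngine).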

How It Works

  1. loadVoices(): Returns a Promise that resolves when voices are available

    • First tries getVoices() directly (enough when the list is already populated, as is often the case in Firefox/Safari)
    • If empty, listens for voiceschanged event (needed for Chrome)
    • Has a 2-second timeout as fallback
  2. Silent utterance: After voices load, plays an empty string with volume = 0

    • User hears nothing
    • But this triggers full engine initialization
    • Subsequent speak() calls work without clipping
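
The warmup in the gist is fire-and-forget, so the caller cannot tell whether the silent utterance actually ran. The sketch below is a hypothetical variant of that step that reports the outcome. The synth and createUtterance parameters and the timeout are assumptions added for illustration, and whether an empty utterance fires an end event may differ between browsers, which is why the timeout exists.

```javascript
// Hypothetical variant of the silent-utterance step that reports its outcome.
// `synth` is a speechSynthesis-like object; `createUtterance` builds the
// utterance (in a browser: (text) => new SpeechSynthesisUtterance(text)).
function speakSilentUtterance(synth, createUtterance, timeoutMs = 1000) {
  return new Promise((resolve) => {
    const warmup = createUtterance('');
    warmup.volume = 0; // inaudible
    warmup.rate = 10; // highest rate in the allowed range, to finish quickly

    // Fallback in case neither end nor error is ever reported.
    const timer = setTimeout(
      () => resolve({ ok: false, reason: 'timeout' }),
      timeoutMs
    );
    const settle = (result) => {
      clearTimeout(timer);
      resolve(result);
    };

    warmup.onend = () => settle({ ok: true });
    // e.error is a string such as 'not-allowed' when speak() is blocked.
    warmup.onerror = (e) => settle({ ok: false, reason: e.error });
    synth.speak(warmup);
  });
}
```

A caller could retry the warmup from a user gesture when the result is not ok.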

Key Points

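  • Call warmupSpeechEngine() as early as possible (on page load)
  • The warmup is silent (empty text, volume 0)
  • After a successful warmup, later speak() calls should start without clipping
  • In Chrome, a warmup that runs before any user interaction may be blocked (see Root Causes); in that case, run it again from the first click or key press
  • The 2-second timeout handles edge cases where voiceschanged never fires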

Browser Compatibility

  • ✅ Chrome / Edge (Chromium)
  • ✅ Firefox
  • ✅ Safari
  • ⚠️ Mobile browsers may have additional restrictions

License

MIT - Feel free to use in your projects!
