Local audio transcription with faster-whisper, which its maintainers report as up to 4x faster than OpenAI's whisper at the same accuracy. Converts M4A, WAV, and MP3 files to text transcripts on CPU.
Requires Python 3.11+ and ffmpeg.
# macOS prerequisites
brew install python@3.11 ffmpeg
# Clone this gist, then:
./setup.sh
# If your default python3 is too new (3.14+), point to 3.11:
PYTHON=python3.11 ./setup.sh
source venv/bin/activate
# Simple (base model, English) — produces recording.txt
python transcribe_faster.py recording.m4a
# With flags: writes notes.txt here because --output is given (recording.txt otherwise)
python transcribe.py recording.m4a --model medium --language en --output notes.txt
deactivate
By default, output is written to <filename>.txt next to the input file. Use --output in transcribe.py to override.
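The default naming rule can be sketched in a few lines of Python. This is an illustration only, not the code in transcribe.py; `default_output_path` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Optional


def default_output_path(audio_path: str, override: Optional[str] = None) -> Path:
    """Return the transcript path: the override if given, else <filename>.txt beside the input."""
    if override:
        return Path(override)
    # with_suffix swaps only the final extension: recording.m4a -> recording.txt
    return Path(audio_path).with_suffix(".txt")


print(default_output_path("recordings/meeting.m4a"))
print(default_output_path("recordings/meeting.m4a", "notes.txt"))
```

Because only the last suffix is replaced, the transcript stays in the same directory as the recording unless an override path is supplied.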
| Model | Speed | Quality | Use case |
|---|---|---|---|
| tiny | Fastest | Lower | Quick preview |
| base | Fast | Good | Default, recommended |
| small | Medium | Better | Important recordings |
| medium | Slow | High | When accuracy matters |
| large-v3 | Slowest | Best | Critical transcriptions |
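For readers curious how a model name from the table reaches faster-whisper, here is a minimal sketch. It is not the gist's actual script: `check_model_size` and `transcribe` are hypothetical names, the list of sizes is limited to the table above, and `compute_type="int8"` is an assumed setting for CPU inference.

```python
MODEL_SIZES = ("tiny", "base", "small", "medium", "large-v3")


def check_model_size(name: str) -> str:
    """Validate a model name against the sizes listed in the table."""
    if name not in MODEL_SIZES:
        raise ValueError(f"unknown model {name!r}; choose one of {', '.join(MODEL_SIZES)}")
    return name


def transcribe(audio_path: str, model_size: str = "base", language: str = "en") -> str:
    """Transcribe one file on CPU and return the joined text."""
    # Imported lazily so check_model_size works without the package installed.
    from faster_whisper import WhisperModel

    # int8 is an assumption here, chosen as a common CPU-friendly compute type.
    model = WhisperModel(check_model_size(model_size), device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio_path, language=language)
    # segments is a lazy generator; iterating it performs the decoding.
    return "\n".join(segment.text.strip() for segment in segments)
```

Calling `transcribe("recording.m4a", "medium")` downloads the model on first use, so the first run of a larger model takes noticeably longer than later runs.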
| File | Purpose |
|---|---|
| setup.sh | One-command venv creation and install |
| requirements.txt | Pinned dependencies |
| transcribe.py | Full-featured script with --model, --language, --output flags |
| transcribe_faster.py | Quick script: python transcribe_faster.py file.m4a [model] [lang] |
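The flag interface listed for transcribe.py could be declared with argparse roughly as follows. This is a sketch under assumptions, not the script's real source: the defaults (`base`, `en`) are borrowed from the "simple" usage described earlier, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one positional input and the three documented flags."""
    parser = argparse.ArgumentParser(description="Transcribe an audio file to text.")
    parser.add_argument("audio", help="input file, e.g. recording.m4a")
    # Defaults below are assumptions for illustration.
    parser.add_argument("--model", default="base", help="model size, e.g. tiny, base, medium")
    parser.add_argument("--language", default="en", help="language code of the recording")
    parser.add_argument("--output", default=None, help="transcript path; defaults to <filename>.txt")
    return parser


args = build_parser().parse_args(["recording.m4a", "--model", "medium", "--output", "notes.txt"])
print(args.audio, args.model, args.language, args.output)
```

Leaving `--output` as `None` lets the script distinguish "not given" from an explicit path and fall back to the default name.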
"ffmpeg not found" — brew install ffmpeg (macOS) or sudo apt-get install ffmpeg (Linux)
"No module named 'faster_whisper'" — Activate the venv first: source venv/bin/activate
Poor quality — Use a larger model: --model medium or --model large-v3
Python too new — faster-whisper needs 3.11 or 3.12. Use PYTHON=python3.11 ./setup.sh
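The checks above can be automated before a long transcription run. The following is a rough sketch, not part of the gist: `python_supported` and `preflight_problems` are hypothetical names, and the supported versions are taken from the note above (3.11 or 3.12).

```python
import shutil
import sys
from typing import List, Tuple

# Versions taken from the troubleshooting note above.
SUPPORTED_VERSIONS = {(3, 11), (3, 12)}


def python_supported(version: Tuple[int, int]) -> bool:
    """Return True if the (major, minor) pair is one the note lists as supported."""
    return version in SUPPORTED_VERSIONS


def preflight_problems() -> List[str]:
    """Collect human-readable problems; an empty list means the environment looks ready."""
    problems = []
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    if not python_supported(sys.version_info[:2]):
        problems.append("unsupported Python version; try PYTHON=python3.11 ./setup.sh")
    try:
        import faster_whisper  # noqa: F401
    except ImportError:
        problems.append("faster_whisper not importable; run: source venv/bin/activate")
    return problems


if __name__ == "__main__":
    for problem in preflight_problems():
        print(problem)
```

Returning a list rather than exiting on the first failure reports every missing prerequisite in one pass.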