This is a shell script that uses ffmpeg to extract WAV audio from a video, transcribes it locally with Parakeet on macOS, and post-processes the subtitles to clean them up.
In total, it took about 4 minutes for a ~2-hour video (~300 MB) on an M1 Pro.
This guide is for power users: it requires using the terminal.
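
To give a feel for the first step, here is a minimal sketch of the audio extraction, not the script itself. The helper names `wav_path_for` and `extract_wav` are hypothetical, and the mono 16 kHz 16-bit PCM settings are an assumption (a common input format for speech models), not something the script is confirmed to use.

```shell
#!/bin/sh
# Hypothetical helper: derive the WAV path from the video path
# by replacing the last extension with .wav.
wav_path_for() {
    printf '%s.wav\n' "${1%.*}"
}

# Hypothetical helper: extract the audio track as mono 16 kHz 16-bit PCM WAV.
# -vn drops the video stream, -ac 1 downmixes to mono, -ar 16000 resamples,
# -y overwrites an existing output file, -nostdin keeps ffmpeg from reading stdin.
# The sample rate and channel count are assumptions, not taken from the script.
extract_wav() {
    video=$1
    wav=$(wav_path_for "$video")
    ffmpeg -nostdin -y -i "$video" -vn -ac 1 -ar 16000 -c:a pcm_s16le "$wav"
}
```

With these definitions loaded, `extract_wav talk.mp4` would write `talk.wav` next to the video. Keeping the path logic in its own function makes it easy to reuse the same base name for the subtitle file later.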