@dzianisv
Created February 6, 2026 22:15
Medium blog post about AI podcast automation
<!DOCTYPE html>
<html>
<head>
<title>How I Built a Fully Automated AI Podcast from Gmail Newsletters</title>
</head>
<body>
<article>
<h1>How I Built a Fully Automated AI Podcast from Gmail Newsletters</h1>
<p><em>From email to Spotify in minutes — no manual intervention required</em></p>
<p>Last week, I published the first episode of "The Inference Times" — a podcast covering housing markets, stock markets, and tech trends. The twist? I never wrote a script, never recorded my voice, and never touched a single button on Spotify's interface.</p>
<p>Everything was automated.</p>
<p>In this post, I'll walk you through how I built a system that:</p>
<ol>
<li>Extracts newsletter content from Gmail</li>
<li>Generates natural-sounding audio using AI text-to-speech</li>
<li>Creates custom cover art using Gemini</li>
<li>Publishes directly to Spotify — all orchestrated by an AI coding agent</li>
</ol>
<h2>The Problem: Newsletter Overload</h2>
<p>Like many of you, I subscribe to several high-quality newsletters. CalculatedRisk for housing market analysis. Matt Levine's Money Stuff for finance. Benedict Evans for tech trends.</p>
<p>But here's the thing: <strong>I rarely read them.</strong></p>
<p>They pile up in my inbox, guilt-inducing reminders of content I'll "get to later." What I <em>do</em> have time for is listening during my commute.</p>
<p>So I asked myself: What if my newsletters could come to me as a podcast?</p>
<h2>The Stack</h2>
<p>Here's what powers "The Inference Times":</p>
<ul>
<li><strong>OpenCode</strong> — AI coding agent that orchestrates the entire workflow</li>
<li><strong>OpenCode Skills</strong> — Custom skill definitions for repeatable tasks</li>
<li><strong>Gmail API</strong> — Extract newsletter content</li>
<li><strong>Coqui TTS</strong> — Generate natural-sounding speech (fast, local)</li>
<li><strong>Bark</strong> — Alternative TTS for expressive speech</li>
<li><strong>Gemini</strong> — Create episode cover art</li>
<li><strong>Chrome DevTools MCP</strong> — Automate Spotify publishing</li>
</ul>
<h2>Step 1: Extracting Content from Gmail</h2>
<p>The first challenge was getting newsletter content out of Gmail in a clean, usable format. I wrote a Python script that connects to the Gmail API, fetches emails matching specific criteria, and converts their HTML to clean, speakable text.</p>
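<p>As a rough sketch of the fetch step: the Gmail API returns each message as a nested payload whose body data is base64url-encoded, so the HTML part has to be located and decoded before it can be cleaned. The helper names and the sender address below are hypothetical, and the sketch assumes the messages were already retrieved with the official client (for example via users().messages().list with a q search string, then users().messages().get with format="full").</p>

```python
import base64
from typing import Optional


def build_query(sender: str, days: int = 7) -> str:
    """Build a Gmail search string, e.g. for the `q` parameter of messages.list.

    The sender address passed in is a placeholder chosen by the caller.
    """
    return f"from:{sender} newer_than:{days}d"


def extract_html(payload: dict) -> Optional[str]:
    """Depth-first search of a Gmail message payload for the first text/html part."""
    if payload.get("mimeType") == "text/html":
        data = payload.get("body", {}).get("data")
        if data:
            # Gmail uses URL-safe base64 and may omit padding, so restore it.
            padded = data + "=" * (-len(data) % 4)
            return base64.urlsafe_b64decode(padded).decode("utf-8", errors="replace")
    # Multipart messages nest their alternatives under "parts".
    for part in payload.get("parts", []):
        html = extract_html(part)
        if html is not None:
            return html
    return None
```

<p>With a message resource in hand, extract_html(message["payload"]) yields the raw newsletter HTML, or None for plain-text-only emails.</p>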
<p>The core of the script is the html_to_podcast_script() function. Raw email HTML is full of navigation, footers, unsubscribe links, and formatting cruft. I use BeautifulSoup to strip that noise while preserving paragraph structure, so the speech engine pauses naturally between paragraphs.</p>
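<p>Here is a minimal sketch of what such a cleaning function might look like. It is not the original implementation: it swaps BeautifulSoup for the standard library's html.parser so it runs without extra packages, and the tag sets and noise phrases are assumptions that would need tuning per newsletter.</p>

```python
from html.parser import HTMLParser

# Assumption: everything inside these tags is noise for a listener.
SKIP_TAGS = {"script", "style", "nav", "footer", "head"}
# Tags that end a paragraph, which becomes a pause in the spoken output.
BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "tr"}
# Heuristic: paragraphs containing these phrases are dropped entirely.
NOISE_PHRASES = ("unsubscribe", "view in browser", "manage preferences")


class _SpeakableText(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0
        self.paragraphs = []
        self.current = []

    def flush(self):
        # Collapse whitespace and keep the paragraph unless it looks like boilerplate.
        text = " ".join("".join(self.current).split())
        self.current = []
        if text and not any(p in text.lower() for p in NOISE_PHRASES):
            self.paragraphs.append(text)

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.flush()
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.current.append(data)


def html_to_podcast_script(html: str) -> str:
    """Return speakable text with one blank line between paragraphs."""
    parser = _SpeakableText()
    parser.feed(html)
    parser.close()
    parser.flush()
    return "\n\n".join(parser.paragraphs)
```

<p>The blank lines between paragraphs are the part that matters downstream: they give the speech step an obvious place to split the script.</p>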
<h2>Step 2: Generating Audio with AI Text-to-Speech</h2>
<p>For text-to-speech, I evaluated several options:</p>
<ul>
<li><strong>ElevenLabs</strong> — Excellent quality, fast, but expensive</li>
<li><strong>OpenAI TTS</strong> — Great quality, API-based</li>
<li><strong>Bark</strong> — Excellent quality, slow, free</li>
<li><strong>Coqui TTS</strong> — Good quality, fast, free, local</li>
</ul>
<p>I went with Coqui TTS using the VCTK VITS model for most content. It runs entirely on my MacBook, costs nothing, and generates 3 minutes of audio in about 45 seconds.</p>
<p>The p226 voice from the VCTK dataset has a pleasant, professional British tone — perfect for financial news.</p>
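<p>A sketch of the synthesis step, assuming the Coqui TTS Python package is installed and that tts_models/en/vctk/vits is the model name for the VCTK VITS model mentioned above. Splitting the script into chunks and the 500-character limit are assumptions for illustration, not something the original pipeline is known to do. The TTS import is deferred so the splitting helper works even without the package.</p>

```python
from pathlib import Path
from typing import List

MODEL_NAME = "tts_models/en/vctk/vits"  # assumed identifier for the VCTK VITS model
SPEAKER = "p226"                        # the voice named in the post


def split_into_chunks(script: str, max_chars: int = 500) -> List[str]:
    """Group paragraphs into chunks, starting a new one when max_chars would be exceeded."""
    chunks: List[str] = []
    current = ""
    for para in (p.strip() for p in script.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def synthesize(script: str, out_dir: str = "audio") -> List[Path]:
    """Write one WAV file per chunk and return the paths in order."""
    from TTS.api import TTS  # deferred: needs the Coqui TTS package and a model download

    tts = TTS(MODEL_NAME)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, chunk in enumerate(split_into_chunks(script)):
        path = out / f"part_{i:03d}.wav"
        tts.tts_to_file(text=chunk, speaker=SPEAKER, file_path=str(path))
        paths.append(path)
    return paths
```

<p>Calling synthesize(script) on the cleaned newsletter text would leave numbered WAV files in the audio directory, ready to be concatenated into one episode.</p>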
<h2>Step 3: Cover Art Generation with Gemini</h2>
<p>Every episode needs cover art. Rather than use a static image, I wanted dynamic art that reflects the episode topic.</p>
<p>I use Google's Gemini model through its web interface. My OpenCode agent navigates to gemini.google.com and generates the art from a prompt that describes the desired style and the episode topic.</p>
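<p>The prompt itself can be as simple as a template with slots for topic and style. The wording below is a hypothetical example, not the prompt used for the show.</p>

```python
def build_cover_prompt(topic: str, style: str = "flat vector illustration, muted colors") -> str:
    """Assemble an image prompt from an episode topic and a visual style (both illustrative)."""
    return (
        f"Square podcast cover art for an episode about {topic}. "
        f"Style: {style}. No text, no logos."
    )
```

<p>Keeping the style fixed and varying only the topic is one way to make covers look like they belong to the same series.</p>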
<h2>Step 4: Publishing to Spotify with Chrome DevTools</h2>
<p>This is where the magic happens.</p>
<p>Spotify doesn't have a public API for podcast publishing. You have to use their web interface at creators.spotify.com. Most people would say "automation stops here."</p>
<p>Not with OpenCode.</p>
<p>Using the Chrome DevTools MCP (Model Context Protocol) server, my AI agent can navigate web pages, fill out forms, click buttons, handle authentication, and wait for page loads.</p>
<p>The entire flow runs autonomously. The agent handles edge cases like rich text editor bugs and loading states.</p>
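<p>The agent decides for itself when a page is ready. If you scripted the same flow by hand, handling loading states would usually come down to polling a condition with a timeout. The helper below is a generic sketch of that pattern, not part of OpenCode or the Chrome DevTools MCP server.</p>

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll predicate until it returns True or timeout seconds pass; report which happened."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

<p>A hand-written script needs one of these around every upload and every page transition, which is a big part of why it breaks when the interface changes.</p>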
<h2>Results</h2>
<p>Here's what "The Inference Times" Episode 1 looks like:</p>
<ul>
<li>Source: 5 CalculatedRisk emails about housing markets</li>
<li>Audio length: 3 minutes 10 seconds</li>
<li>Generation time: ~2 minutes total</li>
<li>Manual effort: Zero</li>
</ul>
<h2>Lessons Learned</h2>
<p><strong>1. Web automation is fragile but powerful</strong> — AI agents can adapt when traditional automation scripts would fail.</p>
<p><strong>2. Local TTS is good enough</strong> — Coqui's VCTK model produces perfectly listenable audio for informational content.</p>
<p><strong>3. Skills > Scripts</strong> — Packaging workflows as OpenCode Skills means the AI agent can adapt and improve the process.</p>
<h2>What's Next</h2>
<p>I'm planning to:</p>
<ul>
<li>Add more newsletter sources — Matt Levine, Benedict Evans, Stratechery</li>
<li>Implement scheduling — Auto-publish every Monday morning</li>
<li>Add intro/outro music — Using AI-generated audio beds</li>
</ul>
<p>The dream is a fully autonomous media company that transforms written content into audio content at scale.</p>
<h2>Try It Yourself</h2>
<p>If you want to build something similar:</p>
<ol>
<li>Install OpenCode: https://opencode.ai</li>
<li>Set up Coqui TTS with Python</li>
<li>Start Chrome with remote debugging enabled: --remote-debugging-port=9222</li>
<li>Create your skill in ~/.config/opencode/skills/</li>
</ol>
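<p>Before wiring up the agent, it helps to confirm that Chrome is actually listening on the debugging port. Chrome serves a small JSON document at /json/version when remote debugging is on. The function names below are hypothetical, and the sketch assumes Chrome is running on the same machine.</p>

```python
import json
from urllib.request import urlopen


def version_url(host: str = "127.0.0.1", port: int = 9222) -> str:
    """URL of Chrome's DevTools version endpoint."""
    return f"http://{host}:{port}/json/version"


def summarize(payload: bytes) -> dict:
    """Pick out the browser name and WebSocket debugger URL from the response body."""
    info = json.loads(payload)
    return {
        "browser": info.get("Browser"),
        "websocket": info.get("webSocketDebuggerUrl"),
    }


def check_devtools(host: str = "127.0.0.1", port: int = 9222, timeout: float = 2.0) -> dict:
    """Raise an OSError subclass if nothing is listening; otherwise return a short summary."""
    with urlopen(version_url(host, port), timeout=timeout) as resp:
        return summarize(resp.read())
```

<p>If check_devtools() raises a connection error, Chrome was most likely started without the flag or on a different port.</p>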
<p>The future of content isn't creation — it's transformation. We're swimming in high-quality written content. The bottleneck is format conversion.</p>
<p>AI agents like OpenCode make that conversion automatic.</p>
<hr>
<p><em>Den is building AI-powered tools for content transformation. Follow for more posts on automation, AI agents, and the future of media.</em></p>
</article>
</body>
</html>