@BirkhoffLee
Last active December 22, 2025 22:11

One-liner Image to Markdown in Terminal with Vision Language Model (VLM)

Read the image from clipboard, convert it to Markdown using a system prompt template, and copy the Markdown result while rendering it to the terminal with Glow.

Usage

Once everything under Setup is in place, the whole conversion is a one-liner:

$ impaste | OPENROUTER_KEY=xxx llm --template md -a - | tee >(pbcopy) >(glow -) > /dev/null
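
The one-liner copies the result with pbcopy, which only exists on macOS, while impaste also covers Linux. A minimal sketch of a matching copy helper is below. The name clipcopy is hypothetical and not part of this gist; it assumes xclip or wl-copy (from wl-clipboard) is installed on Linux.

```shell
# Hypothetical helper (not part of the gist): copy stdin to the system clipboard.
clipcopy() {
  if command -v pbcopy > /dev/null 2>&1; then
    # macOS
    pbcopy
  elif command -v xclip > /dev/null 2>&1; then
    # Linux with X11
    xclip -selection clipboard
  elif command -v wl-copy > /dev/null 2>&1; then
    # Linux with Wayland
    wl-copy
  else
    echo "Error: clipcopy requires pbcopy (macOS), xclip (X11), or wl-copy (Wayland)" >&2
    return 1
  fi
}
```

With that defined, the pipeline would read:

$ impaste | OPENROUTER_KEY=xxx llm --template md -a - | tee >(clipcopy) >(glow -) > /dev/null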

Setup

First, install the simonw/llm utility with uv:

$ uv tool install --with llm-openrouter llm

Then define this shell function (bash or zsh, for example in your shell's rc file) to read the image from the system clipboard:

# Output the image data in clipboard to stdout.
# @example impaste > /tmp/image.png
# @see https://til.simonwillison.net/macos/impaste
function impaste {
  if [[ "$OSTYPE" == darwin* ]]; then
    # macOS: use osascript
    # Keep the variable local so it does not leak into the calling shell.
    local tempfile
    tempfile=$(mktemp -t clipboard.XXXXXXXXXX.png) || return 1
    osascript -e 'set theImage to the clipboard as «class PNGf»' \
      -e "set theFile to open for access POSIX file \"$tempfile\" with write permission" \
      -e 'write theImage to theFile' \
      -e 'close access theFile'
    cat "$tempfile"
    rm "$tempfile"
  elif command -v xclip &> /dev/null; then
    # Linux with X11: use xclip
    xclip -selection clipboard -t image/png -o
  elif command -v wl-paste &> /dev/null; then
    # Linux with Wayland: use wl-paste
    wl-paste --type image/png
  else
    echo "Error: impaste requires osascript (macOS), xclip (X11), or wl-clipboard (Wayland)" >&2
    return 1
  fi
}
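
If the clipboard holds text rather than an image, impaste may produce empty or non-PNG output, and the model call would then be wasted. As a rough sketch, the check below compares the first 8 bytes of a file against the PNG signature. The name is_png is hypothetical and not part of this gist.

```shell
# Hypothetical helper: succeed only if the file starts with the 8-byte PNG signature
# (89 50 4E 47 0D 0A 1A 0A).
is_png() {
  [ "$(head -c 8 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')" = "89504e470d0a1a0a" ]
}

# Possible use, assuming impaste and the md template are set up:
#   impaste > /tmp/image.png && is_png /tmp/image.png \
#     && OPENROUTER_KEY=xxx llm --template md -a /tmp/image.png
```

This variant takes a file path instead of reading stdin, so the image has to be saved first. That keeps the bytes available for llm after the check.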

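Then save md.yaml (its contents are listed at the end of this page) as ~/.config/llm/templates/md.yaml. If llm does not find the template there, `llm templates path` prints the directory it actually reads. Depending on your API provider, an API key will be required. I use OpenRouter, which is why the one-liner sets OPENROUTER_KEY.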
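
As a small sketch of that step, the function below creates the templates directory if needed and copies the file into it. The name install_template and its two arguments are hypothetical; the default directory is the one mentioned above.

```shell
# Hypothetical helper: copy a template file ($1) into a templates directory ($2),
# creating the directory first. $2 defaults to the path used in this gist.
install_template() {
  mkdir -p "${2:-$HOME/.config/llm/templates}" &&
    cp "$1" "${2:-$HOME/.config/llm/templates}/"
}

# Possible use, letting llm report its own templates directory:
#   install_template md.yaml "$(llm templates path)"
```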

After that, install glow on your system to render Markdown in the terminal.

md.yaml

model: openrouter/google/gemini-3-flash-preview
options:
  temperature: 0.0
system: |
  Convert all texts in the attached images to raw Markdown code (for the math expressions within, use inline latex).
  ## Math Expressions
  Any math equations or representations found in the images should be directly written as inline LaTeX within the Markdown code.
  Instead of using brackets or parentheses, wrap the LaTeX expressions with a single dollar sign $$ for inline LaTeX, and double dollar sign $$$$ for a complete latex block. Do not add whitespaces between dollar signs and the actual latex expressions.
  Anything else should be written in Markdown format instead of LaTeX code.
  ## Strictly Prohibited
  Do not output in a code block.
  Do not respond anything else other than the converted texts.
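
The template asks the model not to answer in a code block. In case a response still comes back wrapped in a fence, a filter like the sketch below could sit between llm and tee. The name strip_fence is hypothetical and not part of this gist, and it only handles a single fence that spans the entire output.

```shell
# Hypothetical filter: if the first line starts with three backticks and the last
# line is exactly three backticks, drop both; otherwise pass the input through unchanged.
strip_fence() {
  awk '
    BEGIN { fence = sprintf("%c%c%c", 96, 96, 96) }  # 96 is the backtick character
    { lines[NR] = $0 }
    END {
      first = 1; last = NR
      if (NR >= 2 && index(lines[1], fence) == 1 && lines[NR] == fence) {
        first = 2; last = NR - 1
      }
      for (i = first; i <= last; i++) print lines[i]
    }
  '
}
```

It would be used as:

$ impaste | OPENROUTER_KEY=xxx llm --template md -a - | strip_fence | tee >(pbcopy) >(glow -) > /dev/null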