@kristopolous
Last active February 9, 2026 23:55
Grabs a list of port scanned open ollama servers from the IPv4 space, filters it by model, sorts by TPS and emits an IP based on an index.
#!/bin/bash
# Example usage:
# free-ollama gemma3:latest {0..10}
#
# It's a stack based system with the syntax
# [model-name [index ... ]]*
# You can omit the index to see all of them.
#
# The space-delimited output columns are
#
# ip model tps
#
# So you can do
#
# $ free-ollama model1 model2 | sort -k3,3nr | cut -d ' ' -f 1-2 | xargs -d '\n' -I {} redis-cli rpush server-pool "{}"
#
# And now you have a server pool
#
# Try instrumenting it with llcat. pipx install llcat
# And then maybe xpanes
#
# Don't know what to choose?
# Run it without arguments, you'll get a frequency distribution.
#
db="$HOME/.cache/free-ollama.json"
# Refresh the cache when it is missing or older than 24 hours (1440 minutes).
if [[ ! -f "$db" ]] || [[ -n "$(find "$db" -mmin +1440)" ]]; then
  echo "Updating cache..." >&2
  mkdir -p "$(dirname "$db")"
  curl -Ls 'https://awesome-ollama-server.vercel.app/data.json' > "$db"
fi
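# Print the servers offering $model, sorted by tps, optionally picking the entries in ${index[@]}.
process_stack() {
  [[ -z "$model" ]] && return
  # The model is passed via --arg so it is not spliced into the jq program text.
  # The index list expands to e.g. .[2,3]; an empty list gives .[] (every entry).
  jq -r --arg model "$model" 'map(select(.models and (.models | contains([$model]))))
    | sort_by(.tps)
    | .['"$(IFS=,; echo "${index[*]}")"']
    | "\(.tps | floor)\t\(.server) \(.models
        | map(select(contains($model)))
        | join(" ")) "' "$db"
  index=()
}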
# With "test" on the command line, probe each server with llcat and print
# ok (or a blank), the elapsed time, the host and the model.
tester() {
  if [[ -n "$dotest" ]]; then
    tac | while read -r tps host model; do
      # The < /dev/null keeps llcat from slurping up the loop's stdin
      timing=$( {
        time timeout 5s llcat -u "$host" -m "$model" "respond with test" </dev/null >~/.cache/free-ollama.test 2>&1;
      } 2>&1 | grep real | sed 's/[a-z[:space:]]//g' )
      [[ -s ~/.cache/free-ollama.test ]] && echo -n "ok" || echo -n " "
      printf " %s %-30s %s\n" "$timing" "$host" "$model"
    done
  else
    cat
  fi
}
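if [[ -z "$1" ]]; then
  # No arguments: print how many servers offer each model, least common first.
  jq -r 'map(.models[]?) | group_by(.) | map("\(length) \(.[0])") | .[]' "$db" | sort -n
else
  declare -a index
  model=
  while [ $# -gt 0 ]; do
    case $1 in
      test) dotest=1 ;;
      [0-9]*) index+=( "$1" ) ;;
      # process_stack runs in a pipeline subshell, so its index=() never
      # reaches this shell; reset the index list here as well.
      *) process_stack | tester; index=(); model=$1 ;;
    esac
    shift
  done
  process_stack | tester
fi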
@kristopolous

This is a copy paste from forrany/Awesome-Ollama-Server#10

Thanks for your amazing work, and sorry I can't write Chinese.

https://gist.github.com/kristopolous/df4eef76472f5ab4c739c72130962268 is something I call "free-ollama"

Here are some examples:

First, I want to find out how many servers there are for each model, so I run it without arguments:

$ free-ollama
...
160 mario:latest
168 gemma3:latest
185 llama3.2:3b-instruct-q5_K_M
204 lukashabtoch/plutotext-r3-emotional:latest
211 nomic-embed-text:latest
233 gemma3:270m
245 deepseek-r1:1.5b
255 llama3.1:8b
327 llama3.2:latest
1174 smollm2:135m
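
The listing above is a count of how many servers offer each model, least common first. Here is a minimal sketch of the same counting idea, using `sort | uniq -c` on a flat list rather than the jq `group_by` the script uses. The `count_models` name and the sample input are made up for illustration.

```bash
#!/bin/bash
# Hypothetical helper: reads one model name per line on stdin and prints
# "count model" lines, least common first.
count_models() {
  # uniq -c pads its counts with spaces, so awk normalizes the columns.
  sort | uniq -c | sort -n | awk '{print $1, $2}'
}

# Made-up sample: the model lists of a few servers, flattened to one name per line.
printf '%s\n' gemma3:latest smollm2:135m smollm2:135m \
  llama3.2:latest smollm2:135m gemma3:latest | count_models
```

The most widely hosted model ends up on the last line, which is why the real output is easiest to read from the bottom up.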

Now I can pick a few models. I get the IP of the server, the model name, and the tps from your survey, so I know approximately what to expect:

$ free-ollama qwen3:32b qwen2:1.5b | head -10
http://115.231.236.153:11434 qwen3:32b 437.90340097971585
http://5.9.30.115:11434 qwen3:32b 324.61733680023445
http://37.61.222.13:11434 qwen3:32b 276.64437864869393
http://34.148.168.193:11434 qwen3:32b 240.73307651286777
http://112.126.86.23:11434 qwen3:32b 205.56099158334925
http://209.15.123.48:11434 qwen3:32b 193.49076298520075
http://112.166.52.96:54321 qwen3:32b 190.02732783001525

I can also index these lists easily with a stack-based invocation. Here I am getting indexes 2 and 3 from the first server list and index 1 from the second server list:

$ free-ollama qwen3:32b 2 3  qwen2:1.5b 1
http://37.61.222.13:11434 qwen3:32b 276.64437864869393
http://34.148.168.193:11434 qwen3:32b 240.73307651286777
http://20.81.150.186:11434 qwen2:1.5b 203.38983050847457
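
To make the stack-based syntax concrete, here is a simplified sketch of the grouping step alone: numbers are pushed onto an index list, and each new model name flushes the previous model together with the indexes collected for it. The `parse_stack` name and the `all` placeholder are assumptions for illustration; the real script hands each group to jq instead of printing it.

```bash
#!/bin/bash
# Hypothetical helper: prints one "model indexes" line per group.
# The body runs in a subshell ( ... ) so model/indexes stay local.
parse_stack() (
  model=
  indexes=
  for arg in "$@"; do
    case $arg in
      [0-9]*)
        # A number: push it onto the current index list.
        indexes="${indexes:+$indexes,}$arg"
        ;;
      *)
        # A new model name: flush the previous group, then start a new one.
        if [ -n "$model" ]; then
          echo "$model ${indexes:-all}"
          indexes=
        fi
        model=$arg
        ;;
    esac
  done
  # Flush the last group.
  if [ -n "$model" ]; then
    echo "$model ${indexes:-all}"
  fi
)

parse_stack qwen3:32b 2 3 qwen2:1.5b 1
```

The call at the end prints `qwen3:32b 2,3` and then `qwen2:1.5b 1`, matching the grouping described above.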

server pool

Here I create a server pool in Redis, sorted by tps, for these models:

$ free-ollama qwen3:32b qwen2:1.5b | sort -k3,3nr | xargs -d '\n' -I {} redis-cli rpush server-pool "{}"
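
The order matters here: `sort -k3,3nr` puts the highest tps first, `rpush` appends in that order, so a later `lpop` takes the fastest server from the head of the list. A small sketch of just the sorting step, using three lines copied from the listing above in shuffled order (the `fastest_first` name is made up):

```bash
#!/bin/bash
# Three "ip model tps" lines from the listing above, deliberately out of order.
servers='http://37.61.222.13:11434 qwen3:32b 276.64437864869393
http://115.231.236.153:11434 qwen3:32b 437.90340097971585
http://5.9.30.115:11434 qwen3:32b 324.61733680023445'

# -k3,3 limits the sort key to the third field; n compares numerically, r reverses.
fastest_first() {
  sort -k3,3nr
}

# Print only the server column, fastest first.
printf '%s\n' "$servers" | fastest_first | cut -d ' ' -f 1
```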

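So now I can have a simple script:

server=$(redis-cli lpop server-pool)

while true; do
  work=$(redis-cli rpop work-queue)
  [[ -z "$work" ]] && break    # work queue is drained
  # run_job is a placeholder for whatever sends $work to $server
  until run_job "$server" "$work"; do
    # the server failed: take the next one from the pool
    server=$(redis-cli lpop server-pool)
    [[ -z "$server" ]] && exit 1    # pool is exhausted
  done
done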

So now you can have a cheap autoscaler.
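
To try the failover logic without Redis or live servers, here is a sketch that swaps the two Redis lists for bash arrays. Everything in it is an assumption for illustration: the `server-1`/`server-2` names, the `run_job` stub that pretends the first server is down, and the `next_server` and `drain_queue` helpers.

```bash
#!/bin/bash
# In-memory stand-ins for the redis lists (real use would call redis-cli).
pool=("server-1" "server-2")
queue=("job-a" "job-b")

# Hypothetical job runner: pretend server-1 is down and every other server works.
run_job() {
  [[ $1 != "server-1" ]] && echo "$2 done on $1"
}

# Pop the head of the pool into $server; fail when the pool is empty.
next_server() {
  (( ${#pool[@]} > 0 )) || return 1
  server=${pool[0]}
  pool=("${pool[@]:1}")
}

# Run every queued job, replacing the server whenever a job fails on it.
drain_queue() {
  local work
  next_server || return 1
  for work in "${queue[@]}"; do
    # Keep the same job and swap servers until one succeeds.
    until run_job "$server" "$work"; do
      next_server || return 1
    done
  done
}

drain_queue
```

Here `job-a` fails on `server-1`, the loop moves on to `server-2`, and both jobs finish there. If the pool runs dry, `drain_queue` returns non-zero instead of spinning.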