Gist by @jmanhype · Created December 26, 2025 02:51

3090 Local AI Stack Setup

Deterministic + LLM-augmented software generation for RTX 3090 (24GB)

Overview

This stack combines Sean Chatman's deterministic tools with local LLM inference on your RTX 3090.

┌─────────────────────────────────────────────────────┐
│                  RTX 3090 (24GB)                    │
│   Ollama + qwen2.5-coder:32b-instruct-q4_K_M       │
└─────────────────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────┐
│              DETERMINISTIC LAYER                    │
│   spec-kit  │  ggen  │  gitvan  │  claude-flow     │
└─────────────────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────┐
│          VALIDATED OUTPUT WITH PROVENANCE           │
└─────────────────────────────────────────────────────┘

Components

| Tool | Purpose | Language |
|------|---------|----------|
| Ollama | Local LLM inference server | Go |
| qwen2.5-coder | Code generation model (32B Q4) | - |
| spec-kit | Spec-driven development workflow | Python |
| ggen | Ontology → deterministic codegen | Rust |
| gitvan | Git-native workflow automation | Node.js |
| claude-flow | Multi-agent swarm orchestration | Node.js |

Quick Install

curl -fsSL https://gist.githubusercontent.com/YOUR_USERNAME/GIST_ID/raw/setup-3090-stack.sh | bash

Or step by step:

# 1. Ollama (LLM server)
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen2.5-coder:32b-instruct-q4_K_M

# 2. Spec-Kit (spec-driven development)
uv tool install specify-cli --from git+https://github.com/github/spec-kit.git

# 3. ggen (deterministic codegen) - choose one:
brew install seanchatmangpt/ggen/ggen          # macOS
cargo install ggen-cli-lib                      # Any platform

# 4. gitvan (workflow automation)
npm install -g gitvan

# 5. claude-flow (optional, multi-agent)
npx claude-flow@alpha init --force

Model Options for 24GB VRAM

| Model | VRAM | Speed | Quality | Command |
|-------|------|-------|---------|---------|
| qwen2.5-coder:32b-q4 | ~18GB | Medium | ⭐⭐⭐⭐⭐ | ollama pull qwen2.5-coder:32b-instruct-q4_K_M |
| deepseek-coder-v2:16b | ~12GB | Fast | ⭐⭐⭐⭐ | ollama pull deepseek-coder-v2:16b |
| codestral:22b-q5 | ~16GB | Medium | ⭐⭐⭐⭐ | ollama pull codestral:22b-v0.1-q5_K_M |
| qwen2.5-coder:14b | ~10GB | Fast | ⭐⭐⭐ | ollama pull qwen2.5-coder:14b-instruct |

Recommendation: Start with qwen2.5-coder:32b for best quality. Drop to deepseek-coder-v2:16b if you need faster iteration.
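The VRAM figures in the table follow a simple rule of thumb: weight footprint is roughly parameter count × bits-per-weight / 8, plus overhead for the KV cache and runtime. A small sketch for sanity-checking model choices (the ~4.5 bits/weight average for Q4_K_M and the flat 2GB overhead are approximations, not exact values, and KV cache grows with context length):

```python
def approx_vram_gb(params_billions: float, bits_per_weight: float,
                   overhead_gb: float = 2.0) -> float:
    """Rough VRAM estimate for a quantized model: weights + fixed overhead.

    Q4_K_M averages roughly 4.5 bits/weight; Q5_K_M roughly 5.5 (approximate).
    Ignores context-length-dependent KV cache growth.
    """
    weights_gb = params_billions * bits_per_weight / 8
    return weights_gb + overhead_gb

# 32B at ~4.5 bits/weight: 18GB weights + 2GB overhead = 20GB -> tight fit in 24GB
print(approx_vram_gb(32, 4.5))
# 16B at the same quantization: ~11GB -> plenty of headroom
print(approx_vram_gb(16, 4.5))
```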

Usage

Start LLM Server

ollama serve
# API available at http://localhost:11434
# OpenAI-compatible: http://localhost:11434/v1
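Once the server is up, any OpenAI-compatible client can talk to it. A minimal sketch using only the Python standard library (the endpoint and model name come from this setup; adjust if you pulled a different model):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/v1/chat/completions"
MODEL = "qwen2.5-coder:32b-instruct-q4_K_M"

def build_request(prompt: str) -> dict:
    # OpenAI-compatible chat-completions payload
    return {"model": MODEL, "messages": [{"role": "user", "content": prompt}]}

def chat(prompt: str) -> str:
    # Requires `ollama serve` to be running locally
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example (needs a running server):
#   print(chat("Write a Python function that reverses a string."))
```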

Spec-Driven Development

# Initialize project
specify init my-project --ai claude
cd my-project

# Follow the workflow
/speckit.constitution  # Set project principles
/speckit.specify "Build a REST API for user management"
/speckit.clarify       # Optional: refine requirements
/speckit.plan "Use FastAPI with PostgreSQL"
/speckit.tasks         # Generate task breakdown
/speckit.implement     # Execute implementation

Deterministic Code Generation (ggen)

# Initialize ggen in project
ggen init

# Create ontology (schema/domain.ttl)
# Create templates (templates/*.tera)
# Generate code
ggen sync
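ggen drives generation from an RDF ontology plus Tera templates. As a purely illustrative sketch of the kind of content schema/domain.ttl might hold (the `ex:` prefix and class/property names here are hypothetical, not ggen's required vocabulary — consult the ggen docs for its actual schema conventions):

```turtle
@prefix ex:   <http://example.org/domain#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

ex:User a rdfs:Class ;
    rdfs:label "User" .

ex:email a rdf:Property ;
    rdfs:domain ex:User ;
    rdfs:range  xsd:string .
```

A template under templates/ would then iterate over these classes and properties to emit code, which is what makes the output deterministic: same ontology in, same code out.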

Multi-Agent Swarm (claude-flow)

# Quick task
npx claude-flow@alpha swarm "build REST API" --claude

# Complex project with hive-mind
npx claude-flow@alpha hive-mind wizard
npx claude-flow@alpha hive-mind spawn "enterprise system" --claude

Workflow Automation (gitvan)

# Initialize in project
gitvan workflow init

# Create workflow (.gitvan/workflows/build.ttl)
gitvan workflow list
gitvan workflow run BuildAndTest

# Install as git hook
gitvan hook install pre-commit LintOnCommit

Integration with BLACKICE

This stack integrates with the BLACKICE dispatcher:

from integrations.dispatcher import dispatch

# Optimization → ai-factory
result = dispatch("Optimize delivery routes for 10 stops")

# Feature spec → speckit
result = dispatch("Add user authentication with OAuth")

# Code generation → LLM (your 3090)
result = dispatch("Generate unit tests for UserService")
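The routing behavior shown above can be imitated with a keyword-based sketch (this illustrates the routing idea only; it is not BLACKICE's actual implementation, and the keyword lists are assumptions):

```python
def route(task: str) -> str:
    """Pick a backend for a task by keyword; hypothetical stand-in for dispatch()."""
    t = task.lower()
    if any(k in t for k in ("optimize", "minimize", "schedule")):
        return "ai-factory"   # deterministic optimization
    if any(k in t for k in ("spec", "feature", "add ", "requirement")):
        return "speckit"      # spec-driven workflow
    return "llm"              # fall through to the local model on the 3090

print(route("Optimize delivery routes for 10 stops"))  # ai-factory
print(route("Add user authentication with OAuth"))     # speckit
print(route("Generate unit tests for UserService"))    # llm
```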

Environment Variables

# Add to ~/.bashrc or ~/.zshrc
export OLLAMA_HOST=localhost:11434
export OLLAMA_MODEL=qwen2.5-coder:32b-instruct-q4_K_M

# For vLLM (alternative to Ollama)
export VLLM_HOST=localhost
export VLLM_PORT=8000
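Client code can then pick these variables up with sensible defaults. A small sketch (the names match the exports above; the defaults are fallbacks for an unconfigured shell):

```python
import os

def ollama_base_url() -> str:
    # Falls back to the default Ollama address when OLLAMA_HOST is unset
    host = os.environ.get("OLLAMA_HOST", "localhost:11434")
    return f"http://{host}/v1"

def default_model() -> str:
    return os.environ.get("OLLAMA_MODEL", "qwen2.5-coder:32b-instruct-q4_K_M")

print(ollama_base_url())
```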

Troubleshooting

CUDA out of memory

# Use smaller quantization
ollama pull qwen2.5-coder:32b-instruct-q3_K_M

# Or smaller model
ollama pull deepseek-coder-v2:16b

Slow generation

# Check GPU utilization
nvidia-smi

# Ensure CUDA is being used
ollama ps

spec-kit not finding commands

# Reinstall
uv tool uninstall specify-cli
uv tool install specify-cli --from git+https://github.com/github/spec-kit.git

# Verify
specify check

Links

License

MIT

#!/usr/bin/env bash
#
# 3090 Local AI Stack Setup
# Deterministic + LLM-augmented software generation
#
# Usage: curl -fsSL <gist-url>/setup-3090-stack.sh | bash
#
set -euo pipefail
echo "╔═══════════════════════════════════════════════════════════════╗"
echo "║                   3090 LOCAL AI STACK SETUP                   ║"
echo "║       Deterministic + LLM-augmented software generation       ║"
echo "╚═══════════════════════════════════════════════════════════════╝"
echo ""
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
success() { echo -e "${GREEN}✓${NC} $1"; }
info() { echo -e "${BLUE}→${NC} $1"; }
warn() { echo -e "${YELLOW}!${NC} $1"; }
error() { echo -e "${RED}✗${NC} $1"; }
# Check prerequisites
check_prereqs() {
    echo "Checking prerequisites..."

    # Check for NVIDIA GPU
    if command -v nvidia-smi &> /dev/null; then
        GPU_MEM=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -1)
        success "NVIDIA GPU detected (${GPU_MEM}MB VRAM)"
    else
        warn "nvidia-smi not found. GPU acceleration may not work."
    fi

    # Check for curl
    if ! command -v curl &> /dev/null; then
        error "curl is required but not installed"
        exit 1
    fi
    success "curl found"

    # Check for Node.js
    if command -v node &> /dev/null; then
        NODE_VER=$(node --version)
        success "Node.js found ($NODE_VER)"
    else
        warn "Node.js not found. gitvan and claude-flow require Node.js 18+"
    fi

    # Check for Python/uv
    if command -v uv &> /dev/null; then
        success "uv found"
    else
        warn "uv not found. Installing..."
        curl -LsSf https://astral.sh/uv/install.sh | sh
        # Newer uv installers place the binary in ~/.local/bin; older ones used ~/.cargo/bin
        export PATH="$HOME/.local/bin:$HOME/.cargo/bin:$PATH"
        success "uv installed"
    fi

    # Check for Rust/Cargo
    if command -v cargo &> /dev/null; then
        CARGO_VER=$(cargo --version)
        success "Cargo found ($CARGO_VER)"
    else
        warn "Cargo not found. ggen installation via cargo will be skipped."
    fi
    echo ""
}
# Install Ollama
install_ollama() {
    echo "Installing Ollama..."
    if command -v ollama &> /dev/null; then
        success "Ollama already installed"
    else
        info "Downloading Ollama..."
        curl -fsSL https://ollama.com/install.sh | sh
        success "Ollama installed"
    fi
    echo ""
}
# Pull recommended models
pull_models() {
    echo "Pulling recommended models for 24GB VRAM..."
    info "Pulling qwen2.5-coder:32b (primary model, ~18GB)..."
    ollama pull qwen2.5-coder:32b-instruct-q4_K_M || warn "Failed to pull qwen2.5-coder:32b"
    info "Pulling deepseek-coder-v2:16b (fast alternative, ~12GB)..."
    ollama pull deepseek-coder-v2:16b || warn "Failed to pull deepseek-coder-v2:16b"
    success "Models pulled"
    echo ""
}
# Install spec-kit
install_speckit() {
    echo "Installing spec-kit..."
    if command -v specify &> /dev/null; then
        success "spec-kit already installed"
    else
        info "Installing via uv..."
        uv tool install specify-cli --from git+https://github.com/github/spec-kit.git
        success "spec-kit installed"
    fi
    echo ""
}
# Install ggen
install_ggen() {
    echo "Installing ggen..."
    if command -v ggen &> /dev/null; then
        success "ggen already installed"
        echo ""
        return
    fi

    # Try Homebrew first (fastest on macOS)
    if command -v brew &> /dev/null; then
        info "Installing via Homebrew..."
        if brew install seanchatmangpt/ggen/ggen; then
            success "ggen installed via Homebrew"
            echo ""
            return
        fi
    fi

    # Fall back to Cargo
    if command -v cargo &> /dev/null; then
        info "Installing via Cargo (this may take a few minutes)..."
        if cargo install ggen-cli-lib; then
            success "ggen installed via Cargo"
            echo ""
            return
        fi
    fi

    warn "Could not install ggen. Install manually: cargo install ggen-cli-lib"
    echo ""
}
# Install gitvan
install_gitvan() {
    echo "Installing gitvan..."
    if command -v gitvan &> /dev/null; then
        success "gitvan already installed"
    elif command -v npm &> /dev/null; then
        info "Installing via npm..."
        npm install -g gitvan
        success "gitvan installed"
    else
        warn "npm not found. Skipping gitvan installation."
    fi
    echo ""
}
# Install claude-flow
install_claudeflow() {
    echo "Setting up claude-flow..."
    if command -v npx &> /dev/null; then
        info "Initializing claude-flow..."
        npx claude-flow@alpha --version || true
        success "claude-flow ready (use: npx claude-flow@alpha)"
    else
        warn "npx not found. Skipping claude-flow setup."
    fi
    echo ""
}
# Configure environment
configure_env() {
    echo "Configuring environment..."
    SHELL_RC=""
    if [ -f "$HOME/.zshrc" ]; then
        SHELL_RC="$HOME/.zshrc"
    elif [ -f "$HOME/.bashrc" ]; then
        SHELL_RC="$HOME/.bashrc"
    fi
    if [ -n "$SHELL_RC" ]; then
        # Check if already configured
        if ! grep -q "OLLAMA_HOST" "$SHELL_RC"; then
            {
                echo ""
                echo "# 3090 AI Stack Configuration"
                echo "export OLLAMA_HOST=localhost:11434"
                echo "export OLLAMA_MODEL=qwen2.5-coder:32b-instruct-q4_K_M"
            } >> "$SHELL_RC"
            success "Environment variables added to $SHELL_RC"
        else
            success "Environment already configured"
        fi
    else
        warn "Could not find shell rc file. Add these to your shell config:"
        echo "    export OLLAMA_HOST=localhost:11434"
        echo "    export OLLAMA_MODEL=qwen2.5-coder:32b-instruct-q4_K_M"
    fi
    echo ""
}
# Print summary
print_summary() {
    echo "╔═══════════════════════════════════════════════════════════════╗"
    echo "║                     INSTALLATION COMPLETE                     ║"
    echo "╚═══════════════════════════════════════════════════════════════╝"
    echo ""
    echo "Installed components:"
    command -v ollama  &> /dev/null && success "Ollama (LLM server)"          || warn "Ollama"
    command -v specify &> /dev/null && success "spec-kit (spec-driven dev)"   || warn "spec-kit"
    command -v ggen    &> /dev/null && success "ggen (deterministic codegen)" || warn "ggen"
    command -v gitvan  &> /dev/null && success "gitvan (workflow automation)" || warn "gitvan"
    command -v npx     &> /dev/null && success "claude-flow (multi-agent)"    || warn "claude-flow"
    echo ""
    echo "Quick start:"
    echo "  1. Start Ollama:     ollama serve"
    echo "  2. Test generation:  ollama run qwen2.5-coder:32b-instruct-q4_K_M"
    echo "  3. New project:      specify init my-project --ai claude"
    echo "  4. Multi-agent:      npx claude-flow@alpha swarm 'build REST API'"
    echo ""
    echo "Documentation: https://gist.github.com/jmanhype/YOUR_GIST_ID"
    echo ""
}
# Main
main() {
    check_prereqs
    install_ollama
    pull_models
    install_speckit
    install_ggen
    install_gitvan
    install_claudeflow
    configure_env
    print_summary
}

# Run
main "$@"