Skip to content

Instantly share code, notes, and snippets.

@j-greig
Created February 23, 2026 09:51
Show Gist options
  • Select an option

  • Save j-greig/d4168c8f1da24932ced2a09e016153ed to your computer and use it in GitHub Desktop.

Select an option

Save j-greig/d4168c8f1da24932ced2a09e016153ed to your computer and use it in GitHub Desktop.
codex-runner
name description
codex-runner
Use when delegating a task to the OpenAI Codex CLI as a background subagent. Best for multi-file debugging, root-cause analysis, cross-file refactors, code review, feature implementation, or architecture review — especially when stuck after 2+ fix attempts or when a fresh agent with isolated context would help.

Codex Runner

Delegate a task to Codex CLI as a background subagent. Logs everything — reasoning included.

Examples

1. Via Claude — easiest, use this if possible:

/codex-runner analyse the race condition in app/engine.cpp

Claude reads the skill and generates the right command for you. No bash needed.


2. Script directly — analysis (read-only):

scripts/run.sh "diagnose the race condition in app/engine.cpp"

3. Script directly — implementation + model override:

CODEX_MODEL=gpt-5.3-codex scripts/run.sh --impl "refactor the tick loop in app/engine.cpp"

--impl lets Codex write files. Default model is gpt-5.3-codex; CODEX_MODEL overrides it.

Implementation prompts must start with this line to stop Codex getting confused by its own log files: DEVNOTE: Files inside .codex-logs/ are run logs — ignore them completely.


4. Raw — no script needed (portability fallback):

codex exec -C "$(git rev-parse --show-toplevel)" "YOUR TASK" 2>&1 | tee codex.log &

Use this if as a direct approach that doesn't rely on your repo's scripts. You must specify the repo root with -C for correct relative paths.

Check on a running task

tail -f .codex-logs/YYYY-MM-DD/codex-*.log   # watch live
ls .codex-logs/$(date +%F)/                  # find today's logs
codex exec resume --last                      # resume if it died

Progress signals in the log: thinking = reasoning, exec = running commands, tokens used = finished.

Override defaults

CODEX_MODEL=gpt-5.3-codex scripts/run.sh "..."    # specific model override
CODEX_LOG_DIR=~/logs scripts/run.sh "..."         # redirect logs
CODEX_REPO=/other/repo scripts/run.sh "..."       # different repo

Token efficiency

Codex tends to dump entire files (nl -ba file | sed -n '1,300p') which bloats logs. Add this to prompts:

EFFICIENCY: Use targeted commands, not full-file dumps.
Prefer rg -n -C3 'pattern' file or sed -n '45,55p' file.
Avoid nl -ba file | sed -n '1,300p'. Only dump full files if <50 lines.

Common mistakes

Problem Fix
Codex blocks mid-run Always use --impl for background writes (sets -a never)
Lost output Log path is printed at start — always teed
Codex edits its own logs Add DEVNOTE to top of implementation prompt
Codex dumps entire files Add EFFICIENCY directive to prompt
Wrong repo Set CODEX_REPO or run from the right directory
@j-greig
Copy link
Author

j-greig commented Feb 23, 2026

Save the SKILL.md file above to a folder named 'codex-runner' and put the 2 scripts below in a /scripts subdirectory inside it.

scripts/eval.sh

#!/usr/bin/env bash
# eval.sh — smoke-test codex-runner via claude -p slash-command invocation
#
# Usage:
#   ./eval.sh                    # all evals
#   ./eval.sh --eval devnote     # one eval by name
#
# Requires: claude CLI, skill symlinked at .claude/skills/codex-runner

set -uo pipefail

REPO_ROOT=$(git rev-parse --show-toplevel 2>/dev/null || echo "$PWD")
SKILL_DIR="$(cd "$(dirname "$0")/.." && pwd)"
ONLY=""

while [[ $# -gt 0 ]]; do
  case "$1" in
    --eval) ONLY="${2:-}"; shift ;;
    *) ;;
  esac
  shift
done

pass=0; fail=0

run_eval() {
  local name="$1" prompt="$2" must_contain="$3"
  [[ -n "$ONLY" && "$ONLY" != "$name" ]] && return 0
  printf "  %-32s" "$name ..."
  local output
  output=$(cd "$REPO_ROOT" && claude -p \
    --permission-mode dontAsk \
    --tools "" \
    --output-format text \
    "/codex-runner $prompt" 2>&1) || true
  if echo "$output" | grep -qi "$must_contain"; then
    echo "PASS"; pass=$((pass + 1))
  else
    echo "FAIL"
    echo "    Expected: '$must_contain'"
    echo "$output" | head -8 | sed 's/^/    /'
    fail=$((fail + 1))
  fi
  return 0
}

echo ""
echo "codex-runner eval — $(date '+%Y-%m-%d %H:%M')"
echo "Skill: $SKILL_DIR/SKILL.md"
echo ""

# 1. Basic invocation — does Claude call the script?
run_eval "uses-script" \
  "analyse app/wibwob_engine.cpp for race conditions" \
  "run.sh"

# 2. Implementation mode — --impl flag present?
run_eval "impl-flag" \
  "implement a fix for the race condition in app/wibwob_engine.cpp" \
  "\-\-impl"

# 3. DEVNOTE — implementation prompts must include the boilerplate
run_eval "devnote" \
  "write me an implementation prompt for refactoring app/wibwob_engine.cpp" \
  "DEVNOTE"

# 4. Monitoring — tail -f mentioned?
run_eval "monitoring" \
  "how do I check on a run that's already in the background?" \
  "tail"

# 5. Model override — env var pattern shown?
run_eval "model-override" \
  "how do I run with a specific model?" \
  "CODEX_MODEL"

echo ""
echo "Results: $pass passed, $fail failed"
echo ""
[[ $fail -eq 0 ]]

scripts/run.sh

#!/usr/bin/env bash
# run.sh — run a Codex task as a background subagent with log capture
#
# Usage:
#   scripts/run.sh "diagnose the race condition in app/engine.cpp"
#   scripts/run.sh --impl "refactor the tick loop in app/engine.cpp"
#   scripts/run.sh --impl --topic auth-fix "fix the login bug"
#
# Env var overrides (all optional):
#   CODEX_LOG_DIR   where logs go (default: <repo-root>/.codex-logs)
#   CODEX_REPO      repo codex operates in (default: git repo root or $PWD)
#   CODEX_MODEL     model override, e.g. gpt-5.1-codex-mini (default: gpt-5.3-codex)

set -uo pipefail

IMPL=false
TOPIC=""
PROMPT=""

while [[ $# -gt 0 ]]; do
  case "$1" in
    --impl)  IMPL=true; shift ;;
    --topic) TOPIC="$2"; shift 2 ;;
    -h|--help)
      sed -n '2,12p' "$0" | sed 's/^# \{0,1\}//'
      exit 0 ;;
    -*) echo "Unknown flag: $1" >&2; exit 2 ;;
    *)  PROMPT="$1"; shift ;;
  esac
done

if [[ -z "$PROMPT" ]]; then
  echo "Error: prompt required" >&2
  echo "Usage: $0 [--impl] [--topic SLUG] \"your task description\"" >&2
  exit 2
fi

if ! command -v codex &>/dev/null; then
  echo "Error: codex CLI not found — install from https://github.com/openai/codex" >&2
  exit 1
fi

# Resolve defaults
_REPO_ROOT=$(git rev-parse --show-toplevel 2>/dev/null || echo "$PWD")
CODEX_REPO="${CODEX_REPO:-$_REPO_ROOT}"
CODEX_LOG_DIR="${CODEX_LOG_DIR:-$_REPO_ROOT/.codex-logs}"
CODEX_MODEL="${CODEX_MODEL:-gpt-5.3-codex}"

# Auto-slug topic from prompt if not set
if [[ -z "$TOPIC" ]]; then
  TOPIC=$(echo "$PROMPT" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '-' | cut -c1-30 | sed 's/-$//')
fi

LOG="$CODEX_LOG_DIR/$(date +%F)/codex-${TOPIC}-$(date +%Y%m%d-%H%M%S).log"
mkdir -p "$(dirname "$LOG")"

# Build codex command
CMD=(codex -m "$CODEX_MODEL")
if $IMPL; then
  CMD+=(-s workspace-write -a never)
fi
CMD+=(exec -C "$CODEX_REPO" "$PROMPT")

# Run in background with full log capture
"${CMD[@]}" 2>&1 | tee "$LOG" &
PID=$!

echo "Log:     $LOG"
echo "PID:     $PID"
echo "Monitor: tail -f $LOG"
if $IMPL; then
  echo "Mode:    implementation (workspace-write)"
else
  echo "Mode:    analysis (read-only)"
fi

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment