Use GitHub Copilot as a provider for claude-code-router, giving you access to Claude, GPT, Gemini, and other models available through your Copilot subscription.
- All Copilot models — Claude Sonnet 4.5, Claude Opus 4.6, GPT-5.2, Gemini, etc.
- GPT Codex support — Automatically routes Codex models (gpt-5.2-codex, gpt-5.1-codex, etc.) through the OpenAI Responses API instead of Chat Completions
- Hot-swap safe — Handles model switching mid-conversation (e.g. Opus → Codex) by cleaning up Chat Completions artifacts that the Responses API rejects
- xhigh reasoning — Automatically forces `reasoning_effort: "xhigh"` for GPT-5.2, GPT-5.3, and GPT-5.1-codex-max models
- Streaming support — Full SSE stream translation between Responses API events and Chat Completions chunks, including reasoning/thinking deltas
- Auto token refresh — Copilot tokens refresh automatically in the background
- Custom router — Fixes longContext routing for comma-prefixed models, adds thinking token counting, and enables dynamic subagent model inheritance for all unprefixed Claude models
- Context-aware statusline — Custom statusline script with per-model context limits, progress bar, and current task display. Automatically adjusts when switching models (e.g. 128K → 272K when upgrading to Codex)
npm install -g @musistudio/claude-code-router
~/.claude-code-router/
├── config.json
├── auth/
│ ├── copilot-auth.js
│ └── copilot-initial-auth.js
└── plugins/
├── copilot-transformer.js
├── custom-router.js
└── gsd-statusline-ccr.js
Important: The copilot-transformer.js require path assumes auth files are in
../auth/copilot-auth.js relative to the plugins directory. Adjust if your layout differs.
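For example, the top of copilot-transformer.js would contain something like this (the binding name here is illustrative):

```js
// Path is relative to plugins/; change it if your auth files live elsewhere.
const copilotAuth = require("../auth/copilot-auth.js");
```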
node ~/.claude-code-router/auth/copilot-initial-auth.js
This will prompt you to authenticate via GitHub device flow. Tokens are stored
in ~/.copilot-tokens.json (overridable via $COPILOT_TOKEN_FILE).
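For example, to keep the token file somewhere else (the path shown is just an example):

```
COPILOT_TOKEN_FILE=~/secrets/copilot-tokens.json node ~/.claude-code-router/auth/copilot-initial-auth.js
```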
ccr start
The built-in CCR router has three issues when used with comma-prefixed Copilot models
(e.g. copilot,gpt-5.2). The custom router fixes all three:
- Comma bypass — The built-in `getUseModel()` short-circuits on `model.includes(",")`, skipping the longContext threshold check entirely. The custom router always checks the threshold regardless of comma prefix.
- Thinking token undercount — CCR's `calculateTokenCount()` skips `type: "thinking"` content blocks, undercounting context by 30-60% during extended thinking sessions. The custom router supplements CCR's token count with an estimate of thinking block tokens, using a char/3.5 ratio that is sufficient for threshold decisions at ~127K (see the sketch after this list).
- Dynamic subagent model inheritance — Claude Code sends various unprefixed Claude models for subagents (haiku, opus, sonnet). The built-in router only maps haiku → the static `Router.background` model and lets others fall through unrouted. The custom router catches ALL unprefixed Claude models and dynamically inherits the current main model, so if you switch to GPT-5.2 via `/model`, subagents also use GPT-5.2 (or the longContext model if context is high enough).
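A minimal sketch of that thinking-token supplement, assuming Anthropic-style content blocks (the function name is illustrative):

```js
// Rough thinking-token estimate: characters / 3.5. Precision matters little here;
// the result only feeds the ~127K longContext threshold decision.
function estimateThinkingTokens(messages) {
  let chars = 0;
  for (const msg of messages || []) {
    if (!Array.isArray(msg.content)) continue;
    for (const block of msg.content) {
      if (block.type === "thinking" && typeof block.thinking === "string") {
        chars += block.thinking.length;
      }
    }
  }
  return Math.ceil(chars / 3.5);
}
```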
The custom router is loaded via CUSTOM_ROUTER_PATH in config.json and runs before
the built-in getUseModel(). When it returns a model string, the built-in router is
skipped entirely.
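Its overall shape is roughly the following (a simplified sketch, not the actual custom-router.js; `estimateTokens` stands in for CCR's count plus the thinking supplement above):

```js
let currentMainModel = null; // module state persists across requests

// Stand-in for the real counting (CCR's tokenCount + thinking estimate).
function estimateTokens(body) {
  return Math.ceil(JSON.stringify(body.messages || []).length / 3.5);
}

module.exports = async function router(req, config) {
  const model = req.body?.model || "";
  const threshold = config?.Router?.longContextThreshold ?? 60000;

  if (model.includes(",")) {
    // Main conversation: remember the model, then still apply the threshold check.
    currentMainModel = model;
    return estimateTokens(req.body) > threshold ? config.Router.longContext : model;
  }
  if (model.startsWith("claude") && currentMainModel) {
    return currentMainModel; // subagent inherits the current main model
  }
  return null; // fall through to the built-in getUseModel()
};
```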
The router also writes its final routed model to a shared state file in /tmp so the
statusline script can display the correct model and context limits in real time.
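In sketch form (the file path and fields match the changelog notes below):

```js
const fs = require("fs");

function writeRouterState(finalModel) {
  // Written after the longContext check, so it records the final routed model.
  fs.writeFileSync(
    "/tmp/ccr-router-state.json",
    JSON.stringify({ model: finalModel, timestamp: Date.now() })
  );
}
```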
A CCR statusline script module that shows a context usage progress bar, current task, and GSD update indicator. It's designed for Copilot's per-model API limits rather than Claude Code's internal 200K window.
Key features:
- Per-model context limits — Resolves the input token limit dynamically based on the current model (e.g. 128K for GPT-5.2, 272K for Codex). Limits are configurable via `modelContextLimits` in the script module options.
- Router state integration — Reads the final routed model from the custom router's shared state file, so the context bar recalculates correctly when the router upgrades to a long-context model (e.g. 129K tokens / 272K codex = 47% instead of 100%).
- Cache persistence — Caches the last known context percentage to a temp file so the bar doesn't flicker to 0% while the model is thinking (Claude Code sends `contextPercent=0` mid-stream).
- Todo task display — Shows the currently in-progress todo task from Claude Code's session-specific todo files.
The statusline is configured as a script module in config.json's StatusLine section.
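The limit resolution at the core of the script looks roughly like this (a sketch; the real script also merges `modelContextLimits` overrides from the module options):

```js
// Ordered rules: first match wins. Values are this setup's Copilot limits.
const DEFAULT_LIMITS = [
  { match: /codex/i, limit: 272000 }, // e.g. gpt-5.2-codex
  { match: /./, limit: 128000 },      // fallback, e.g. gpt-5.2
];

function resolveContextLimit(model, limits = DEFAULT_LIMITS) {
  const rule = limits.find(({ match }) => match.test(model || ""));
  return rule ? rule.limit : 128000;
}
```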
GPT Codex models (gpt-5-codex, gpt-5.1-codex, gpt-5.1-codex-mini,
gpt-5.1-codex-max, gpt-5.2-codex) only support OpenAI's Responses API — they return 400 errors
on /chat/completions.
The transformer automatically:
- Detects Codex models (any model name containing "codex")
- Rewrites the endpoint from `/chat/completions` to `/responses`
- Converts the request body (messages → input, tool format changes, system → instructions)
- Translates Responses API streaming events back into Chat Completions chunks
- Converts non-streaming Responses back to Chat Completions JSON format
All other models continue to use /chat/completions as normal.
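The body conversion amounts to something like this (a trimmed sketch; the real transformer also remaps tool calls, tool results, and content blocks):

```js
function toResponsesBody(body) {
  const system = body.messages.filter((m) => m.role === "system");
  return {
    model: body.model,
    // Chat Completions system messages become top-level instructions.
    instructions: system.map((m) => m.content).join("\n"),
    input: body.messages.filter((m) => m.role !== "system"),
    // Responses API tools are flat, not nested under a `function` key.
    tools: (body.tools || []).map((t) => ({
      type: "function",
      name: t.function.name,
      description: t.function.description,
      parameters: t.function.parameters,
    })),
    stream: body.stream,
  };
}
```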
When you switch models mid-conversation (e.g. from Claude Opus to GPT Codex via the
longContext router), messages from the prior model may contain fields that the Responses
API rejects. The transformer handles this by:
- Shortening call IDs — The Responses API enforces a 64-character limit on `call_id`. Claude Code can produce very long tool call IDs. The transformer deterministically shortens any ID over 64 characters using a `call_` prefix plus a SHA-256 hash, applied consistently to both `function_call` and `function_call_output` entries so they stay matched (sketch below).
- Stripping incompatible fields — Fields like `thinking`, `annotations`, `logprobs`, and `cache_control` that are valid in Chat Completions but rejected by the Responses API are removed from all input messages.
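The ID shortening is deterministic, so both halves of a call/output pair map to the same short ID. Roughly:

```js
const crypto = require("crypto");

function shortenCallId(id) {
  if (!id || id.length <= 64) return id; // already within the Responses API limit
  const hash = crypto.createHash("sha256").update(id).digest("hex").slice(0, 16);
  return `call_${hash}`; // same long input always yields the same short ID
}
```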
The SSE stream handler translates Responses API events to Chat Completions chunks:
| Responses API Event | Chat Completions Chunk |
|---|---|
| `response.output_text.delta` | `delta.content` |
| `response.reasoning_summary_text.delta` | `delta.thinking.content` |
| `response.function_call_arguments.delta` | `delta.tool_calls` |
| `response.output_item.added` | (captures function call metadata for correlation) |
| `response.completed` | `finish_reason: "stop"` or `"tool_use"` |
Function call streaming uses output_index (not item_id) to correlate argument deltas
with their parent function call, since item_id can be obfuscated differently across events.
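Sketched out, the correlation works like this (field names follow the Responses API streaming events):

```js
const callsByIndex = new Map();

function handleEvent(event) {
  if (event.type === "response.output_item.added" && event.item?.type === "function_call") {
    // call_id and name are only reliably available here.
    callsByIndex.set(event.output_index, {
      id: event.item.call_id,
      name: event.item.name,
      args: "",
    });
  } else if (event.type === "response.function_call_arguments.delta") {
    const call = callsByIndex.get(event.output_index);
    if (call) call.args += event.delta; // re-emit as a delta.tool_calls chunk
  }
}
```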
For models that support extended reasoning (GPT-5.2, GPT-5.3,
GPT-5.1-codex-max), the transformer automatically injects reasoning_effort: "xhigh". For
Codex models routed through the Responses API, this is mapped to the reasoning.effort field.
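In sketch form:

```js
function applyReasoningEffort(body, isResponsesApi) {
  if (isResponsesApi) {
    body.reasoning = { ...(body.reasoning || {}), effort: "xhigh" }; // reasoning.effort
  } else {
    body.reasoning_effort = "xhigh"; // Chat Completions field
  }
  return body;
}
```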
Do NOT add model-specific transformer entries in config.json (e.g.
"gpt-5.2-codex": { "use": [...] }). The framework's model-specific transformer chain doesn't
properly unwrap the { body, config } return format, causing the model to show as undefined
and messages to be empty. Only use the provider-level "use" array:
"transformer": {
"use": ["copilot-transformer"]
}Other tips:
- Add any models you want to the
modelsarray in the provider config. - The
Routersection lets you assign different models to different scenarios (default, background, longContext, etc.). longContextis a great fit for Codex models since they support 1M+ token context windows.CUSTOM_ROUTER_PATHshould point to your custom-router.js. Use~or an absolute path.
- Works with any GitHub account that has Copilot access (personal/business)
- Tokens are stored in a file (default: `~/.copilot-tokens.json`)
- The Copilot token endpoint is extracted from the token response, so it works across different Copilot deployments
- Debug logging can be enabled by setting `const DEBUG = true` in copilot-transformer.js (logs to `~/.claude-code-router/logs/transformer-debug.log`)
- Custom router debug logging is controlled by `const DEBUG = true` in custom-router.js (logs to `~/.claude-code-router/logs/router-debug.log`)
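For reference, the Router-related parts of config.json might look like this (model choices are examples; `longContextThreshold` is CCR's token threshold for switching to the longContext model):

```json
{
  "CUSTOM_ROUTER_PATH": "~/.claude-code-router/plugins/custom-router.js",
  "Router": {
    "default": "copilot,claude-sonnet-4.5",
    "background": "copilot,gpt-5.2",
    "longContext": "copilot,gpt-5.2-codex",
    "longContextThreshold": 100000
  }
}
```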
Problem: Three issues discovered in v3:
- Claude Code doesn't only send haiku for subagents — it also sends unprefixed opus and sonnet models (e.g. `claude-opus-4-6`, `claude-sonnet-4-5-20250929`). The v3 router's `isHaikuModel()` check missed these, causing them to fall through as PASSTHROUGH instead of inheriting the main model.
- CCR's built-in `{{model}}` statusline variable doesn't update until after the first API response, and the `/model` command is client-side (generates no proxy traffic). The statusline had no way to show the correct model immediately.
- When the main session crossed the longContext threshold and switched to Codex (272K limit), the context bar still calculated against the base model's 128K limit — showing 100% instead of ~47%.
Changes:
- `isHaikuModel()` → `isUnprefixedClaudeModel()` — Broadened detection to catch ANY model starting with "claude" that lacks a comma prefix. This covers opus, sonnet, haiku, and any future Claude variants Claude Code might send for subagents.
- Router state file — The custom router writes a shared state file (`/tmp/ccr-router-state.json`) containing the current model and timestamp on every main request. The statusline script reads this file to display the correct model name immediately, with a 30-minute expiry for stale sessions.
- State file writes `finalModel` — Moved the state file write to AFTER the longContext check, so it contains the final routed model (e.g. `copilot,gpt-5.2-codex`) rather than the base model. The statusline's `resolveContextLimit()` then matches "codex" → 272K and the context bar shows the correct percentage.
- `gsd-statusline-ccr.js` — Added statusline script module with per-model context limits, router state integration, context cache persistence, and todo task display.
Problem: Three issues with the built-in CCR router when using comma-prefixed Copilot models:
- The built-in `getUseModel()` short-circuits on `model.includes(",")`, so models like `copilot,gpt-5.2` never check the `longContextThreshold` — the longContext model (e.g. gpt-5.2-codex) is never activated, causing API errors when context exceeds the model's limit.
- `calculateTokenCount()` skips `type: "thinking"` content blocks, undercounting by 30-60% during extended thinking. The router sees 60% utilization when reality is 95%.
- Subagents always send `claude-3-5-haiku`, which maps to the static `Router.background` model. If you switch to GPT-5.2 via `/model`, subagents still use the background model instead of inheriting your choice.
Changes:
- custom-router.js added — A `CUSTOM_ROUTER_PATH` script that runs before the built-in router. Handles comma-prefixed models (main conversation), unprefixed Claude models (subagents), and falls through for anything else.
- Comma bypass fix — Always checks `longContextThreshold` regardless of comma in model name.
- Thinking token supplement — Counts `type: "thinking"` content blocks using char/3.5 estimation and adds to CCR's existing `tokenCount` (which already covers text, tool_use, tool_result, system, and tools). The additive approach avoids recounting everything.
- Dynamic subagent inheritance — Tracks `currentMainModel` in module state. When an unprefixed Claude model is detected (subagent), inherits the last main model. When the user switches models via `/model`, the next subagent automatically uses the new model.
- config.json updated — Added `CUSTOM_ROUTER_PATH` pointing to `custom-router.js`.
Problem: When Claude Code switches models mid-conversation (e.g. Opus → Codex via
the longContext router), the Codex model would fail with 400 errors because the
conversation history contained artifacts from the prior Chat Completions model that the
Responses API rejects.
Changes:
- `shortenCallId()` method added — The Responses API enforces a 64-character max on `call_id` fields. Tool call IDs generated by Claude Code can be much longer. Added deterministic shortening via SHA-256 hash (`call_` + 16-char hex), applied to both `function_call` and `function_call_output` entries so pairs stay matched.
- Strip leftover Chat Completions fields — Messages from a prior model in the same conversation may carry `thinking`, `annotations`, `logprobs`, or `cache_control` fields. These are now stripped before sending to the Responses API.
- `output_item.added` tracking for streaming tool calls — The previous version relied on `call_id` being present in `response.function_call_arguments.delta` events. In practice, the `call_id` and `name` are only reliably available in the `response.output_item.added` event. The transformer now captures this metadata keyed by `output_index` and uses it to correlate subsequent argument delta events.
- Reasoning/thinking stream support — Added handling for `response.reasoning_summary_text.delta` events so reasoning content from Codex models is streamed through as `delta.thinking.content` chunks.
- Debug logging — Added opt-in debug logging (`const DEBUG = true`) that writes to `logs/transformer-debug.log`. Covers request/response flow, SSE events, and emitted chunks. Off by default.
- Additional SSE events acknowledged — Events like `response.reasoning_summary_part.*`, `response.output_item.done`, `response.output_text.done`, and `response.function_call_arguments.done` are now explicitly handled instead of being passed through as unrecognized.