Copilot support for Claude-Code-Router with GPT Codex (Responses API), hot-swap fixes, and xhigh reasoning

claude-code-router Copilot Provider

Use GitHub Copilot as a provider for claude-code-router, giving you access to Claude, GPT, Gemini, and other models available through your Copilot subscription.

Features

  • All Copilot models — Claude Sonnet 4.5, Claude Opus 4.6, GPT-5.2, Gemini, etc.
  • GPT Codex support — Automatically routes Codex models (gpt-5.2-codex, gpt-5.1-codex, etc.) through the OpenAI Responses API instead of Chat Completions
  • Hot-swap safe — Handles model switching mid-conversation (e.g. Opus → Codex) by cleaning up Chat Completions artifacts that the Responses API rejects
  • xhigh reasoning — Automatically forces reasoning_effort: "xhigh" for GPT-5.2, GPT-5.3, and GPT-5.1-codex-max models
  • Streaming support — Full SSE stream translation between Responses API events and Chat Completions chunks, including reasoning/thinking deltas
  • Auto token refresh — Copilot tokens refresh automatically in the background
  • Custom router — Fixes longContext routing for comma-prefixed models, adds thinking token counting, and enables dynamic subagent model inheritance for all unprefixed Claude models
  • Context-aware statusline — Custom statusline script with per-model context limits, progress bar, and current task display. Automatically adjusts when switching models (e.g. 128K → 272K when upgrading to Codex)

Setup

1. Install claude-code-router

npm install -g @musistudio/claude-code-router

2. Place the files

~/.claude-code-router/
├── config.json
├── auth/
│   ├── copilot-auth.js
│   └── copilot-initial-auth.js
└── plugins/
    ├── copilot-transformer.js
    ├── custom-router.js
    └── gsd-statusline-ccr.js

Important: copilot-transformer.js requires the auth helper via ../auth/copilot-auth.js, relative to the plugins directory. Adjust the require path if your layout differs.

3. Authenticate

node ~/.claude-code-router/auth/copilot-initial-auth.js

This will prompt you to authenticate via GitHub device flow. Tokens are stored in ~/.copilot-tokens.json (overridable via $COPILOT_TOKEN_FILE).
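
The token file written by these scripts has the following shape (placeholder values; the endpoint host is taken from the token response):

{
  "githubToken": "gho_************",
  "copilotToken": "<copilot-session-token>",
  "endpoint": "https://<copilot-api-host>/chat/completions",
  "expiresAt": 1760000000,
  "lastUpdated": "2026-02-09T00:00:00.000Z"
}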

4. Start the router

ccr start

How It Works

Custom Router (custom-router.js)

The built-in CCR router has three issues when used with comma-prefixed Copilot models (e.g. copilot,gpt-5.2). The custom router fixes all three:

  1. Comma bypass — The built-in getUseModel() short-circuits on model.includes(","), skipping the longContext threshold check entirely. The custom router always checks the threshold regardless of comma prefix.

  2. Thinking token undercount — CCR's calculateTokenCount() skips type: "thinking" content blocks, undercounting context by 30-60% during extended thinking sessions. The custom router supplements CCR's token count with an estimate of thinking block tokens (using a char/3.5 ratio — sufficient for threshold decisions at ~127K).

  3. Dynamic subagent model inheritance — Claude Code sends various unprefixed Claude models for subagents (haiku, opus, sonnet). The built-in router only maps haiku → the static Router.background model and lets others fall through unrouted. The custom router catches ALL unprefixed Claude models and dynamically inherits the current main model, so if you switch to GPT-5.2 via /model, subagents also use GPT-5.2 (or the longContext model if context is high enough).
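
Condensed from custom-router.js (included below), the core routing decision looks like:

let currentMainModel = null; // persists for the life of the CCR process

if (model.includes(",")) {
  // Main conversation request — remember the user's /model choice
  currentMainModel = model;
  baseModel = model;
} else if (model.toLowerCase().startsWith("claude")) {
  // Subagent request — inherit the main model (Router.default on a fresh process)
  baseModel = currentMainModel || Router.default;
}
// ...then the longContext threshold check runs on baseModel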

The custom router is loaded via CUSTOM_ROUTER_PATH in config.json and runs before the built-in getUseModel(). When it returns a model string, the built-in router is skipped entirely.

The router also writes its final routed model to a shared state file in /tmp so the statusline script can display the correct model and context limits in real time.
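
The state file is a single JSON object, e.g.:

{ "model": "copilot,gpt-5.2-codex", "ts": 1760000000000 }

The statusline treats entries older than 30 minutes as stale and ignores them.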

Statusline (gsd-statusline-ccr.js)

A CCR statusline script module that shows a context usage progress bar, current task, and GSD update indicator. It's designed for Copilot's per-model API limits rather than Claude Code's internal 200K window.

Key features:

  • Per-model context limits — Resolves the input token limit dynamically based on the current model (e.g. 128K for GPT-5.2, 272K for Codex). Limits are configurable via modelContextLimits in the script module options.
  • Router state integration — Reads the final routed model from the custom router's shared state file, so the context bar recalculates correctly when the router upgrades to a long-context model (e.g. 129K tokens / 272K codex = 47% instead of 100%).
  • Cache persistence — Caches the last known context percentage to a temp file so the bar doesn't flicker to 0% while the model is thinking (Claude Code sends contextPercent=0 mid-stream).
  • Todo task display — Shows the currently in-progress todo task from Claude Code's session-specific todo files.

The statusline is configured as a script module in config.json's StatusLine section.
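
Condensed from the full config.json below, the script module entry looks like:

{
  "type": "script",
  "scriptPath": "~/.claude-code-router/plugins/gsd-statusline-ccr.js",
  "options": {
    "modelContextLimits": { "codex": 272000, "default": 128000 },
    "showTodos": true,
    "showGsdUpdate": true,
    "showContextBar": true
  }
}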

Codex / Responses API

GPT Codex models (gpt-5-codex, gpt-5.1-codex, gpt-5.1-codex-mini, gpt-5.1-codex-max, gpt-5.2-codex) only support OpenAI's Responses API — they return 400 errors on /chat/completions.

The transformer automatically:

  1. Detects Codex models (any model name containing "codex")
  2. Rewrites the endpoint from /chat/completions to /responses
  3. Converts the request body (messages → input, tool format changes, system → instructions)
  4. Translates Responses API streaming events back into Chat Completions chunks
  5. Converts non-streaming Responses back to Chat Completions JSON format

All other models continue to use /chat/completions as normal.
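
As a rough sketch of the conversion (simplified — the actual code also normalizes content parts, tools, and call IDs), a Chat Completions request like:

{
  "model": "gpt-5.2-codex",
  "messages": [
    { "role": "system", "content": "You are a coding assistant." },
    { "role": "user", "content": "Hi" }
  ]
}

becomes roughly:

{
  "model": "gpt-5.2-codex",
  "instructions": "You are a coding assistant.",
  "input": [ { "role": "user", "content": "Hi" } ],
  "reasoning": { "effort": "xhigh", "summary": "detailed" },
  "parallel_tool_calls": false
}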

Hot-Swap / Mid-Conversation Model Switching

When you switch models mid-conversation (e.g. from Claude Opus to GPT Codex via the longContext router), messages from the prior model may contain fields that the Responses API rejects. The transformer handles this by:

  1. Shortening call IDs — The Responses API enforces a 64-character limit on call_id. Claude Code can produce very long tool call IDs. The transformer deterministically shortens any ID over 64 characters using a call_ prefix plus a SHA-256 hash, applied consistently to both function_call and function_call_output entries so they stay matched.
  2. Stripping incompatible fields — Fields like thinking, annotations, logprobs, and cache_control that are valid in Chat Completions but rejected by the Responses API are removed from all input messages.
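
The ID shortening is deterministic, so a function_call and its matching function_call_output always map to the same shortened ID (from copilot-transformer.js below):

shortenCallId(id) {
  if (!id || typeof id !== "string") return id;
  if (id.length <= 64) return id;
  const hash = crypto.createHash("sha256").update(id).digest("hex").slice(0, 16);
  return `call_${hash}`;
}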

Streaming Event Translation

The SSE stream handler translates Responses API events to Chat Completions chunks:

Responses API Event                       Chat Completions Chunk
response.output_text.delta                delta.content
response.reasoning_summary_text.delta     delta.thinking.content
response.function_call_arguments.delta    delta.tool_calls
response.output_item.added                (captures function call metadata for correlation)
response.completed                        finish_reason: "stop" or "tool_calls"

Function call streaming uses output_index (not item_id) to correlate argument deltas with their parent function call, since item_id can be obfuscated differently across events.
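
For example, a text delta event:

data: {"type":"response.output_text.delta","delta":"Hello","output_index":0}

is re-emitted as (ids and timestamps vary):

data: {"id":"resp_abc","object":"chat.completion.chunk","created":1760000000,"model":"gpt-5.2-codex","choices":[{"index":0,"delta":{"content":"Hello"},"logprobs":null,"finish_reason":null}]}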

xhigh Reasoning

For models that support extended reasoning (GPT-5.2, GPT-5.3, GPT-5.1-codex-max), the transformer automatically injects reasoning_effort: "xhigh". For Codex models routed through the Responses API, this is mapped to the reasoning.effort field.
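
Detection is substring-based, so gpt-5.2-codex matches both the Codex route and the xhigh rule (from the transformer below):

shouldForceXHighReasoning(model) {
  if (!model || typeof model !== "string") return false;
  const normalized = model.toLowerCase();
  return (
    normalized.includes("5.2") ||
    normalized.includes("5.3") ||
    normalized.includes("5.1-codex-max")
  );
}
// Chat Completions: body.reasoning_effort = "xhigh"
// Responses API:    body.reasoning = { effort: "xhigh", summary: "detailed" }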

Config Notes

Do NOT add model-specific transformer entries in config.json (e.g. "gpt-5.2-codex": { "use": [...] }). The framework's model-specific transformer chain doesn't properly unwrap the { body, config } return format, causing the model to show as undefined and messages to be empty. Only use the provider-level "use" array:

"transformer": {
  "use": ["copilot-transformer"]
}

Other tips:

  • Add any models you want to the models array in the provider config.
  • The Router section lets you assign different models to different scenarios (default, background, longContext, etc.).
  • longContext is a natural fit for Codex models, which have a much larger input limit (272K vs 128K for the other Copilot models used here).
  • CUSTOM_ROUTER_PATH should point to your custom-router.js. Use ~ or an absolute path.
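
For reference, the routing-related entries from the config below:

"Router": {
  "default": "copilot,claude-opus-4.6",
  "background": "copilot,claude-opus-4.6",
  "think": "copilot,claude-opus-4.6",
  "longContext": "copilot,gpt-5.2-codex",
  "longContextThreshold": 127000,
  "webSearch": "copilot,claude-sonnet-4.5"
},
"CUSTOM_ROUTER_PATH": "~/.claude-code-router/plugins/custom-router.js"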

Notes

  • Works with any GitHub account that has Copilot access (personal/business)
  • Tokens are stored in a file (default: ~/.copilot-tokens.json)
  • The Copilot token endpoint is extracted from the token response, so it works across different Copilot deployments
  • Debug logging can be enabled by setting const DEBUG = true in copilot-transformer.js (logs to ~/.claude-code-router/logs/transformer-debug.log)
  • Custom router debug logging is controlled by const DEBUG = true in custom-router.js (logs to ~/.claude-code-router/logs/router-debug.log)

Revisions

v4 — 2026-02-09: Broader subagent detection, statusline state sharing, context bar fix

Problem: Three issues discovered in v3:

  1. Claude Code doesn't only send haiku for subagents — it also sends unprefixed opus and sonnet models (e.g. claude-opus-4-6, claude-sonnet-4-5-20250929). The v3 router's isHaikuModel() check missed these, causing them to fall through as PASSTHROUGH instead of inheriting the main model.
  2. CCR's built-in {{model}} statusline variable doesn't update until after the first API response, and the /model command is client-side (generates no proxy traffic). The statusline had no way to show the correct model immediately.
  3. When the main session crossed the longContext threshold and switched to Codex (272K limit), the context bar still calculated against the base model's 128K limit — showing 100% instead of ~47%.

Changes:

  1. isHaikuModel() → isUnprefixedClaudeModel() — Broadened detection to catch ANY model starting with "claude" that lacks a comma prefix. This covers opus, sonnet, haiku, and any future Claude variants Claude Code might send for subagents.
  2. Router state file — The custom router writes a shared state file (/tmp/ccr-router-state.json) containing the current model and timestamp on every main request. The statusline script reads this file to display the correct model name immediately, with a 30-minute expiry for stale sessions.
  3. State file writes finalModel — Moved the state file write to AFTER the longContext check, so it contains the final routed model (e.g. copilot,gpt-5.2-codex) rather than the base model. The statusline's resolveContextLimit() then matches "codex" → 272K and the context bar shows the correct percentage.
  4. gsd-statusline-ccr.js — Added statusline script module with per-model context limits, router state integration, context cache persistence, and todo task display.

v3 — 2026-02-09: Custom router for longContext, thinking tokens, and subagent inheritance

Problem: Three issues with the built-in CCR router when using comma-prefixed Copilot models:

  1. The built-in getUseModel() short-circuits on model.includes(","), so models like copilot,gpt-5.2 never check the longContextThreshold — the longContext model (e.g. gpt-5.2-codex) is never activated, causing API errors when context exceeds the model's limit.
  2. calculateTokenCount() skips type: "thinking" content blocks, undercounting by 30-60% during extended thinking. The router sees 60% utilization when reality is 95%.
  3. Subagents always send claude-3-5-haiku which maps to the static Router.background model. If you switch to GPT-5.2 via /model, subagents still use the background model instead of inheriting your choice.

Changes:

  1. custom-router.js added — A CUSTOM_ROUTER_PATH script that runs before the built-in router. Handles comma-prefixed models (main conversation), unprefixed Claude models (subagents), and falls through for anything else.
  2. Comma bypass fix — Always checks longContextThreshold regardless of comma in model name.
  3. Thinking token supplement — Counts type: "thinking" content blocks using char/3.5 estimation and adds to CCR's existing tokenCount (which already covers text, tool_use, tool_result, system, and tools). The additive approach avoids recounting everything.
  4. Dynamic subagent inheritance — Tracks currentMainModel in module state. When an unprefixed Claude model is detected (subagent), inherits the last main model. When the user switches models via /model, the next subagent automatically uses the new model.
  5. config.json updated — Added CUSTOM_ROUTER_PATH pointing to custom-router.js.

v2 — 2026-02-09: Codex hot-swap fixes

Problem: When Claude Code switches models mid-conversation (e.g. Opus → Codex via the longContext router), the Codex model would fail with 400 errors because the conversation history contained artifacts from the prior Chat Completions model that the Responses API rejects.

Changes:

  1. shortenCallId() method added — The Responses API enforces a 64-character max on call_id fields. Tool call IDs generated by Claude Code can be much longer. Added deterministic shortening via SHA-256 hash (call_ + 16-char hex), applied to both function_call and function_call_output entries so pairs stay matched.
  2. Strip leftover Chat Completions fields — Messages from a prior model in the same conversation may carry thinking, annotations, logprobs, or cache_control fields. These are now stripped before sending to the Responses API.
  3. output_item.added tracking for streaming tool calls — The previous version relied on call_id being present in response.function_call_arguments.delta events. In practice, the call_id and name are only reliably available in the response.output_item.added event. The transformer now captures this metadata keyed by output_index and uses it to correlate subsequent argument delta events.
  4. Reasoning/thinking stream support — Added handling for response.reasoning_summary_text.delta events so reasoning content from Codex models is streamed through as delta.thinking.content chunks.
  5. Debug logging — Added opt-in debug logging (const DEBUG = true) that writes to logs/transformer-debug.log. Covers request/response flow, SSE events, and emitted chunks. Off by default.
  6. Additional SSE events acknowledged — Events like response.reasoning_summary_part.*, response.output_item.done, response.output_text.done, and response.function_call_arguments.done are now explicitly handled instead of being passed through as unrecognized.
{
  "LOG": false,
  "LOG_LEVEL": "debug",
  "CLAUDE_PATH": "",
  "HOST": "127.0.0.1",
  "PORT": 3456,
  "APIKEY": "",
  "API_TIMEOUT_MS": "600000",
  "PROXY_URL": "",
  "Plugins": [
    {
      "name": "token-speed",
      "enabled": true
    }
  ],
  "transformers": [
    {
      "path": "~/.claude-code-router/plugins/copilot-transformer.js"
    }
  ],
  "Providers": [
    {
      "name": "copilot",
      "api_base_url": "populated-by-transformer",
      "api_key": "populated-by-transformer",
      "models": [
        "claude-sonnet-4.5",
        "gpt-5.2",
        "gpt-5.2-codex",
        "claude-opus-4.6"
      ],
      "transformer": {
        "use": [
          "copilot-transformer"
        ]
      }
    }
  ],
  "StatusLine": {
    "enabled": true,
    "currentStyle": "default",
    "default": {
      "modules": [
        {
          "type": "workDir",
          "icon": "󰉋",
          "text": "{{workDirName}}",
          "color": "bright_blue"
        },
        {
          "type": "gitBranch",
          "icon": "",
          "text": "{{gitBranch}}",
          "color": "bright_green"
        },
        {
          "type": "model",
          "icon": "󰚩",
          "text": "{{model}}",
          "color": "bright_yellow"
        },
        {
          "type": "usage",
          "icon": "",
          "text": "󱦲{{inputTokens}}",
          "color": "bright_magenta"
        },
        {
          "type": "usage",
          "icon": "",
          "text": "󱦳{{outputTokens}}",
          "color": "bright_magenta"
        },
        {
          "type": "speed",
          "icon": "",
          "text": "󱐋{{tokenSpeed}}",
          "color": "bright_green"
        },
        {
          "type": "script",
          "icon": "",
          "text": "",
          "color": "bright_cyan",
          "scriptPath": "~/.claude-code-router/plugins/gsd-statusline-ccr.js",
          "options": {
            "debug": false,
            "modelContextLimits": {
              "codex": 272000,
              "opus": 128000,
              "gpt-5": 128000,
              "sonnet": 128000,
              "default": 128000
            },
            "showTodos": true,
            "showGsdUpdate": true,
            "showContextBar": true
          }
        }
      ]
    },
    "powerline": {
      "modules": []
    },
    "fontFamily": "Hack Nerd Font Mono"
  },
  "Router": {
    "default": "copilot,claude-opus-4.6",
    "background": "copilot,claude-opus-4.6",
    "think": "copilot,claude-opus-4.6",
    "longContext": "copilot,gpt-5.2-codex",
    "longContextThreshold": 127000,
    "webSearch": "copilot,claude-sonnet-4.5",
    "image": ""
  },
  "CUSTOM_ROUTER_PATH": "~/.claude-code-router/plugins/custom-router.js"
}
// ~/.claude-code-router/auth/copilot-auth.js
const fs = require("fs");
const path = require("path");

class GitHubCopilotAuth {
  constructor() {
    this.CLIENT_ID = "01ab8ac9400c4e429b23";
    this.DEVICE_CODE_URL = "https://github.com/login/device/code";
    this.ACCESS_TOKEN_URL = "https://github.com/login/oauth/access_token";
    this.COPILOT_API_KEY_URL =
      "https://api.github.com/copilot_internal/v2/token";
    this.TOKEN_FILE_PATH = process.env.COPILOT_TOKEN_FILE || path.join(
      process.env.HOME || process.env.USERPROFILE,
      ".copilot-tokens.json"
    );
  }

  async startDeviceFlow() {
    const response = await fetch(this.DEVICE_CODE_URL, {
      method: "POST",
      headers: {
        Accept: "application/json",
        "Content-Type": "application/json",
        "User-Agent": "GitHubCopilotChat/0.26.7",
      },
      body: JSON.stringify({
        client_id: this.CLIENT_ID,
        scope: "read:user",
      }),
    });
    const data = await response.json();
    return {
      deviceCode: data.device_code,
      userCode: data.user_code,
      verificationUri: data.verification_uri,
      interval: data.interval || 5,
      expiresIn: data.expires_in,
    };
  }

  async pollForToken(deviceCode) {
    const response = await fetch(this.ACCESS_TOKEN_URL, {
      method: "POST",
      headers: {
        Accept: "application/json",
        "Content-Type": "application/json",
        "User-Agent": "GitHubCopilotChat/0.26.7",
      },
      body: JSON.stringify({
        client_id: this.CLIENT_ID,
        device_code: deviceCode,
        grant_type: "urn:ietf:params:oauth:grant-type:device_code",
      }),
    });
    const data = await response.json();
    if (data.access_token) {
      return { success: true, accessToken: data.access_token };
    }
    if (data.error === "authorization_pending") {
      return { pending: true };
    }
    return { error: data.error };
  }

  async getCopilotToken(githubAccessToken) {
    const response = await fetch(this.COPILOT_API_KEY_URL, {
      headers: {
        Accept: "application/json",
        Authorization: `Bearer ${githubAccessToken}`,
        "User-Agent": "GitHubCopilotChat/0.26.7",
        "Editor-Version": "vscode/1.99.3",
        "Editor-Plugin-Version": "copilot-chat/0.26.7",
      },
    });
    if (!response.ok) {
      throw new Error(`Failed to get Copilot token: ${response.statusText}`);
    }
    const tokenData = await response.json();
    return {
      token: tokenData.token,
      expiresAt: tokenData.expires_at,
      endpoint:
        tokenData.endpoints?.api ||
        "https://copilot-proxy.githubusercontent.com/v1/engines/copilot-codex/completions",
    };
  }

  isTokenExpired(bufferMinutes = 5) {
    try {
      const tokenFile = this.TOKEN_FILE_PATH;
      if (!fs.existsSync(tokenFile)) {
        return true;
      }
      const data = JSON.parse(fs.readFileSync(tokenFile, "utf8"));
      if (!data.expiresAt) {
        return true;
      }
      const now = Math.floor(Date.now() / 1000);
      const bufferTime = bufferMinutes * 60;
      return now >= data.expiresAt - bufferTime;
    } catch (error) {
      console.error("Error checking token expiration:", error);
      return true;
    }
  }

  getTokenFromFile() {
    const tokenFile = this.TOKEN_FILE_PATH;
    if (!fs.existsSync(tokenFile)) {
      return;
    }
    const data = JSON.parse(fs.readFileSync(tokenFile, "utf8"));
    return data;
  }

  updateTokenFile(tokenData) {
    try {
      const tokenFile = this.TOKEN_FILE_PATH;
      fs.writeFileSync(tokenFile, JSON.stringify(tokenData, null, 2));
    } catch (error) {
      console.error("Error updating token file:", error);
    }
  }

  async refreshCopilotToken() {
    try {
      const existingData = this.getTokenFromFile();
      if (!existingData || !existingData.githubToken) {
        throw new Error("No GitHub token found. Please re-authenticate.");
      }
      console.log("Refreshing Copilot token...");
      const copilotToken = await this.getCopilotToken(existingData.githubToken);
      const tokenData = {
        githubToken: existingData.githubToken,
        copilotToken: copilotToken.token,
        endpoint: `${copilotToken.endpoint}/chat/completions`,
        expiresAt: copilotToken.expiresAt,
        lastUpdated: new Date().toISOString(),
      };
      this.updateTokenFile(tokenData);
      console.log("Copilot token refreshed successfully!");
      return tokenData;
    } catch (error) {
      throw new Error(`Failed to refresh Copilot token: ${error.message}`);
    }
  }
}

module.exports = GitHubCopilotAuth;
// ~/.claude-code-router/auth/copilot-initial-auth.js
const GitHubCopilotAuth = require("./copilot-auth");

const auth = new GitHubCopilotAuth();

async function setupCopilotAuth() {
  console.log("Setting up GitHub Copilot authentication...\n");
  const deviceFlow = await auth.startDeviceFlow();
  console.log("📱 Please visit:", deviceFlow.verificationUri);
  console.log("🔑 Enter this code:", deviceFlow.userCode);
  console.log("\nWaiting for authorization...\n");
  let attempts = 0;
  const maxAttempts = deviceFlow.expiresIn / deviceFlow.interval;
  while (attempts < maxAttempts) {
    await new Promise((resolve) =>
      setTimeout(resolve, deviceFlow.interval * 1000)
    );
    const result = await auth.pollForToken(deviceFlow.deviceCode);
    if (result.success) {
      console.log("GitHub OAuth successful!");
      console.log("Getting Copilot session token...");
      const copilotToken = await auth.getCopilotToken(result.accessToken);
      const tokenData = {
        githubToken: result.accessToken,
        copilotToken: copilotToken.token,
        endpoint: `${copilotToken.endpoint}/chat/completions`,
        expiresAt: copilotToken.expiresAt,
        lastUpdated: new Date().toISOString(),
      };
      auth.updateTokenFile(tokenData);
      console.log("🔧 Token file updated automatically!");
      console.log("Setup complete! Tokens saved.");
      return;
    }
    if (result.error) {
      console.error("Authentication failed:", result.error);
      return;
    }
    if (!auth.isTokenExpired()) {
      // Another process produced a valid token while we were waiting.
      console.log("Token file is valid now. Exiting wait loop.");
      return;
    }
    attempts++;
    process.stdout.write("⏳ ");
  }
  console.log("\n❌ Authentication timed out. Please try again.");
}

(async () => {
  try {
    if (!auth.isTokenExpired()) {
      console.log("Existing Copilot token is still valid. No action needed.");
      return;
    }
    // Try to refresh with stored credentials before falling back to
    // a fresh device-flow authentication.
    try {
      await auth.refreshCopilotToken();
      console.log("Copilot token refreshed.");
      return;
    } catch (refreshErr) {
      console.log("Refresh failed or no credentials. Starting device authentication...");
    }
    await setupCopilotAuth();
  } catch (err) {
    console.error("Unexpected error during Copilot authentication:", err);
    process.exit(1);
  }
})();
const crypto = require("crypto");
const fs = require("fs");
const path = require("path");
const GitHubCopilotAuth = require("../auth/copilot-auth.js");
const DEBUG = false;
const LOG_FILE = path.join(__dirname, "..", "logs", "transformer-debug.log");
function dbg(...args) {
if (!DEBUG) return;
const ts = new Date().toISOString();
const line = `[${ts}] ${args.map(a => typeof a === "string" ? a : JSON.stringify(a, null, 2)).join(" ")}\n`;
try {
fs.mkdirSync(path.dirname(LOG_FILE), { recursive: true });
fs.appendFileSync(LOG_FILE, line);
} catch (e) { /* silent */ }
}
class CopilotTransformer {
name = "copilot-transformer";
constructor() {
this.auth = new GitHubCopilotAuth();
this.VERSION = "0.26.7";
this.EDITOR_VERSION = "vscode/1.103.2";
this.API_VERSION = "2025-04-01";
}
loadToken() {
return this.auth.getTokenFromFile();
}
copilotHeaders({ vision, isAgent }) {
const headers = {
"Copilot-Integration-ID": "vscode-chat",
"Editor-Plugin-Version": `copilot-chat/${this.VERSION}`,
"Editor-Version": this.EDITOR_VERSION,
"User-Agent": `GitHubCopilotChat/${this.VERSION}`,
"OpenAI-Intent": "conversation-panel",
"x-github-api-version": this.API_VERSION,
"X-Initiator": isAgent ? "agent" : "user",
"x-request-id": crypto.randomUUID(),
"x-vscode-user-agent-library-version": "electron-fetch",
"Content-Type": "application/json",
};
if (vision) {
headers["copilot-vision-request"] = "true";
}
return headers;
}
isResponsesModel(model) {
if (!model || typeof model !== "string") {
return false;
}
const normalized = model.toLowerCase();
return normalized.includes("codex");
}
shouldForceXHighReasoning(model) {
if (!model || typeof model !== "string") {
return false;
}
const normalized = model.toLowerCase();
return (
normalized.includes("5.2") ||
normalized.includes("5.3") ||
normalized.includes("5.1-codex-max")
);
}
resolveEndpoint(endpoint, useResponses) {
if (!endpoint || typeof endpoint !== "string") {
return endpoint;
}
const trimmed = endpoint.replace(/\/+$/, "");
if (useResponses) {
if (trimmed.endsWith("/responses")) {
return trimmed;
}
if (trimmed.endsWith("/chat/completions")) {
return trimmed.replace(/\/chat\/completions$/, "/responses");
}
if (trimmed.endsWith("/completions")) {
return trimmed.replace(/\/completions$/, "/responses");
}
return `${trimmed}/responses`;
}
if (trimmed.endsWith("/chat/completions")) {
return trimmed;
}
if (trimmed.endsWith("/responses")) {
return trimmed.replace(/\/responses$/, "/chat/completions");
}
return `${trimmed}/chat/completions`;
}
shortenCallId(id) {
if (!id || typeof id !== "string") return id;
if (id.length <= 64) return id;
// Deterministic shortening: fixed "call_" prefix + SHA-256 hash suffix
const hash = crypto.createHash("sha256").update(id).digest("hex").slice(0, 16);
return `call_${hash}`;
}
normalizeRequestContent(content, role) {
const entry = { ...content };
delete entry.cache_control;
if (entry.type === "text") {
return {
type: role === "assistant" ? "output_text" : "input_text",
text: entry.text,
};
}
if (entry.type === "image_url") {
const normalized = {
type: role === "assistant" ? "output_image" : "input_image",
};
if (typeof entry.image_url?.url === "string") {
normalized.image_url = entry.image_url.url;
}
return normalized;
}
return null;
}
toResponsesRequest(request) {
const next = { ...request };
delete next.temperature;
delete next.max_tokens;
if (next.reasoning_effort) {
next.reasoning = {
...(next.reasoning || {}),
effort: next.reasoning_effort,
};
}
if (next.reasoning) {
next.reasoning = {
effort: next.reasoning.effort,
summary: "detailed",
};
}
delete next.reasoning_effort;
const input = [];
const systemMessages = (next.messages || []).filter(
(message) => message.role === "system"
);
if (systemMessages.length > 0) {
const firstSystem = systemMessages[0];
if (Array.isArray(firstSystem.content)) {
firstSystem.content.forEach((part) => {
let text = "";
if (typeof part === "string") {
text = part;
} else if (part && typeof part === "object" && "text" in part) {
text = part.text;
}
input.push({ role: "system", content: text });
});
} else {
next.instructions = firstSystem.content;
}
}
(next.messages || []).forEach((message) => {
if (message.role === "system") {
return;
}
const normalized = { ...message };
if (Array.isArray(normalized.content)) {
const parts = normalized.content
.map((part) => this.normalizeRequestContent(part, normalized.role))
.filter((part) => part !== null);
if (parts.length > 0) {
normalized.content = parts;
} else {
delete normalized.content;
}
}
if (normalized.role === "tool") {
const toolOutput = { ...normalized };
toolOutput.type = "function_call_output";
toolOutput.call_id = this.shortenCallId(normalized.tool_call_id);
toolOutput.output = normalized.content;
delete toolOutput.cache_control;
delete toolOutput.role;
delete toolOutput.tool_call_id;
delete toolOutput.content;
input.push(toolOutput);
return;
}
if (normalized.role === "assistant" && Array.isArray(normalized.tool_calls)) {
normalized.tool_calls.forEach((toolCall) => {
input.push({
type: "function_call",
arguments: toolCall.function.arguments,
name: toolCall.function.name,
call_id: this.shortenCallId(toolCall.id),
});
});
return;
}
// Strip fields that the Responses API rejects (e.g. leftover from
// a prior Chat Completions model in the same conversation)
delete normalized.thinking;
delete normalized.annotations;
delete normalized.logprobs;
delete normalized.cache_control;
input.push(normalized);
});
next.input = input;
delete next.messages;
if (Array.isArray(next.tools)) {
const webSearchTool = next.tools.find(
(tool) => tool.function.name === "web_search"
);
next.tools = next.tools
.filter((tool) => tool.function.name !== "web_search")
.map((tool) => {
if (tool.function.name === "WebSearch") {
delete tool.function.parameters.properties.allowed_domains;
}
if (tool.function.name === "Edit") {
return {
type: tool.type,
name: tool.function.name,
description: tool.function.description,
parameters: {
...tool.function.parameters,
required: [
"file_path",
"old_string",
"new_string",
"replace_all",
],
},
strict: true,
};
}
return {
type: tool.type,
name: tool.function.name,
description: tool.function.description,
parameters: tool.function.parameters,
};
});
if (webSearchTool) {
next.tools.push({ type: "web_search" });
}
}
next.parallel_tool_calls = false;
return next;
}
buildImageContent(data) {
if (!data || (!data.url && !data.b64_json)) {
return null;
}
return {
type: "image_url",
image_url: {
url: data.url || "",
b64_json: data.b64_json,
},
media_type: data.mime_type,
};
}
convertResponseToChat(response) {
const message = response.output?.find((item) => item.type === "message");
const functionCall = response.output?.find(
(item) => item.type === "function_call"
);
let annotations;
if (message?.content?.length && message.content[0].annotations) {
annotations = message.content[0].annotations.map((annotation) => ({
type: "url_citation",
url_citation: {
url: annotation.url || "",
title: annotation.title || "",
content: "",
start_index: annotation.start_index || 0,
end_index: annotation.end_index || 0,
},
}));
}
let content = null;
let toolCalls = null;
let thinking = null;
if (message?.reasoning) {
thinking = { content: message.reasoning };
}
if (message?.content) {
const textChunks = [];
const images = [];
message.content.forEach((part) => {
if (part.type === "output_text") {
textChunks.push(part.text || "");
return;
}
if (part.type === "output_image") {
const image = this.buildImageContent({
url: part.image_url,
mime_type: part.mime_type,
});
if (image) {
images.push(image);
}
return;
}
if (part.type === "output_image_base64") {
const image = this.buildImageContent({
b64_json: part.image_base64,
mime_type: part.mime_type,
});
if (image) {
images.push(image);
}
}
});
if (images.length > 0) {
const mixed = [];
if (textChunks.length > 0) {
mixed.push({ type: "text", text: textChunks.join("") });
}
mixed.push(...images);
content = mixed;
} else {
content = textChunks.join("");
}
}
if (functionCall) {
toolCalls = [
{
id: functionCall.call_id || functionCall.id,
function: {
name: functionCall.name,
arguments: functionCall.arguments,
},
type: "function",
},
];
}
return {
id: response.id || `chatcmpl-${Date.now()}`,
object: "chat.completion",
created: response.created_at,
model: response.model,
choices: [
{
index: 0,
message: {
role: "assistant",
content: content || null,
tool_calls: toolCalls,
thinking,
annotations,
},
logprobs: null,
finish_reason: toolCalls ? "tool_calls" : "stop",
},
],
usage: response.usage
? {
prompt_tokens: response.usage.input_tokens || 0,
completion_tokens: response.usage.output_tokens || 0,
total_tokens: response.usage.total_tokens || 0,
}
: null,
};
}
async transformRequestIn(request) {
if (this.auth.isTokenExpired()) {
try {
await this.auth.refreshCopilotToken();
} catch (error) {
throw new Error(
`Token refresh failed: ${error.message}.`
);
}
}
let tokenData = this.loadToken();
const messages = request.messages || [];
const vision = messages.some(
(m) =>
typeof m.content !== "string" &&
Array.isArray(m.content) &&
m.content.some((c) => c.type === "image_url")
);
const isAgent = messages.some((m) =>
["assistant", "tool"].includes(m.role)
);
const headers = this.copilotHeaders({ vision, isAgent });
headers.Authorization = `Bearer ${tokenData.copilotToken}`;
const model = request.model?.split(",").pop() || request.model;
const useResponses = this.isResponsesModel(model);
const endpoint = this.resolveEndpoint(tokenData.endpoint, useResponses);
dbg("=== REQUEST ===");
dbg("raw model:", request.model, "-> resolved:", model);
dbg("useResponses:", useResponses);
dbg("endpoint:", endpoint);
dbg("message count:", messages.length);
dbg("stream:", request.stream);
const body = {
...request,
model,
};
if (this.shouldForceXHighReasoning(model)) {
body.reasoning_effort = "xhigh";
dbg("forcing xhigh reasoning");
}
const finalBody = useResponses ? this.toResponsesRequest(body) : body;
// Log message roles and content types for debugging
if (useResponses && finalBody.input) {
dbg("input items:", finalBody.input.length);
finalBody.input.forEach((item, i) => {
const keys = Object.keys(item);
dbg(` input[${i}]:`, item.role || item.type || "unknown", "keys:", keys);
});
} else if (finalBody.messages) {
finalBody.messages.forEach((msg, i) => {
const contentType = typeof msg.content;
const extra = msg.thinking ? " +thinking" : "";
const ann = msg.annotations ? " +annotations" : "";
dbg(` msg[${i}]:`, msg.role, `content=${contentType}${extra}${ann}`);
});
}
dbg("final body keys:", Object.keys(finalBody));
return {
body: finalBody,
config: {
url: endpoint,
headers,
},
};
}
async transformResponseOut(response) {
const contentType = response.headers.get("Content-Type") || "";
dbg("=== RESPONSE ===");
dbg("status:", response.status, response.statusText);
dbg("content-type:", contentType);
if (contentType.includes("application/json")) {
const payload = await response.json();
dbg("JSON payload keys:", Object.keys(payload || {}));
if (payload?.error) {
dbg("ERROR from API:", payload.error);
}
if (payload?.object === "response" && payload?.output) {
const chatPayload = this.convertResponseToChat(payload);
return new Response(JSON.stringify(chatPayload), {
status: response.status,
statusText: response.statusText,
headers: response.headers,
});
}
return new Response(JSON.stringify(payload), {
status: response.status,
statusText: response.statusText,
headers: response.headers,
});
}
if (contentType.includes("text/event-stream")) {
if (!response.body) {
return response;
}
const decoder = new TextDecoder();
const encoder = new TextEncoder();
const state = {
id: null,
created: null,
model: null,
sentRole: false,
sawToolCall: false,
toolCallIndexById: new Map(),
toolCallCount: 0,
};
let buffer = "";
let sawDone = false;
let emitCount = 0;
const stream = new ReadableStream({
start: async (controller) => {
const reader = response.body.getReader();
const emitChunk = (delta, finishReason = null, usage = null) => {
emitCount++;
const chunk = {
id: state.id || `chatcmpl-${Date.now()}`,
object: "chat.completion.chunk",
created: state.created || Math.floor(Date.now() / 1000),
model: state.model || "copilot",
choices: [
{
index: 0,
delta: delta || {},
logprobs: null,
finish_reason: finishReason,
},
],
};
if (usage) {
chunk.usage = usage;
}
// Log first 5 emitted chunks, then every 20th, plus finish
if (emitCount <= 5 || emitCount % 20 === 0 || finishReason) {
const deltaKeys = Object.keys(delta || {});
const preview = delta?.content ? delta.content.substring(0, 50) : (delta?.tool_calls ? `tool_call[${delta.tool_calls[0]?.function?.name||'?'}]` : (delta?.thinking ? 'thinking...' : (delta?.role || '(empty)')));
dbg(`EMIT #${emitCount}: deltaKeys=${JSON.stringify(deltaKeys)} finish=${finishReason} preview=${preview}`);
}
controller.enqueue(
encoder.encode(`data: ${JSON.stringify(chunk)}\n\n`)
);
};
const emitRoleIfNeeded = () => {
if (!state.sentRole) {
emitChunk({ role: "assistant" });
state.sentRole = true;
}
};
const handleResponsesEvent = (payload) => {
if (payload.response) {
state.id = state.id || payload.response.id;
state.model = state.model || payload.response.model;
state.created = state.created || payload.response.created_at;
}
// --- Text output deltas ---
if (payload.type === "response.output_text.delta") {
if (payload.delta) {
emitRoleIfNeeded();
emitChunk({ content: payload.delta });
}
return true;
}
// --- Reasoning / thinking summary deltas ---
if (payload.type === "response.reasoning_summary_text.delta") {
if (payload.delta) {
emitRoleIfNeeded();
emitChunk({ thinking: { content: payload.delta } });
}
return true;
}
// --- Output item added (captures function call metadata) ---
if (payload.type === "response.output_item.added") {
const item = payload.item;
const outIdx = payload.output_index;
dbg("output_item.added type:", item?.type, "output_index:", outIdx);
if (item && item.type === "function_call") {
// Store call_id and name keyed by output_index
// (item_id is obfuscated and changes per-event, so
// it cannot be used for matching argument deltas).
const callId = item.call_id || item.id;
const name = item.name || "";
state.functionCallByOutputIndex = state.functionCallByOutputIndex || {};
state.functionCallByOutputIndex[outIdx] = { callId, name };
dbg("captured function_call info: output_index", outIdx, "->", callId, name);
}
return true;
}
// --- Function call argument deltas ---
if (payload.type === "response.function_call_arguments.delta") {
// Look up call_id and name by output_index (not item_id,
// which is obfuscated differently on every event).
const outIdx = payload.output_index;
const info = (state.functionCallByOutputIndex || {})[outIdx];
const callId = info?.callId || payload.call_id || `call_idx_${outIdx}`;
const name = info?.name || payload.name || "";
emitRoleIfNeeded();
let index = state.toolCallIndexById.get(callId);
if (index === undefined) {
index = state.toolCallCount;
state.toolCallCount += 1;
state.toolCallIndexById.set(callId, index);
}
state.sawToolCall = true;
emitChunk({
tool_calls: [
{
index,
id: callId,
type: "function",
function: {
name,
arguments: payload.delta || "",
},
},
],
});
return true;
}
// --- Completed (includes usage data) ---
if (payload.type === "response.completed") {
const respUsage = payload.response?.usage;
const mappedUsage = respUsage ? {
prompt_tokens: respUsage.input_tokens || 0,
completion_tokens: respUsage.output_tokens || 0,
total_tokens: respUsage.total_tokens || 0,
} : null;
emitChunk({}, state.sawToolCall ? "tool_calls" : "stop", mappedUsage);
return true;
}
// --- Events we acknowledge but don't need to emit ---
if (
payload.type === "response.created" ||
payload.type === "response.in_progress" ||
payload.type === "response.reasoning_summary_part.added" ||
payload.type === "response.reasoning_summary_part.done" ||
payload.type === "response.reasoning_summary_text.done" ||
payload.type === "response.output_item.done" ||
payload.type === "response.output_text.done" ||
payload.type === "response.function_call_arguments.done"
) {
if (payload.type === "response.function_call_arguments.done") {
dbg("func_call_args.done");
}
return true;
}
return false;
};
try {
while (true) {
const { done, value } = await reader.read();
if (done) {
dbg(`SSE: reader done, sawDone: ${sawDone}, totalEmitted: ${emitCount}, sawToolCall: ${state.sawToolCall}, model: ${state.model}`);
break;
}
buffer += decoder.decode(value, { stream: true });
const events = buffer.split(/\n\n/);
buffer = events.pop() || "";
for (const event of events) {
const lines = event.split(/\n/).filter((line) => line.length);
const dataLines = lines.filter((line) =>
line.startsWith("data:")
);
if (dataLines.length === 0) {
dbg("SSE: non-data event, lines:", lines.map(l => l.substring(0, 80)));
continue;
}
const data = dataLines
.map((line) => line.slice(5).trimStart())
.join("\n");
if (data === "[DONE]") {
sawDone = true;
controller.enqueue(encoder.encode("data: [DONE]\n\n"));
continue;
}
let payload;
try {
payload = JSON.parse(data);
} catch (error) {
dbg("SSE: JSON parse error:", error.message, "data:", data.substring(0, 200));
controller.enqueue(encoder.encode(`data: ${data}\n\n`));
continue;
}
dbg("SSE event:", payload?.type || payload?.object || "unknown", "keys:", Object.keys(payload || {}));
if (payload?.object === "chat.completion.chunk") {
controller.enqueue(
encoder.encode(`data: ${JSON.stringify(payload)}\n\n`)
);
continue;
}
if (payload?.object === "response") {
const chatPayload = this.convertResponseToChat(payload);
const content = chatPayload.choices[0].message.content;
emitRoleIfNeeded();
if (typeof content === "string" && content.length > 0) {
emitChunk({ content });
}
emitChunk({}, chatPayload.choices[0].finish_reason, chatPayload.usage);
continue;
}
if (
payload?.type &&
typeof payload.type === "string" &&
payload.type.startsWith("response.")
) {
if (!state.id && payload.id) {
state.id = payload.id;
}
if (!state.model && payload.model) {
state.model = payload.model;
}
if (!state.created && payload.created_at) {
state.created = payload.created_at;
}
const handled = handleResponsesEvent(payload);
if (!handled) {
controller.enqueue(
encoder.encode(`data: ${JSON.stringify(payload)}\n\n`)
);
}
continue;
}
controller.enqueue(
encoder.encode(`data: ${JSON.stringify(payload)}\n\n`)
);
}
}
if (!sawDone) {
controller.enqueue(encoder.encode("data: [DONE]\n\n"));
}
} catch (error) {
dbg("SSE: stream error:", error.message, error.stack);
controller.error(error);
} finally {
try {
reader.releaseLock();
} catch (error) {
console.error("Error releasing reader lock:", error);
}
controller.close();
}
},
});
return new Response(stream, {
status: response.status,
statusText: response.statusText,
headers: {
"Content-Type": "text/event-stream",
"Cache-Control": "no-cache",
Connection: "keep-alive",
"Access-Control-Allow-Origin": "*",
},
});
}
return response;
}
}
module.exports = CopilotTransformer;
/**
* Custom Router for Claude Code Router (CCR)
*
* Fixes three issues with the built-in router:
*
* 1. COMMA BYPASS: When model is "copilot,gpt-5.2", the built-in router
* short-circuits and never checks longContext threshold. This router
* always checks longContext regardless of comma prefix.
*
* 2. THINKING TOKEN UNDERCOUNT: The built-in calculateTokenCount() skips
* "thinking" content blocks, undercounting by 30-60%. This router
* supplements that count with an estimate of the thinking block tokens.
*
* 3. SUBAGENT MODEL INHERITANCE: Claude Code sends various unprefixed
* Claude models for subagents (haiku, opus, sonnet, etc.). The
* built-in router only maps haiku → Router.background (static) and
* lets others fall through. This router catches ALL unprefixed Claude
* models and dynamically inherits the current main model so subagents
* use whatever the user last set via /model.
*
* Loaded via CUSTOM_ROUTER_PATH in config.json.
* Signature: async function(req, config, { event }) → "provider,model" | null
*/
const fs = require("fs");
const path = require("path");
const DEBUG = true;
const LOG_FILE = path.join(__dirname, "..", "logs", "router-debug.log");
const STATE_FILE = path.join(require("os").tmpdir(), "ccr-router-state.json");
function dbg(...args) {
if (!DEBUG) return;
const ts = new Date().toISOString();
const line = `[${ts}] ${args
.map((a) => (typeof a === "string" ? a : JSON.stringify(a)))
.join(" ")}\n`;
try {
fs.mkdirSync(path.dirname(LOG_FILE), { recursive: true });
fs.appendFileSync(LOG_FILE, line);
} catch (e) {
/* silent */
}
}
// ---------------------------------------------------------------------------
// State: track the current main-conversation model across requests.
// This persists in memory for the lifetime of the CCR process (module cache).
// ---------------------------------------------------------------------------
let currentMainModel = null;
// ---------------------------------------------------------------------------
// Token estimation — includes thinking blocks (fixes issue #2)
// ---------------------------------------------------------------------------
// CCR bundles tiktoken as WASM in its cli.js — it's not requireable from
// external scripts. We only need to count the THINKING blocks that CCR's
// built-in calculateTokenCount() misses. For threshold decisions (~127K)
// a char-based estimate is more than sufficient.
//
// Average English text: ~4 chars/token with cl100k_base.
// Code/JSON: ~3.5 chars/token. We use 3.5 to err on the
// side of over-counting (safer for threshold decisions).
// ---------------------------------------------------------------------------
const CHARS_PER_TOKEN = 3.5;
function estimateTokens(text) {
if (!text) return 0;
return Math.ceil(text.length / CHARS_PER_TOKEN);
}
/**
* Count ONLY the thinking block tokens that CCR's built-in
* calculateTokenCount() misses. We add this to req.tokenCount
* (which already includes text, tool_use, tool_result, system, tools)
* rather than recounting everything from scratch.
*/
function countThinkingTokens(messages) {
let thinkingTokens = 0;
if (!Array.isArray(messages)) return 0;
for (const message of messages) {
if (!Array.isArray(message.content)) continue;
for (const part of message.content) {
if (part.type === "thinking" && part.thinking) {
thinkingTokens += estimateTokens(part.thinking);
}
}
}
return thinkingTokens;
}
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
function isHaikuModel(model) {
if (!model || typeof model !== "string") return false;
const lower = model.toLowerCase();
return lower.includes("claude") && lower.includes("haiku");
}
/**
* Detect any unprefixed Claude model (opus, sonnet, haiku, etc.).
* Claude Code sends subagent requests with API-format names like
* "claude-opus-4-6" or "claude-sonnet-4-5-20250929" without a
* provider prefix. These should inherit the main model.
*/
function isUnprefixedClaudeModel(model) {
if (!model || typeof model !== "string") return false;
if (model.includes(",")) return false; // Already has provider prefix
return model.toLowerCase().startsWith("claude");
}
function isCommaModel(model) {
return model && typeof model === "string" && model.includes(",");
}
function hasWebSearchTool(tools) {
return (
Array.isArray(tools) &&
tools.some(
(t) =>
t.type?.startsWith("web_search") ||
t.function?.name === "web_search" ||
t.function?.name === "WebSearch"
)
);
}
// ---------------------------------------------------------------------------
// Main router function
// ---------------------------------------------------------------------------
/**
* @param {object} req - Fastify request object with req.body (Anthropic format),
* req.tokenCount (CCR's undercounted value), req.sessionId
* @param {object} config - Full CCR config (Providers, Router, etc.)
* @param {object} context - { event }
* @returns {string|null} "provider,model" string, or null to fall back to built-in
*/
async function customRouter(req, config, context) {
const Router = config.Router || {};
const body = req.body || {};
const originalModel = body.model || "";
// Use CCR's token count + our thinking block supplement
const ccrTokenCount = req.tokenCount || 0;
const thinkingExtra = countThinkingTokens(body.messages);
const totalTokenCount = ccrTokenCount + thinkingExtra;
const longContextThreshold = Router.longContextThreshold || 127000;
// Determine what kind of request this is
const isComma = isCommaModel(originalModel);
const isUnprefixedClaude = isUnprefixedClaudeModel(originalModel);
let baseModel = null;
let scenario = "unknown";
if (isComma) {
// Main conversation request — user set /model copilot,X
// Update the tracked current model so subagents inherit it
currentMainModel = originalModel;
baseModel = originalModel;
scenario = "main";
} else if (isUnprefixedClaude) {
// Subagent/background request — Claude Code sends unprefixed Claude
// models (haiku, opus, sonnet) for subagents. Inherit the current
// main model so they use whatever the user set via /model.
if (currentMainModel) {
baseModel = currentMainModel;
scenario = "subagent-inherited";
} else {
// No main model tracked yet (fresh CCR process), use Router.default
baseModel = Router.default || Router.background;
scenario = "subagent-default";
}
} else {
// Not comma-prefixed and not haiku — check if this is a known provider
// model missing its prefix (e.g. "gpt-5.2" instead of "copilot,gpt-5.2").
// Without intervention, CCR's built-in router silently falls back to
// Router.default, which is confusing and hard to debug.
const providers = config.Providers || [];
const knownModel = providers.some((p) =>
(p.models || []).some(
(m) => m.toLowerCase() === originalModel.toLowerCase()
)
);
if (knownModel) {
// Model exists in a provider but is missing the "provider," prefix.
// Throw an error so the user knows to use the full format.
// We don't auto-correct because with multiple providers, the same
// model name could be ambiguous.
const matchingProviders = providers
.filter((p) =>
(p.models || []).some(
(m) => m.toLowerCase() === originalModel.toLowerCase()
)
)
.map((p) => `${p.name},${originalModel}`);
const suggestion = matchingProviders.join(" or ");
const errMsg = `Invalid model format: "${originalModel}" is missing a provider prefix. Use: ${suggestion}`;
dbg("ERROR:", errMsg);
throw new Error(errMsg);
} else {
dbg(
"PASSTHROUGH:",
`model=${originalModel}`,
`tokens=${totalTokenCount}(ccr:${ccrTokenCount}+thinking:${thinkingExtra})`,
"(unknown model, falling through to built-in router)"
);
return null;
}
}
// Check longContext threshold (fixes issue #1 — comma bypass)
const isLongContext = totalTokenCount > longContextThreshold;
let finalModel = baseModel;
if (isLongContext && Router.longContext) {
finalModel = Router.longContext;
scenario += "+longContext";
} else if (hasWebSearchTool(body.tools) && Router.webSearch) {
// webSearch takes priority over think (matching built-in behavior)
finalModel = Router.webSearch;
scenario += "+webSearch";
}
// Persist the FINAL routed model (after longContext/webSearch overrides)
// so the statusline script uses the correct context limit for its bar.
if (isComma) {
try { fs.writeFileSync(STATE_FILE, JSON.stringify({ model: finalModel, ts: Date.now() })); } catch (e) { /* silent */ }
}
const thinkingPct =
ccrTokenCount > 0 && thinkingExtra > 0
? ` (+${Math.round((thinkingExtra / ccrTokenCount) * 100)}% thinking)`
: "";
dbg(
"ROUTE:",
`original=${originalModel}`,
`base=${baseModel}`,
`final=${finalModel}`,
`scenario=${scenario}`,
`tokens=${totalTokenCount}${thinkingPct}`,
`ccr=${ccrTokenCount}+thinking=${thinkingExtra}`,
`threshold=${longContextThreshold}`,
`longCtx=${isLongContext}`,
`msgs=${(body.messages || []).length}`,
`currentMain=${currentMainModel}`
);
return finalModel;
}
module.exports = customRouter;
#!/usr/bin/env node
// GSD Statusline - Adapted for Claude Code Router
// Original: https://github.com/glittercowboy/get-shit-done
// Adapted: Works as a CCR script module instead of reading from stdin
//
// Shows: model | current task | directory | context usage (adjusted for Copilot)
//
// Context limits are resolved dynamically per model. The script matches
// the current model name against a built-in table of GitHub Copilot limits.
// You can override or extend the table via options.modelContextLimits.
//
// Configuration (via options in config.json script module):
// modelContextLimits: object - Map of model substring -> input token limit.
// Partial, case-insensitive match against the
// current model name. First match wins; longer
// patterns are checked first for specificity.
// Built-in defaults (overridable):
// "codex" : 272000 (gpt-5.2-codex etc.)
// "opus" : 128000
// "gpt-5" : 128000
// "sonnet" : 128000
// "default": 128000 (fallback)
// showTodos: boolean - Show in-progress todo task. Default: true
// showGsdUpdate: boolean - Show GSD update indicator. Default: true
// showContextBar: boolean - Show context usage bar. Default: true
const fs = require('fs');
const path = require('path');
const os = require('os');
/**
* Parse a formatted token string back to a number.
* Handles: "24.1k" → 24100, "134" → 134, "1.2k" → 1200, "0" → 0
*/
function parseTokenCount(formatted) {
const s = (formatted || '0').trim().toLowerCase();
if (s.endsWith('k')) {
return Math.round(parseFloat(s) * 1000);
}
return parseInt(s, 10) || 0;
}
// Cache last known context percentage so the bar doesn't drop to 0%
// while the model is thinking (Claude Code sends cp=0 mid-stream).
// Uses a temp file since CCR may clear require cache between calls.
const _cacheFile = path.join(os.tmpdir(), 'gsd-ccr-ctx-cache.json');
// Shared state file written by the custom router with the current model.
// This updates on every API request, so the statusline always shows
// the correct model even before CCR's built-in variable updates.
const _routerStateFile = path.join(os.tmpdir(), 'ccr-router-state.json');
function _readRouterModel() {
try {
const d = JSON.parse(fs.readFileSync(_routerStateFile, 'utf8'));
// Expire after 30 minutes (stale session)
if (Date.now() - (d.ts || 0) > 1800000) return null;
if (!d.model) return null;
// Strip the "copilot," prefix for display
const parts = d.model.split(',');
return parts.length > 1 ? parts.slice(1).join(',') : d.model;
} catch { return null; }
}
function _readCache() {
try {
const d = JSON.parse(fs.readFileSync(_cacheFile, 'utf8'));
// Expire after 10 minutes (stale session)
if (Date.now() - (d.ts || 0) > 600000) return null;
return d;
} catch { return null; }
}
function _writeCache(used, source) {
try {
fs.writeFileSync(_cacheFile, JSON.stringify({ used, source, ts: Date.now() }));
} catch { /* silent */ }
}
// Built-in context limits for GitHub Copilot models (input tokens)
// Ordered by specificity: more specific patterns should be listed first.
const DEFAULT_MODEL_CONTEXT_LIMITS = {
'codex': 272000, // gpt-5.2-codex, gpt-5.1-codex, etc. — 272K input
'opus': 128000, // claude-opus-4.6 — 128K input
'gpt-5': 128000, // gpt-5.2, gpt-5.3, etc. — 128K input
'sonnet': 128000, // claude-sonnet-4.5 — 128K input
'default': 128000, // fallback for unknown models
};
/**
* Resolve the effective input context limit for the current model.
* Matches model name (case-insensitive) against the limits table.
* User-supplied overrides in options.modelContextLimits are merged
* on top of the defaults.
*/
function resolveContextLimit(modelName, userOverrides) {
const limits = { ...DEFAULT_MODEL_CONTEXT_LIMITS, ...(userOverrides || {}) };
const name = (modelName || '').toLowerCase();
// Sort keys by length descending so more specific patterns match first
// (e.g. "codex" before "gpt-5")
const keys = Object.keys(limits)
.filter(k => k !== 'default')
.sort((a, b) => b.length - a.length);
for (const key of keys) {
if (name.includes(key.toLowerCase())) {
return limits[key];
}
}
return limits['default'] || 128000;
}
/**
* CCR Script Module Entry Point
* @param {Record<string, string>} variables - Template variables from CCR statusline
* Available: model, workDirName, gitBranch, inputTokens, outputTokens,
* contextPercent, contextWindowSize, tokenSpeed, isStreaming,
* cost, duration, linesAdded, linesRemoved, sessionId
* @param {Record<string, any>} options - Custom options from config.json
* @returns {string} Formatted statusline text with ANSI colors
*/
module.exports = function gsdStatuslineCCR(variables, options) {
try {
variables = variables || {};
options = options || {};
const model = _readRouterModel() || variables.model || 'Claude';
const dirname = variables.workDirName || path.basename(process.cwd());
const sessionId = variables.sessionId || '';
// --- Options with defaults ---
const showTodos = options.showTodos !== false;
const showGsdUpdate = options.showGsdUpdate !== false;
const showContextBar = options.showContextBar !== false;
// Dynamically resolve the context limit for the current model
const effectiveContextLimit = resolveContextLimit(model, options.modelContextLimits);
// --- Context Window Display ---
// Claude Code tracks cumulative context usage internally via its
// context_window object. CCR exposes this as contextPercent (the
// percentage of the window used) and contextWindowSize.
//
// KEY INSIGHT: contextPercent is relative to Claude Code's own window
// size (typically 200K for Anthropic models). Auto-compact triggers
// based on THIS number, not on per-call API limits. So we display
// contextPercent DIRECTLY — no rescaling — so our bar matches the
// "Context left until auto-compact" indicator Claude Code shows.
//
// The 128K Copilot limit is a per-request input limit, not the
// context window for auto-compact purposes. Claude Code manages
// compaction based on its own tracking.
//
// Fallback: if contextPercent is 0 (e.g. first message), we
// estimate from inputTokens against the effective Copilot limit.
let ctx = '';
if (showContextBar) {
const ccrPercent = parseInt(variables.contextPercent || '0', 10);
const ccrWindowSize = parseTokenCount(variables.contextWindowSize);
const lastInput = parseTokenCount(variables.inputTokens);
let used = 0;
let source = '';
if (lastInput > 0) {
// Primary: calculate from actual input tokens vs. the current
// model's API limit. This correctly reflects the headroom when
// switching models (e.g. 128K opus → 272K codex).
used = Math.min(100, Math.round((lastInput / effectiveContextLimit) * 100));
source = 'in';
} else if (ccrPercent > 0) {
// Fallback: use Claude Code's own context percentage (scaled so
// 80% real = 100% displayed, matching auto-compact threshold).
used = Math.min(100, Math.round((ccrPercent / 80) * 100));
source = 'ctx';
}
// If we got nothing (mid-stream, thinking), show last known value
if (used === 0) {
const cached = _readCache();
if (cached && cached.used > 0) {
used = cached.used;
source = cached.source + '*'; // asterisk = cached value in debug
}
} else {
// Update cache with new real value
_writeCache(used, source);
}
// Clamp to 0-100
used = Math.max(0, Math.min(100, used));
// Build progress bar (10 segments)
const filled = Math.floor(used / 10);
const bar = '\u2588'.repeat(filled) + '\u2591'.repeat(10 - filled);
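// e.g. used=42 → "████░░░░░░" (4 of 10 segments filled; the % is appended below)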
// Debug: show raw values so we can diagnose
const debug = options.debug ? ` \x1b[2m[${source}:cp=${ccrPercent},ws=${variables.contextWindowSize||'?'},in=${variables.inputTokens||'?'}]\x1b[0m` : '';
// Color thresholds (matched to original GSD breakpoints; the "~real
// usage" notes assume the 80% rescale used by the ctx fallback path)
if (used < 63) { // ~50% real usage
ctx = ` \x1b[32m${bar} ${used}%\x1b[0m${debug}`;
} else if (used < 81) { // ~65% real usage
ctx = ` \x1b[33m${bar} ${used}%\x1b[0m${debug}`;
} else if (used < 95) { // ~76% real usage
ctx = ` \x1b[38;5;208m${bar} ${used}%\x1b[0m${debug}`;
} else {
ctx = ` \x1b[5;31m\u{1F480} ${bar} ${used}%\x1b[0m${debug}`;
}
}
// --- Current Task from Todos ---
let task = '';
if (showTodos) {
const homeDir = os.homedir();
const todosDir = path.join(homeDir, '.claude', 'todos');
if (sessionId && fs.existsSync(todosDir)) {
try {
const files = fs.readdirSync(todosDir)
.filter(f => f.startsWith(sessionId) && f.includes('-agent-') && f.endsWith('.json'))
.map(f => ({ name: f, mtime: fs.statSync(path.join(todosDir, f)).mtime }))
.sort((a, b) => b.mtime - a.mtime);
if (files.length > 0) {
try {
const todos = JSON.parse(fs.readFileSync(path.join(todosDir, files[0].name), 'utf8'));
const inProgress = todos.find(t => t.status === 'in_progress');
if (inProgress) task = inProgress.activeForm || inProgress.title || '';
} catch (e) { /* silent */ }
}
} catch (e) {
// Silently fail on file system errors
}
}
}
// --- GSD Update Available? ---
let gsdUpdate = '';
if (showGsdUpdate) {
const homeDir = os.homedir();
const cacheFile = path.join(homeDir, '.claude', 'cache', 'gsd-update-check.json');
if (fs.existsSync(cacheFile)) {
try {
const cache = JSON.parse(fs.readFileSync(cacheFile, 'utf8'));
if (cache.update_available) {
gsdUpdate = '\x1b[33m\u2B06 /gsd:update\x1b[0m \u2502 ';
}
} catch (e) { /* silent */ }
}
}
// --- Compose Output ---
// Only output what CCR's built-in modules don't already show:
// - GSD update indicator
// - Current todo/task (bold)
// - Context bar (with corrected Copilot limits)
// Model, directory, git branch, token speed, and usage are handled
// by CCR's workDir, gitBranch, model, usage, and speed modules.
const parts = [];
if (gsdUpdate) {
parts.push(gsdUpdate);
}
// Task (bold, if present)
if (task) {
parts.push(`\x1b[1m${task}\x1b[0m`);
}
// Context bar
if (ctx) {
parts.push(ctx.trimStart());
}
// If nothing to show, return empty (CCR skips empty script output)
if (parts.length === 0) return '';
return parts.join(' \u2502 ');
} catch (e) {
// Silent fail - never break the statusline
return '';
}
};
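// Example options block (illustrative only; these keys match the reads
// above, but how the options object is wired into config.json depends on
// your CCR statusline configuration):
//   { "showTodos": true, "showContextBar": true, "debug": false,
//     "modelContextLimits": { "gpt-5": 272000 } }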
/**
 * Router Diagnostic Script
 *
 * Standalone diagnostic: run via `node router-diagnostic.js` to print usage
 * guidance and a self-test against mock data, OR import as a module to count
 * tokens from Anthropic-format request bodies.
 *
 * This counts the SAME CATEGORIES the CCR router counts (text, tool_use,
 * tool_result; CCR uses tiktoken cl100k_base, approximated here with a
 * chars/3.5 heuristic) but ALSO counts thinking blocks separately, so we
 * can see the gap.
 */
const fs = require("fs");
const path = require("path");
const LOG_FILE = path.join(__dirname, "..", "logs", "router-diagnostic.log");
function log(...args) {
const ts = new Date().toISOString();
const line = `[${ts}] ${args.map(a => typeof a === "string" ? a : JSON.stringify(a, null, 2)).join(" ")}\n`;
try {
fs.mkdirSync(path.dirname(LOG_FILE), { recursive: true });
fs.appendFileSync(LOG_FILE, line);
} catch (e) { /* silent */ }
console.log(line.trimEnd());
}
/**
* Estimate token count using a simple heuristic (chars / 3.5)
* This avoids requiring tiktoken as a dependency.
* For a proper count, use tiktoken directly.
*/
function estimateTokens(text) {
if (!text) return 0;
// Rough approximation: 1 token ≈ 3.5 characters for English text / code
return Math.ceil(text.length / 3.5);
}
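// e.g. estimateTokens("hello world") → Math.ceil(11 / 3.5) = 4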
/**
* Count tokens in an Anthropic-format request body, broken down by category.
* This mirrors what the CCR router's calculateTokenCount() does, plus counts
* the categories it misses.
*
* @param {object} body - The Anthropic-format request body
* @returns {object} Token breakdown
*/
function countTokensDetailed(body) {
const result = {
// What CCR router counts (text, tool_use, tool_result)
routerCounted: 0,
// What CCR router MISSES
thinkingTokens: 0,
thinkingBlocks: 0,
// Total
totalEstimated: 0,
// Breakdown
systemTokens: 0,
toolSchemaTokens: 0,
messageBreakdown: [],
};
// Count system prompt
if (typeof body.system === "string") {
const tokens = estimateTokens(body.system);
result.systemTokens = tokens;
result.routerCounted += tokens;
} else if (Array.isArray(body.system)) {
body.system.forEach((item) => {
if (item.type === "text" && typeof item.text === "string") {
const tokens = estimateTokens(item.text);
result.systemTokens += tokens;
result.routerCounted += tokens;
}
});
}
// Count tool schemas
if (Array.isArray(body.tools)) {
body.tools.forEach((tool) => {
let tokens = 0;
if (tool.description) {
tokens += estimateTokens(tool.name + tool.description);
}
if (tool.input_schema) {
tokens += estimateTokens(JSON.stringify(tool.input_schema));
}
result.toolSchemaTokens += tokens;
result.routerCounted += tokens;
});
}
// Count messages
if (Array.isArray(body.messages)) {
body.messages.forEach((msg, idx) => {
const msgInfo = {
index: idx,
role: msg.role,
routerTokens: 0,
thinkingTokens: 0,
contentTypes: [],
};
if (typeof msg.content === "string") {
const tokens = estimateTokens(msg.content);
msgInfo.routerTokens = tokens;
result.routerCounted += tokens;
msgInfo.contentTypes.push("string");
} else if (Array.isArray(msg.content)) {
msg.content.forEach((part) => {
msgInfo.contentTypes.push(part.type || "unknown");
if (part.type === "text") {
const tokens = estimateTokens(part.text);
msgInfo.routerTokens += tokens;
result.routerCounted += tokens;
} else if (part.type === "tool_use") {
const tokens = estimateTokens(JSON.stringify(part.input));
msgInfo.routerTokens += tokens;
result.routerCounted += tokens;
} else if (part.type === "tool_result") {
const text = typeof part.content === "string"
? part.content
: JSON.stringify(part.content);
const tokens = estimateTokens(text);
msgInfo.routerTokens += tokens;
result.routerCounted += tokens;
} else if (part.type === "thinking") {
// THIS IS WHAT THE ROUTER MISSES
const tokens = estimateTokens(part.thinking || "");
msgInfo.thinkingTokens += tokens;
result.thinkingTokens += tokens;
result.thinkingBlocks++;
}
// Images, signatures, etc. are also not counted but are smaller
});
}
result.messageBreakdown.push(msgInfo);
});
}
result.totalEstimated = result.routerCounted + result.thinkingTokens;
return result;
}
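// Shape of the returned breakdown (numbers illustrative, not from a real run):
//   { routerCounted: 4200, thinkingTokens: 45000, thinkingBlocks: 2,
//     totalEstimated: 49200, systemTokens: 860, toolSchemaTokens: 310,
//     messageBreakdown: [{ index: 1, role: "assistant", routerTokens: 12,
//       thinkingTokens: 15500, contentTypes: ["thinking", "text"] }, ...] }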
/**
* Analyze a single request and log the findings
*/
function analyzeRequest(body, label) {
const breakdown = countTokensDetailed(body);
const threshold = 127000; // from config
const routerDecision = breakdown.routerCounted > threshold ? "WOULD_SWITCH_TO_CODEX" : "STAYS_ON_DEFAULT";
const realDecision = breakdown.totalEstimated > threshold ? "SHOULD_SWITCH_TO_CODEX" : "OK_ON_DEFAULT";
const mismatch = routerDecision !== realDecision;
log("=== REQUEST ANALYSIS ===", label || "");
log(`Model requested: ${body.model}`);
log(`Messages: ${(body.messages || []).length}`);
log(`Thinking enabled: ${!!body.thinking}`);
log(`--- Token Estimates ---`);
log(`System: ${breakdown.systemTokens}`);
log(`Tool schemas: ${breakdown.toolSchemaTokens}`);
log(`Router would count: ${breakdown.routerCounted} (text + tool_use + tool_result)`);
log(`Thinking tokens MISSED: ${breakdown.thinkingTokens} (${breakdown.thinkingBlocks} blocks)`);
log(`Total estimated: ${breakdown.totalEstimated}`);
log(`--- Routing Decision ---`);
log(`Threshold: ${threshold}`);
log(`Router says: ${routerDecision} (sees ${breakdown.routerCounted})`);
log(`Reality: ${realDecision} (actual ~${breakdown.totalEstimated})`);
if (mismatch) {
log(`*** MISMATCH! Router undercounts by ${breakdown.thinkingTokens} tokens ***`);
log(`*** This request would hit the 128K API limit! ***`);
}
// Show per-message breakdown for messages with thinking
const thinkingMsgs = breakdown.messageBreakdown.filter(m => m.thinkingTokens > 0);
if (thinkingMsgs.length > 0) {
log(`--- Messages with thinking blocks ---`);
thinkingMsgs.forEach(m => {
log(` msg[${m.index}] role=${m.role}: router=${m.routerTokens}, thinking=${m.thinkingTokens}, types=${m.contentTypes.join(",")}`);
});
}
log("=== END ANALYSIS ===\n");
return { breakdown, routerDecision, realDecision, mismatch };
}
// If run directly, print usage guidance and run a self-test with mock data
if (require.main === module) {
log("Router Diagnostic Tool");
log("This tool analyzes token counting discrepancies in CCR's router.");
log("To use: enable CCR logging (LOG: true), reproduce the issue, then");
log("examine the output in " + LOG_FILE);
log("");
log("For live monitoring, the copilot-transformer can be patched to call");
log("analyzeRequest(body) in transformRequestIn before the body is converted.");
log("");
// Quick self-test with a mock body
const mockBody = {
model: "claude-opus-4.6",
thinking: { type: "enabled", budget_tokens: 16384 },
system: [{ type: "text", text: "You are a helpful assistant. ".repeat(100) }],
tools: [],
messages: [
{ role: "user", content: "Hello" },
{
role: "assistant",
content: [
{ type: "thinking", thinking: "Let me think about this... ".repeat(2000) },
{ type: "text", text: "Hi there! How can I help?" }
]
},
{ role: "user", content: "Write me a long story" },
{
role: "assistant",
content: [
{ type: "thinking", thinking: "I need to write a creative story. ".repeat(3000) },
{ type: "text", text: "Once upon a time... ".repeat(500) }
]
},
],
};
log("--- Self-test with mock data ---");
const result = analyzeRequest(mockBody, "MOCK");
log(`Self-test result: mismatch=${result.mismatch}`);
}
module.exports = { countTokensDetailed, analyzeRequest, estimateTokens };
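// Illustrative module usage (the JSON path is hypothetical; point it at any
// captured Anthropic-format request body):
//   const { analyzeRequest } = require("./router-diagnostic");
//   const body = JSON.parse(fs.readFileSync("captured-request.json", "utf8"));
//   const { mismatch, breakdown } = analyzeRequest(body, "captured");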