Jon Durbin jondurbin

GLM-4.6 GPQA diamond via chutes.ai

When tested properly, using the same GLM simple-evals reference implementation that Z.ai provides, the evaluation produced the following scores:

{
  "chars": 970.0044191919192,
  "chars:std": 153.57443776558713,
  "Chemistry": 72.1774193548387,
  "Chemistry:std": 44.81252136132964,
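
A quick sanity check on the numbers above. If each Chemistry answer is scored as either 0 or 100 (an assumption; the truncated output does not say how scoring works), the population standard deviation is fully determined by the mean, as 100 * sqrt(p * (1 - p)). The sketch below uses a hypothetical helper name, `bernoulli_std_pct`, to compare that prediction with the reported value.

```python
import math


def bernoulli_std_pct(mean_pct: float) -> float:
    """Population std, in percentage points, of scores that are each 0 or 100."""
    p = mean_pct / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p))


chemistry_mean = 72.1774193548387   # "Chemistry" from the results above
chemistry_std = 44.81252136132964   # "Chemistry:std" from the results above

predicted = bernoulli_std_pct(chemistry_mean)
print(f"predicted std: {predicted:.4f}, reported std: {chemistry_std:.4f}")
```

The two values agree closely, which suggests the large std reflects binary per-question scoring rather than instability between runs.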

Kimi-K2-Instruct vs GLM-4.6 (BFCL Tool Benchmark)

Raw data glm-4.6-results.tar.gz

Overall Performance

Benchmark	Kimi-K2-Instruct	GLM-4.6-FP8
Overall Accuracy	45.62%	60.13%
Latency Mean	3.32 s	6.66 s
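
To make the trade-off in the table explicit, the short sketch below computes the accuracy gap and the latency ratio directly from the four reported numbers. The dictionary keys are illustrative names, not fields from the BFCL output.

```python
# Values copied from the table above.
kimi = {"accuracy_pct": 45.62, "latency_s": 3.32}
glm = {"accuracy_pct": 60.13, "latency_s": 6.66}

# Absolute difference in overall accuracy, in percentage points.
accuracy_gap = glm["accuracy_pct"] - kimi["accuracy_pct"]

# How many times slower the mean GLM-4.6-FP8 call is.
latency_ratio = glm["latency_s"] / kimi["latency_s"]

print(f"accuracy gap: {accuracy_gap:.2f} points")
print(f"latency ratio: {latency_ratio:.2f}x")
```

On these figures GLM-4.6-FP8 scores about 14.5 points higher while its mean latency is roughly twice that of Kimi-K2-Instruct.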

MoonshotAI vs Chutes BFCL (tool) benchmark

Execution

git clone https://github.com/ShishirPatil/gorilla
cd gorilla/berkeley-function-call-leaderboard
python3 -m venv venv
./venv/bin/pip install -e .
# Apply diffs per provider

"easy" vllm endpoint

You can call this endpoint and it will automatically select the most recent vllm image:

curl -XPOST https://api.chutes.ai/chutes/vllm \
  -H 'content-type: application/json' \
  -H 'Authorization: cpk...' \
  -d '{
    "tagline": "Mistral 24b Instruct",
    "model": "unsloth/Mistral-Small-24B-Instruct-2501",
    "public": true,
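
For reference, the same call can be sketched in Python with only the standard library. This is a minimal sketch, not the official client: the payload holds only the three fields visible in the truncated curl command above (the real request has further fields that are not shown here), `build_request` is a hypothetical helper name, and the `CHUTES_API_KEY` environment variable is assumed to hold the `cpk...` key. The request is built but not sent.

```python
import json
import os
import urllib.request

# Only the fields visible in the truncated curl command; the real payload
# continues with additional fields that are not reproduced here.
payload = {
    "tagline": "Mistral 24b Instruct",
    "model": "unsloth/Mistral-Small-24B-Instruct-2501",
    "public": True,
}


def build_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the "easy" vllm endpoint."""
    return urllib.request.Request(
        "https://api.chutes.ai/chutes/vllm",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )


# Falls back to the same placeholder used in the curl example.
req = build_request(os.environ.get("CHUTES_API_KEY", "cpk..."))
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```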

	"""
	Simple OAuth2 test client for testing the IDP.

	First, you need to create the app:
	e.g.:
	curl -s -XPOST "https://api.chutes.ai/idp/apps" -H "Authorization: $CHUTES_API_KEY" -H "Content-Type: application/json" -d '{
	"name": "Test App",
	"description": "Test OAuth application",
	"redirect_uris": ["http://fakeapp.lvh.me:22221/callback"],
	"homepage_url": "http://fakeapp.lvh.me:22221",

	import json
	import requests
	import base64
	import openai
	import os

	client = openai.Client(base_url="https://llm.chutes.ai/v1", api_key=os.getenv("CHUTES_API_KEY"))

	prompt = """Please output the layout information from the PDF image, including each layout element's bbox, its category, and the corresponding text content within the bbox.

	import os
	import base64
	import openai
	import glob

	client = openai.Client(base_url="https://llm.chutes.ai/v1", api_key=os.environ["CHUTES_API_KEY"])

	image_base64s = []
	for path in glob.glob("/home/jdurbin/Downloads/logo*.png")[:8]:
	    with open(path, "rb") as infile:

	import os
	import requests
	import base64

	audio = base64.b64encode(open("test.wav", "rb").read()).decode()
	result = requests.post(
	    "https://chutes-spark-tts.chutes.ai/speak",
	    json={
	        "text": "How much wood would a woodchuck chuck if a woodchuck could chuck wood?",
	        "sample_audio_b64": audio,