Skip to content

Instantly share code, notes, and snippets.

@powderluv
Created February 12, 2026 21:19
Show Gist options
  • Select an option

  • Save powderluv/8156ec484215f11810532d4a84e7537d to your computer and use it in GitHub Desktop.

Select an option

Save powderluv/8156ec484215f11810532d4a84e7537d to your computer and use it in GitHub Desktop.
GLM5 on MI300X
Launch docker:
rocm/sgl-dev:v0.5.8.post1-rocm720-mi30x-20260211-preview
Inside container -
Install Sglang from source by following -
https://docs.sglang.io/platforms/amd_gpu.html#install-from-source
Install transformer from source -
pip install git+https://github.com/huggingface/transformers.git
Launch the serve -
python -m sglang.launch_server \
--model zai-org/GLM-5-FP8 \
--tp 8 \
--tool-call-parser glm47 \
--reasoning-parser glm45 \
--mem-fraction-static 0.8 \
--nsa-prefill-backend tilelang \
--nsa-decode-backend tilelang
GSM8K accuracy -
|Tasks|Version| Filter |n-shot| Metric | |Value | |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k| 3|flexible-extract| 5|exact_match|↑ |0.9545|± |0.0057|
| | |strict-match | 5|exact_match|↑ |0.9553|± |0.0057|
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment