Maharshi-Pandya / quant.py
Last active February 7, 2026 08:39
NVFP4 quantization in torch
import torch

# FP8 (E4M3) holds the per-block scales; 448.0 is its largest finite value.
FP8_AMAX = 448.0
FP8_DTYPE = torch.float8_e4m3fn

# FP4 (E2M1) holds the quantized values; 6.0 is its largest representable magnitude.
FP4_AMAX = 6.0
# The packed FP4 dtype only exists in recent PyTorch builds; fall back to uint8 storage.
FP4_DTYPE = getattr(torch, "float4_e2m1fn_x2", torch.uint8)

# midpoints and the corresponding bins
# representable positives = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
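The listing is truncated here, but the comment above implies rounding each value to the nearest representable E2M1 magnitude using the midpoints between adjacent positives as bin boundaries. Below is a minimal sketch of that step under those assumptions; the function name `round_to_fp4_values` is hypothetical and this is not the gist's full implementation (ties at a midpoint fall to the lower bin here, whereas hardware FP4 casting uses round-to-nearest-even):

```python
import torch

# E2M1 representable positive magnitudes, as listed in the comment above.
FP4_POSITIVES = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
# Midpoints between adjacent representable values act as rounding-bin boundaries:
# [0.25, 0.75, 1.25, 1.75, 2.5, 3.5, 5.0]
FP4_MIDPOINTS = (FP4_POSITIVES[:-1] + FP4_POSITIVES[1:]) / 2

def round_to_fp4_values(x: torch.Tensor) -> torch.Tensor:
    """Snap each element of x to the nearest E2M1-representable value."""
    sign = torch.sign(x)
    # Clamp magnitudes to the FP4 max (6.0) before binning.
    mag = x.abs().clamp(max=6.0)
    # bucketize returns, for each magnitude, the index of its rounding bin,
    # which is also the index of the nearest value in FP4_POSITIVES.
    idx = torch.bucketize(mag, FP4_MIDPOINTS)
    return sign * FP4_POSITIVES[idx]
```

In the full NVFP4 scheme this rounding would be applied after dividing each block by its scale, so inputs already lie within the representable range [-6, 6].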