# Qwen3-Coder-480B-A35B-Instruct_MXFP4
This checkpoint is a variant of Qwen3-Coder-480B-A35B-Instruct in which the expert weights have been quantized to MXFP4, the same format used by gpt-oss-20b and gpt-oss-120b.
The weights were quantized with the `downcast_to_mxfp` function from triton-kernels.
The quantized checkpoint may incur a small accuracy drop on some tasks, but it is roughly 72% smaller than the original BF16 checkpoint.
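To illustrate what MXFP4 quantization does, here is a minimal pure-Python sketch of the format as defined by the OCP Microscaling spec: each block of 32 elements shares one power-of-two (E8M0) scale, and each element is stored as a 4-bit FP4 (E2M1) code. This is only an educational sketch of the format, not the actual `downcast_to_mxfp` kernel, which operates on packed tensors on the GPU.

```python
import math

# Representable magnitudes of the FP4 E2M1 element format used by MXFP4.
FP4_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def quantize_mxfp4_block(block):
    """Quantize one block of 32 floats to MXFP4: a shared power-of-two
    scale exponent (E8M0) plus one (sign, FP4 code) pair per element."""
    assert len(block) == 32
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 0, [(1, 0)] * 32
    # Pick a power-of-two scale so the largest magnitude maps near 6.0,
    # the maximum representable FP4 magnitude (6.0 = 1.5 * 2**2).
    exp = math.floor(math.log2(amax)) - 2
    scale = 2.0 ** exp
    codes = []
    for x in block:
        sign = -1 if x < 0 else 1
        mag = min(abs(x) / scale, 6.0)
        # Round to the nearest representable FP4 magnitude.
        code = min(range(len(FP4_VALUES)), key=lambda i: abs(FP4_VALUES[i] - mag))
        codes.append((sign, code))
    return exp, codes


def dequantize_mxfp4_block(exp, codes):
    """Reconstruct the block: element value = sign * FP4 magnitude * scale."""
    scale = 2.0 ** exp
    return [sign * FP4_VALUES[code] * scale for sign, code in codes]
```

Storage per block is 32 x 4 bits for the codes plus 8 bits for the shared scale, i.e. 4.25 bits per weight versus 16 for BF16, which is where the bulk of the size reduction comes from.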
## Accuracy Comparison
| Model | GSM8K (strict-match) | GSM8K (flexible-extract) |
|---|---|---|
| Qwen3-Coder-480B-A35B-Instruct (BF16) | 89.16% ± 0.86% | 90.52% ± 0.81% |
| Qwen3-Coder-480B-A35B-Instruct_MXFP4 | 89.99% ± 0.83% | 90.75% ± 0.80% |
## Checkpoint Size
| Model | Size | Reduction |
|---|---|---|
| Qwen3-Coder-480B-A35B-Instruct (BF16) | 895 GB | - |
| Qwen3-Coder-480B-A35B-Instruct_MXFP4 | 255 GB | 72% smaller |
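The reported reduction follows directly from the two sizes in the table:

```python
bf16_gb = 895
mxfp4_gb = 255
# Fractional size reduction relative to the BF16 checkpoint.
reduction = 1 - mxfp4_gb / bf16_gb  # ~0.715, i.e. ~72% smaller
```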