Qwen3.5-397B-A17B-NVFP4

This is a quantized version of Qwen/Qwen3.5-397B-A17B using the NVFP4 quantization scheme.
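
For readers unfamiliar with the format, the following is a minimal sketch of block-scaled 4-bit float quantization in the spirit of NVFP4. It assumes the commonly described layout: FP4 E2M1 values that share one scale per block of 16 elements. It is a simplification. The real format stores the block scale in FP8 and adds a per-tensor scale, both omitted here, and none of the function names below come from vLLM or LLM Compressor.

```python
# Toy illustration only; not the code used to produce these weights.

# Non-negative magnitudes representable by a 4-bit E2M1 float.
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 16  # assumed number of elements sharing one scale


def quantize_block(block):
    """Return (scale, codes); each code is a signed value from E2M1_GRID."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    # Map the largest magnitude in the block onto the largest grid value.
    scale = amax / E2M1_GRID[-1]
    codes = []
    for x in block:
        target = abs(x) / scale
        mag = min(E2M1_GRID, key=lambda g: abs(target - g))  # nearest grid point
        codes.append(-mag if x < 0 else mag)
    return scale, codes


def dequantize_block(scale, codes):
    """Reconstruct approximate values from a scale and its codes."""
    return [scale * c for c in codes]


def quantize_tensor(values):
    """Split a flat list into blocks and quantize each block independently."""
    blocks = [values[i:i + BLOCK_SIZE] for i in range(0, len(values), BLOCK_SIZE)]
    return [quantize_block(b) for b in blocks]
```

Because each block has its own scale, a single outlier only degrades the precision of the 16 values around it rather than the whole tensor.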

Important

This model requires the following vLLM PR to work: https://github.com/vllm-project/vllm/pull/34723. At the time of writing, the PR is not yet included in the nightly build, so you may need to build vLLM from source. Alternatively, patch the latest nightly image yourself to include that PR.

Note:

The weights were re-uploaded on 20/02/2026 with some issues fixed.

Creation

This model was created using vLLM's LLM Compressor, with Qwen3.5 MoE support added via PR #2383. The PR adds a custom CalibrationQwen3MoeSparseMoeBlock that routes calibration data to all experts during quantization. Without this, experts that the router rarely selects would see little or no calibration data; with it, every expert is calibrated properly for accurate NVFP4 quantization.
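
The idea of routing calibration data to all experts can be sketched with a toy MoE block. This is a hypothetical illustration, not the CalibrationQwen3MoeSparseMoeBlock from the PR: the class names, the `calibrate_all_experts` flag, and the choice to keep the block output unchanged (only the routed experts contribute to it) are assumptions made for the example.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToyExpert:
    """Scales its input by a constant and records what it has seen."""

    def __init__(self, weight):
        self.weight = weight
        self.tokens_seen = 0
        # Stand-in for the statistics a quantization observer would collect.
        self.max_abs_input = 0.0

    def __call__(self, x):
        self.tokens_seen += 1
        self.max_abs_input = max(self.max_abs_input, abs(x))
        return self.weight * x


class ToyMoEBlock:
    """Top-k MoE over scalar tokens with an optional calibration mode."""

    def __init__(self, experts, router_weights, top_k=1, calibrate_all_experts=False):
        self.experts = experts
        self.router_weights = router_weights
        self.top_k = top_k
        self.calibrate_all_experts = calibrate_all_experts

    def forward(self, tokens):
        outputs = []
        for x in tokens:
            probs = softmax([w * x for w in self.router_weights])
            order = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
            top = order[: self.top_k]
            norm = sum(probs[i] for i in top)
            y = 0.0
            for i, expert in enumerate(self.experts):
                if i in top:
                    # Routed expert: contributes to the output as usual.
                    y += probs[i] / norm * expert(x)
                elif self.calibrate_all_experts:
                    # Unrouted expert: run only so it records statistics;
                    # the result is discarded.
                    expert(x)
            outputs.append(y)
        return outputs


def make_block(calibrate_all_experts):
    experts = [ToyExpert(w) for w in (2.0, 3.0, 5.0)]
    return ToyMoEBlock(experts, [1.0, 0.0, -1.0], top_k=1,
                       calibrate_all_experts=calibrate_all_experts)
```

With positive tokens, the toy router always picks the first expert. In normal mode the other two experts never see a token, so an observer attached to them would have nothing to calibrate on; in calibration mode all three see every token while the block output stays the same.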
