Qwen3.5-397B-A17B-NVFP4
This is a quantized version of Qwen/Qwen3.5-397B-A17B using the NVFP4 quantization scheme.
Important
This model needs this vLLM PR to work: https://github.com/vllm-project/vllm/pull/34723. You may need to build vLLM from source, since the PR is not yet included in the nightly build at the time of writing. Alternatively, patch the latest nightly image yourself to include that PR.
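A minimal sketch of one way to build vLLM from source with the PR above applied. The branch name `nvfp4-pr` is arbitrary, and this assumes a working CUDA build environment; see vLLM's own build-from-source documentation for the full prerequisites.

```shell
# Clone vLLM and check out the PR branch locally.
git clone https://github.com/vllm-project/vllm.git
cd vllm
# GitHub exposes every PR at pull/<id>/head; fetch it into a local branch.
git fetch origin pull/34723/head:nvfp4-pr
git checkout nvfp4-pr
# Build and install in editable mode (can take a while on first build).
pip install -e .
```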
Note:
The weights were re-uploaded on 20/02/2026 with some issues fixed.
Creation
This model was created using vLLM's LLM Compressor, with Qwen3.5 MoE support added via PR #2383. That PR adds a custom CalibrationQwen3MoeSparseMoeBlock that routes calibration data to all experts during quantization, ensuring every expert receives proper calibration for accurate NVFP4 quantization.
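To make the quantization scheme concrete, here is a minimal NumPy sketch of the core idea behind NVFP4: weights are split into small blocks (16 elements), each block gets a scale chosen so its largest magnitude maps onto the largest representable FP4 E2M1 value (6.0), and each element is rounded to the nearest signed E2M1 value. This is an illustrative toy, not LLM Compressor's implementation; the real format also stores the per-block scale in FP8 (E4M3) with a per-tensor FP32 scale, which is omitted here.

```python
import numpy as np

# Representable magnitudes of FP4 E2M1, the element format used by NVFP4.
E2M1_VALUES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_nvfp4_block(block: np.ndarray):
    """Quantize one 16-element block to signed E2M1 values plus a scale.

    The scale maps the block's absolute maximum onto 6.0 (the largest
    E2M1 magnitude), so the largest element is represented exactly.
    """
    amax = np.max(np.abs(block))
    scale = amax / 6.0 if amax > 0 else 1.0
    scaled = block / scale
    # Index of the nearest representable magnitude for each element.
    idx = np.argmin(np.abs(np.abs(scaled)[:, None] - E2M1_VALUES[None, :]),
                    axis=1)
    q = np.sign(scaled) * E2M1_VALUES[idx]   # restore the sign
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q * scale

# Round-trip a random 16-element block and measure the worst-case error.
rng = np.random.default_rng(0)
w = rng.normal(size=16)
q, s = quantize_nvfp4_block(w)
w_hat = dequantize(q, s)
err = np.max(np.abs(w - w_hat))
```

Because the widest gap between adjacent E2M1 magnitudes is 2.0 (between 4 and 6), the per-element error after dequantization is bounded by `scale * 1.0`, which is why a good per-block scale matters so much for accuracy.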
Base model: Qwen/Qwen3.5-397B-A17B