Add files using upload-large-folder tool

Files changed (3)

Qwen3-Coder-Next-MXFP4_MOE_BF16.gguf ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:d7cbd6f71169958c5118c02820016c73bb8ec68a91c07ab1b71a84a071aff1f1
+size 45910019872

Qwen3-Coder-Next-MXFP4_MOE_F16.gguf ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:dc685cda61b109e6885dc8f49e4f52b04003bc3ecc0b3349e4aaec38d534e9e2
+size 45910019872

README.md CHANGED

@@ -1,23 +1,19 @@
----
-pipeline_tag: text-generation
-base_model:
-- Qwen/Qwen3-Coder-Next
----
-This is a MXFP4_MOE quantization of the model [Qwen3-Coder-Next](https://huggingface.co/Qwen/Qwen3-Coder-Next)
-The suggested parameters are:
-```
-temperature=1.0
-top_p=0.95
-top_k=40
-```
-As of 2026-02-17 I have updated the model to a MXFP4 quant of higher quality.
-The mainline standard is:
-| tensors |  quant      |
-| ------- | ----------- |
-|1D       | unquantized |
-|other    | Q8_0        |
-|MoE      | MXFP4       |
-So I created a new variant, where the other tensors are bumped up from Q8 to FP16.

+---
+pipeline_tag: text-generation
+base_model:
+- Qwen/Qwen3-Coder-Next
+---
+This is an MXFP4_MOE quantization of the model [Qwen3-Coder-Next](https://huggingface.co/Qwen/Qwen3-Coder-Next).
+The suggested parameters from the official docs are:
+```
+temperature=1.0
+top_p=0.95
+top_k=40
+```
+As of 2026-02-17 I have updated the model to an MXFP4 quant of higher quality.
+The mainline standard is to use MXFP4 for the MoE tensors, and Q8 for the rest.
+So I created two new variants, where the other tensors are either BF16 or FP16 instead of Q8.
+The order of preference is BF16, then F16.
+On some architectures BF16 will be slower, but it's the highest quality; essentially, it's the original tensors from the model copied over unquantized.
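
The Git LFS pointers in this commit record a SHA-256 digest and a byte size for each GGUF file, so a downloaded copy can be checked against them. The following is a minimal sketch of that check using only the Python standard library; the helper names `parse_lfs_pointer` and `verify_file` are hypothetical and not part of any tool mentioned here, and the local file path in the usage note is an assumption.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse Git LFS pointer text into (sha256_hex, size_in_bytes)."""
    fields = {}
    for line in text.splitlines():
        # Tolerate diff-style '+' prefixes such as those shown in this commit view.
        line = line.strip().lstrip("+")
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return digest, int(fields["size"])


def verify_file(path, expected_sha256, expected_size, chunk_size=1 << 20):
    """Return True if the file at `path` matches the expected size and SHA-256."""
    # Cheap size check first, so a truncated download fails without hashing.
    if os.path.getsize(path) != expected_size:
        return False
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so a multi-gigabyte file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == expected_sha256
```

For example, passing the three pointer lines for `Qwen3-Coder-Next-MXFP4_MOE_BF16.gguf` to `parse_lfs_pointer` yields the digest and size to hand to `verify_file` together with wherever the file was saved locally. Hashing a file of about 46 GB takes a while, which is why the size comparison runs first.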