Update README.md
## Download a file (not the whole branch) from below:

Use this one:

| Filename | Quant type | File Size | Split | Description |
| -------- | ---------- | --------- | ----- | ----------- |
| [gpt-oss-20b-MXFP4.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-MXFP4.gguf) | MXFP4 | 12.1GB | false | Full MXFP4 weights, *recommended* for this model. |
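
If you'd rather script that single-file download than click through the browser, here is a minimal sketch using `hf_hub_download` from the `huggingface_hub` package; the repo and file names come from the table above, while the surrounding code is an assumed workflow rather than something this README prescribes:

```python
# Minimal sketch: fetch only the recommended quant, not the whole branch.
# Assumes `huggingface_hub` is installed (pip install huggingface_hub).
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="bartowski/openai_gpt-oss-20b-GGUF",
    filename="openai_gpt-oss-20b-MXFP4.gguf",  # recommended file from the table above
    local_dir=".",                             # save into the current directory
)
print(path)  # local path of the downloaded .gguf
```

Swap `filename` for any entry in the table below if you do want one of the experimental quants.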

The reason is that the FFN (feed-forward network) tensors of gpt-oss do not behave nicely when quantized to anything other than MXFP4, so they are kept at that format in every quant.

The rest of these are provided for your own interest in case you feel like experimenting, but the size savings are basically non-existent, so I would not recommend running them; they are provided simply for show (a sketch for checking the tensor types yourself follows the table):

| Filename | Quant type | File Size | Split | Description |
| -------- | ---------- | --------- | ----- | ----------- |
| [gpt-oss-20b-Q6_K_L.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-Q6_K_L.gguf) | Q6_K_L | 12.04GB | false | Uses Q8_0 for embed and output weights. Q6_K with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-Q6_K.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-Q6_K.gguf) | Q6_K | 12.04GB | false | Q6_K with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-Q5_K_L.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-Q5_K_L.gguf) | Q5_K_L | 11.91GB | false | Uses Q8_0 for embed and output weights. Q5_K with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-Q4_K_L.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-Q4_K_L.gguf) | Q4_K_L | 11.89GB | false | Uses Q8_0 for embed and output weights. Q4_K with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-Q2_K_L.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-Q2_K_L.gguf) | Q2_K_L | 11.85GB | false | Uses Q8_0 for embed and output weights. Q2_K with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-Q3_K_XL.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-Q3_K_XL.gguf) | Q3_K_XL | 11.78GB | false | Uses Q8_0 for embed and output weights. Q3_K_L with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-Q5_K_M.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-Q5_K_M.gguf) | Q5_K_M | 11.73GB | false | Q5_K_M with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-Q5_K_S.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-Q5_K_S.gguf) | Q5_K_S | 11.72GB | false | Q5_K_S with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-Q4_K_M.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-Q4_K_M.gguf) | Q4_K_M | 11.67GB | false | Q4_K_M with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-Q4_K_S.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-Q4_K_S.gguf) | Q4_K_S | 11.67GB | false | Q4_K_S with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-Q4_1.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-Q4_1.gguf) | Q4_1 | 11.59GB | false | Q4_1 with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-IQ4_NL.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-IQ4_NL.gguf) | IQ4_NL | 11.56GB | false | IQ4_NL with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-IQ4_XS.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-IQ4_XS.gguf) | IQ4_XS | 11.56GB | false | IQ4_XS with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-Q3_K_M.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-Q3_K_M.gguf) | Q3_K_M | 11.56GB | false | Q3_K_M with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-IQ3_M.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-IQ3_M.gguf) | IQ3_M | 11.56GB | false | IQ3_M with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-IQ3_XS.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-IQ3_XS.gguf) | IQ3_XS | 11.56GB | false | IQ3_XS with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-IQ3_XXS.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-IQ3_XXS.gguf) | IQ3_XXS | 11.56GB | false | IQ3_XXS with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-Q2_K.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-Q2_K.gguf) | Q2_K | 11.56GB | false | Q2_K with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-Q3_K_S.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-Q3_K_S.gguf) | Q3_K_S | 11.55GB | false | Q3_K_S with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-IQ2_M.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-IQ2_M.gguf) | IQ2_M | 11.55GB | false | IQ2_M with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-IQ2_S.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-IQ2_S.gguf) | IQ2_S | 11.55GB | false | IQ2_S with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-Q4_0.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-Q4_0.gguf) | Q4_0 | 11.52GB | false | Q4_0 with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-IQ2_XS.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-IQ2_XS.gguf) | IQ2_XS | 11.51GB | false | IQ2_XS with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-IQ2_XXS.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-IQ2_XXS.gguf) | IQ2_XXS | 11.51GB | false | IQ2_XXS with all FFN kept at MXFP4_MOE. |
| [gpt-oss-20b-Q3_K_L.gguf](https://huggingface.co/bartowski/openai_gpt-oss-20b-GGUF/blob/main/openai_gpt-oss-20b-Q3_K_L.gguf) | Q3_K_L | 11.49GB | false | Q3_K_L with all FFN kept at MXFP4_MOE. |
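
To check the FFN claim for yourself, here is a small sketch using the `gguf` Python package; this tooling is an assumption on my part, not something this README ships, and it needs a `gguf` release recent enough to know the MXFP4 tensor type:

```python
# Sketch: list the quantization type of each FFN tensor in a downloaded quant.
# Assumes `pip install gguf` and a version recent enough to know MXFP4.
from gguf import GGUFReader

reader = GGUFReader("openai_gpt-oss-20b-Q4_K_M.gguf")  # any file from the table
for tensor in reader.tensors:
    if "ffn" in tensor.name:  # feed-forward tensors
        # Expected: these all report MXFP4, while attention/embed tensors
        # reflect the quant type named in the table.
        print(tensor.name, tensor.tensor_type.name)
```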

## Embed/output weights