Add files using upload-large-folder tool
README.md CHANGED
@@ -18,11 +18,9 @@ This model was quantized to 3-bit using DWQ with mlx-lm version **0.28.4**.
 | Dataset | `allenai/tulu-3-sft-mixture` |
 | Initial validation loss | 0.146 |
 | Final validation loss | 0.088 |
-| Relative KL reduction | ≈40 %
+| Relative KL reduction | ≈40 % |
 | Tokens processed | ≈1.09 M |
 
-<img src="minimax_3e-7.png" width="600" alt="Training loss curve">
-
 ## MMLU-PRO Benchmark
 
 | Model | Score |