Commit 03e01fe
1 Parent(s): 2e9cc51
docs: link to bench_results.json in Benchmarks section
README.md CHANGED
@@ -83,6 +83,7 @@ mlx_lm.generate --model "abnormalmapstudio/Qwen3-Next-80B-A3B-Thinking-mxfp4-mlx
 ## Benchmarks
 - Environment: Apple Silicon (isolated runs; one model in memory at a time).
 - Script: `scripts/bench/qwen_mxfp4_vs_int4.py` with `--runs 1 --max-new 256`.
+- Full JSON: [bench_results.json](./bench_results.json)
 - Results (representative, single pass):
 - `abnormalmapstudio/Qwen3-Next-80B-A3B-Thinking-mxfp4-mlx`
 - gen_tok_s: ≈ 37.5 tok/s; ttft: ≈ 2.58 s; mem_active: ≈ 42.36 GB
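To put the reported figures in context, the following is a rough sketch of how they might combine into an end-to-end latency estimate. It assumes that `gen_tok_s` measures decode throughput only (excluding time to first token) and that the run emits all 256 tokens allowed by `--max-new`; neither is confirmed by the README. The function name `estimate_total_latency_s` is hypothetical and is not part of the benchmark script.

```python
def estimate_total_latency_s(max_new_tokens: int, gen_tok_s: float, ttft_s: float) -> float:
    """Estimate wall-clock time for one generation pass.

    Assumption (not confirmed by the source): gen_tok_s excludes the
    time to first token, so the two contributions simply add up.
    """
    if gen_tok_s <= 0:
        raise ValueError("gen_tok_s must be positive")
    decode_time_s = max_new_tokens / gen_tok_s
    return ttft_s + decode_time_s


# Figures taken from the Benchmarks section (approximate, single pass).
estimate = estimate_total_latency_s(max_new_tokens=256, gen_tok_s=37.5, ttft_s=2.58)
print(f"estimated total latency: {estimate:.2f} s")
```

Under these assumptions the estimate comes out to roughly 9.4 s for a 256-token generation. The values recorded in `bench_results.json` remain the authoritative source.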