abnormalmapstudio/Qwen3-Next-80B-A3B-Thinking-mxfp4-mlx

@@ -83,6 +83,7 @@ mlx_lm.generate --model "abnormalmapstudio/Qwen3-Next-80B-A3B-Thinking-mxfp4-mlx
 ## Benchmarks
 - Environment: Apple Silicon (isolated runs; one model in memory at a time).
 - Script: `scripts/bench/qwen_mxfp4_vs_int4.py` with `--runs 1 --max-new 256`.
+- Full JSON: [bench_results.json](./bench_results.json)
 - Results (representative, single pass):
   - `abnormalmapstudio/Qwen3-Next-80B-A3B-Thinking-mxfp4-mlx`
     - gen_tok_s: ≈ 37.5 tok/s; ttft: ≈ 2.58 s; mem_active: ≈ 42.36 GB
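
As an aid to reading the `ttft` and `gen_tok_s` figures, here is a minimal sketch of how such metrics could be derived from raw timestamps. It assumes `ttft` is the delay from request start to the first generated token and `gen_tok_s` is decode throughput after that first token. These definitions are an assumption and may not match what `scripts/bench/qwen_mxfp4_vs_int4.py` computes. The names `RunSummary` and `summarize_run` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    ttft_s: float     # seconds from request start to first generated token
    gen_tok_s: float  # decode throughput in tokens per second


def summarize_run(start_s: float, first_token_s: float,
                  end_s: float, n_tokens: int) -> RunSummary:
    """Hypothetical helper, not taken from the benchmark script.

    Assumed definitions:
      ttft      = first_token_s - start_s
      gen_tok_s = (n_tokens - 1) / (end_s - first_token_s)
    """
    if n_tokens < 1:
        raise ValueError("need at least one generated token")
    if not (start_s <= first_token_s <= end_s):
        raise ValueError("timestamps must be in chronological order")

    ttft = first_token_s - start_s
    decode_time = end_s - first_token_s
    decode_tokens = n_tokens - 1  # the first token is attributed to ttft
    # Guard against a zero-length decode window (single-token runs).
    rate = decode_tokens / decode_time if decode_time > 0 else 0.0
    return RunSummary(ttft_s=ttft, gen_tok_s=rate)


# Illustrative timestamps only, not measured values.
summary = summarize_run(start_s=0.0, first_token_s=2.0, end_s=12.0, n_tokens=101)
print(summary)
```

With a single pass (`--runs 1`), any such figure is one sample, so run-to-run variance is not captured.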