amd/DeepSeek-R1-MXFP4-Preview

8-bit precision

linzhao-amd committed on Oct 20

Commit f8a62be · verified · 1 Parent(s): ce0cfef

Update README.md

Files changed (1)

README.md +88 -0

@@ -48,6 +48,94 @@ python3 quantize_quark.py --model_dir $MODEL_DIR \
 ### Use with SGLang
 This model can be deployed efficiently using the [SGLang](https://docs.sglang.ai/) backend.
+## Evaluation
+The model was evaluated using the [SGLang](https://docs.sglang.ai/) and [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) frameworks.
+### Accuracy
+<table>
+  <tr>
+   <td><strong>Benchmark</strong>
+   </td>
+   <td><strong>DeepSeek-R1</strong>
+   </td>
+   <td><strong>DeepSeek-R1-MXFP4-ASQ (this model)</strong>
+   </td>
+   <td><strong>Recovery</strong>
+   </td>
+  </tr>
+  <tr>
+   <td>AIME24
+   </td>
+   <td>78.0
+   </td>
+   <td>76.0
+   </td>
+   <td>97.44%
+   </td>
+  </tr>
+  <tr>
+   <td>MMLU_COT
+   </td>
+   <td>79.90
+   </td>
+   <td>79.65
+   </td>
+   <td>99.69%
+   </td>
+  </tr>
+  <tr>
+   <td>GSM8K
+   </td>
+   <td>95.81
+   </td>
+   <td>95.42
+   </td>
+   <td>99.59%
+   </td>
+  </tr>
+</table>
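The Recovery column can be re-derived from the two score columns. The sketch below assumes Recovery is the quantized score divided by the baseline score, expressed as a percentage. That definition is not stated in the README, but it is consistent with all three rows. The `recovery` helper is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: recovery = quantized score / baseline score * 100.
# This definition is an assumption inferred from the table values.
recovery() {
    # $1 = baseline (DeepSeek-R1) score, $2 = quantized (MXFP4) score
    awk -v base="$1" -v quant="$2" 'BEGIN { printf "%.2f%%\n", quant / base * 100 }'
}

recovery 78.0 76.0      # AIME24
recovery 79.90 79.65    # MMLU_COT
recovery 95.81 95.42    # GSM8K
```

Under that assumption the three calls print 97.44%, 99.69%, and 99.59%, matching the table.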
+### Reproduction
+The results were obtained using [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) with the native evaluation task GSM8K and the custom task AIME24.
+### AIME24
+```
+lm_eval --model local-completions \
+    --model_args model=amd/DeepSeek-R1-MXFP4-ASQ,base_url=http://localhost:30000/v1/completions,num_concurrent=999999,timeout=999999,tokenized_requests=False,max_length=32000,temperature=0.6,top_p=0.95 \
+    --tasks aime24 \
+    --num_fewshot 0 \
+    --gen_kwargs "do_sample=True,temperature=0.6,top_p=0.95,max_tokens=32000" \
+    --batch_size auto \
+    --log_samples \
+    --output_path output_data/aime24 2>&1 | tee logs/aime24.log
+```
+### MMLU_COT
+```
+lm_eval --model local-completions \
+    --model_args model=amd/DeepSeek-R1-MXFP4-ASQ,base_url=http://localhost:30000/v1/completions,num_concurrent=999999,timeout=999999,tokenized_requests=False,max_length=32000,temperature=0.6,top_p=0.95 \
+    --tasks mmlu_cot \
+    --num_fewshot 0 \
+    --gen_kwargs "do_sample=True,temperature=0.6,top_p=0.95,max_tokens=32000" \
+    --batch_size auto \
+    --log_samples \
+    --output_path output_data/mmlu_cot 2>&1 | tee logs/mmlu_cot.log
+```
+### GSM8K
+```
+lm_eval --model local-completions \
+    --model_args model=amd/DeepSeek-R1-MXFP4-ASQ,base_url=http://localhost:30000/v1/completions,num_concurrent=999999,timeout=999999,tokenized_requests=False,max_length=8096 \
+    --tasks gsm8k \
+    --num_fewshot 5 \
+    --batch_size auto \
+    --log_samples \
+    --output_path output_data/gsm8k 2>&1 | tee logs/gsm8k.log
+```
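The three commands above pipe their output through `tee` into `logs/`, and `tee` does not create a missing directory, so a short preflight step may save a failed run. The sketch below is not part of the README. It assumes the server behind `base_url` exposes an OpenAI-compatible `/v1/models` route on port 30000, and `BASE_URL` is a hypothetical variable introduced here.

```shell
# Create the directories the lm_eval commands write to.
# tee fails if logs/ does not already exist.
mkdir -p logs output_data

# Assumption: the server exposes an OpenAI-compatible /v1/models route.
# BASE_URL is a hypothetical variable, defaulting to the host and port
# used in the commands above.
BASE_URL="${BASE_URL:-http://localhost:30000}"
if curl -sf --max-time 5 "${BASE_URL}/v1/models" > /dev/null; then
    echo "server reachable at ${BASE_URL}"
else
    echo "server not reachable at ${BASE_URL}" >&2
fi
```

Running this once before the evaluation commands is enough. The reachability check only reports status and does not stop the script.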
 # License
 Modifications Copyright(c) 2025 Advanced Micro Devices, Inc. All rights reserved.