Commit f8a62be by linzhao-amd · verified · 1 Parent(s): ce0cfef

Update README.md

Files changed (1): README.md +88 -0
@@ -48,6 +48,94 @@ python3 quantize_quark.py --model_dir $MODEL_DIR \
  ### Use with SGLang

  This model can be deployed efficiently using the [SGLang](https://docs.sglang.ai/) backend.
+ ## Evaluation
+
+ The model was evaluated using the [SGLang](https://docs.sglang.ai/) and [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) frameworks.
+
+ ### Accuracy
+
+ <table>
+ <tr>
+ <td><strong>Benchmark</strong></td>
+ <td><strong>DeepSeek-R1</strong></td>
+ <td><strong>DeepSeek-R1-MXFP4-ASQ (this model)</strong></td>
+ <td><strong>Recovery</strong></td>
+ </tr>
+ <tr>
+ <td>AIME24</td>
+ <td>78.0</td>
+ <td>76.0</td>
+ <td>97.44%</td>
+ </tr>
+ <tr>
+ <td>MMLU_COT</td>
+ <td>79.90</td>
+ <td>79.65</td>
+ <td>99.69%</td>
+ </tr>
+ <tr>
+ <td>GSM8K</td>
+ <td>95.81</td>
+ <td>95.42</td>
+ <td>99.59%</td>
+ </tr>
+ </table>
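The table does not state how the Recovery column is computed. It appears to be the quantized score expressed as a percentage of the baseline DeepSeek-R1 score; that formula is an assumption, although it reproduces all three reported values. A minimal sketch (the `recovery` helper and the `scores` dictionary are illustrative names, not part of the model card):

```python
# Assumption: recovery = quantized_score / baseline_score * 100.
# Scores are copied from the accuracy table: (DeepSeek-R1, DeepSeek-R1-MXFP4-ASQ).
scores = {
    "AIME24": (78.0, 76.0),
    "MMLU_COT": (79.90, 79.65),
    "GSM8K": (95.81, 95.42),
}


def recovery(baseline: float, quantized: float) -> float:
    """Return the quantized score as a percentage of the baseline score."""
    return quantized / baseline * 100.0


for name, (baseline, quantized) in scores.items():
    print(f"{name}: {recovery(baseline, quantized):.2f}%")
```

Rounded to two decimals, this gives 97.44%, 99.69%, and 99.59%, matching the Recovery column.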
+
+ ### Reproduction
+
+ The results were obtained using [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) with its native GSM8K task and a custom AIME24 task. The MMLU_COT run uses the task name `mmlu_cot`.
+
+ ### AIME24
+ ```
+ lm_eval --model local-completions \
+   --model_args model=amd/DeepSeek-R1-MXFP4-ASQ,base_url=http://localhost:30000/v1/completions,num_concurrent=999999,timeout=999999,tokenized_requests=False,max_length=32000,temperature=0.6,top_p=0.95 \
+   --tasks aime24 \
+   --num_fewshot 0 \
+   --gen_kwargs "do_sample=True,temperature=0.6,top_p=0.95,max_tokens=32000" \
+   --batch_size auto \
+   --log_samples \
+   --output_path output_data/aime24 2>&1 | tee logs/aime24.log
+ ```
+
+ ### MMLU_COT
+ ```
+ lm_eval --model local-completions \
+   --model_args model=amd/DeepSeek-R1-MXFP4-ASQ,base_url=http://localhost:30000/v1/completions,num_concurrent=999999,timeout=999999,tokenized_requests=False,max_length=32000,temperature=0.6,top_p=0.95 \
+   --tasks mmlu_cot \
+   --num_fewshot 0 \
+   --gen_kwargs "do_sample=True,temperature=0.6,top_p=0.95,max_tokens=32000" \
+   --batch_size auto \
+   --log_samples \
+   --output_path output_data/mmlu_cot 2>&1 | tee logs/mmlu_cot.log
+ ```
+
+ ### GSM8K
+ ```
+ lm_eval --model local-completions \
+   --model_args model=amd/DeepSeek-R1-MXFP4-ASQ,base_url=http://localhost:30000/v1/completions,num_concurrent=999999,timeout=999999,tokenized_requests=False,max_length=8096 \
+   --tasks gsm8k \
+   --num_fewshot 5 \
+   --batch_size auto \
+   --log_samples \
+   --output_path output_data/gsm8k 2>&1 | tee logs/gsm8k.log
+ ```
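The three commands share most of their flags and differ only in task name, few-shot count, context length, and whether sampling arguments are passed. As a rough sketch, they could be generated from one table of per-task settings. The `build_lm_eval_args` helper and the `TASKS` dictionary are hypothetical names introduced here for illustration; every flag and value is copied from the commands in this section, and the output directory is derived from the task name.

```python
import shlex

# Hypothetical convenience wrapper; not part of lm-evaluation-harness.
# All flag names and values below are copied from the README commands.
MODEL = "amd/DeepSeek-R1-MXFP4-ASQ"
BASE_URL = "http://localhost:30000/v1/completions"
GEN_KWARGS = "do_sample=True,temperature=0.6,top_p=0.95,max_tokens=32000"

TASKS = {
    "aime24": {"num_fewshot": 0, "max_length": 32000, "sampling": True},
    "mmlu_cot": {"num_fewshot": 0, "max_length": 32000, "sampling": True},
    "gsm8k": {"num_fewshot": 5, "max_length": 8096, "sampling": False},
}


def build_lm_eval_args(task: str) -> list[str]:
    """Build the lm_eval argument list for one of the tasks in TASKS."""
    cfg = TASKS[task]
    model_args = [
        f"model={MODEL}",
        f"base_url={BASE_URL}",
        "num_concurrent=999999",
        "timeout=999999",
        "tokenized_requests=False",
        f"max_length={cfg['max_length']}",
    ]
    if cfg["sampling"]:
        # The AIME24 and MMLU_COT commands also pass sampling settings here.
        model_args += ["temperature=0.6", "top_p=0.95"]

    args = [
        "lm_eval", "--model", "local-completions",
        "--model_args", ",".join(model_args),
        "--tasks", task,
        "--num_fewshot", str(cfg["num_fewshot"]),
    ]
    if cfg["sampling"]:
        args += ["--gen_kwargs", GEN_KWARGS]
    args += [
        "--batch_size", "auto",
        "--log_samples",
        "--output_path", f"output_data/{task}",
    ]
    return args


# Print each command with the same log redirection used in the README.
for task in TASKS:
    print(shlex.join(build_lm_eval_args(task)), f"2>&1 | tee logs/{task}.log")
```

The printed lines can be pasted into a shell once the SGLang server is listening on port 30000. Create the `logs/` directory first (for example with `mkdir -p logs`), since `tee` does not create missing directories.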

  # License
  Modifications Copyright(c) 2025 Advanced Micro Devices, Inc. All rights reserved.