Update README.md
README.md

````diff
@@ -90,10 +90,19 @@ The model was evaluated using [SGLang](https://docs.sglang.ai/) and [lm-evaluati
 
 ### Reproduction
 
-The
+The AIME24 result was obtained using [SGLang](https://docs.sglang.ai/), while the GSM8K result was obtained using [vLLM](https://docs.vllm.ai/en/latest/). All evaluations were conducted via a forked [lm-evaluation-harness](https://github.com/BowenBao/lm-evaluation-harness/tree/cot).
 
 ### AIME24
 ```
+# Launching server
+python3 -m sglang.launch_server \
+    --model /data/DeepSeek-R1-WMXFP4-AMXFP4-Scale-UINT8-Attn-MoE-Quant/ \
+    --tp 8 \
+    --trust-remote-code \
+    --n-share-experts-fusion 8 \
+    --disable-radix-cache
+
+# Evaluating
 lm_eval --model local-completions \
     --model_args model=amd/DeepSeek-R1-MXFP4-Preview,base_url=http://localhost:30000/v1/completions,num_concurrent=999999,timeout=999999,tokenized_requests=False,max_length=32000,temperature=0.6,top_p=0.95 \
     --tasks aime24 \
````
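For reference, the comma-separated `--model_args` string in the `lm_eval` invocation decomposes into key/value pairs roughly as sketched below. `parse_model_args` is an illustrative helper, not the harness's actual parser; the real one additionally coerces value types (booleans, numbers), whereas this naive sketch leaves every value as a string.

```python
def parse_model_args(arg_string: str) -> dict:
    """Split a comma-separated key=value string into a dict (illustrative only)."""
    pairs = (item.split("=", 1) for item in arg_string.split(",") if item)
    return {key: value for key, value in pairs}

model_args = parse_model_args(
    "model=amd/DeepSeek-R1-MXFP4-Preview,"
    "base_url=http://localhost:30000/v1/completions,"
    "num_concurrent=999999,timeout=999999,"
    "tokenized_requests=False,max_length=32000,"
    "temperature=0.6,top_p=0.95"
)
print(model_args["base_url"])     # endpoint of the locally launched server
print(model_args["temperature"])  # sampling temperature passed through to requests
```

Note that `base_url` points at the SGLang server started above on port 30000, and the very large `num_concurrent`/`timeout` values effectively disable client-side throttling for the benchmark run.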