amd/DeepSeek-R1-MXFP4-Preview

8-bit precision

linzhao-amd committed on Nov 6

Commit c407d8d (verified) · 1 Parent(s): f1328be

Update README.md

Files changed (1)

README.md +2 -2

@@ -41,7 +41,7 @@ python3 quantize_quark.py --model_dir $MODEL_DIR \
                           --exclude_layers "lm_head" \
                           --multi_device \
                           --model_export hf_format \
-                          --output_dir amd/DeepSeek-R1-MXFP4
+                          --output_dir amd/DeepSeek-R1-MXFP4-Preview
 ```
 # Deployment
@@ -96,7 +96,7 @@ The result of AIME24 was obtained using [SGLang](https://docs.sglang.ai/) while
 ```
 # Launching server
 python3 -m sglang.launch_server \
-    --model /data/DeepSeek-R1-WMXFP4-AMXFP4-Scale-UINT8-Attn-MoE-Quant/ \
+    --model amd/DeepSeek-R1-MXFP4-Preview \
     --tp 8  \
     --trust-remote-code  \
     --n-share-experts-fusion 8 \