bowenbaoamd committed
Commit ce0cfef · verified · 1 Parent(s): 359d1ee

Update README.md

Files changed (1):
  1. README.md +1 -2
README.md CHANGED
@@ -24,7 +24,7 @@ This model was built with deepseek-ai DeepSeek-R1 model by applying [AMD-Quark](
 
 # Model Quantization
 
-The model was quantized from [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1) using [AMD-Quark](https://quark.docs.amd.com/latest/index.html). Both weights and activations were quantized to MXFP4 format, and the AutoSmoothQuant algorithm was applied to enhance accuracy.
+The model was quantized from [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1) using [AMD-Quark](https://quark.docs.amd.com/latest/index.html). Both weights and activations were quantized to MXFP4 format.
 
 **Preprocessing requirement:**
 
@@ -37,7 +37,6 @@ cd Quark/examples/torch/language_modeling/llm_ptq/
 python3 quantize_quark.py --model_dir $MODEL_DIR \
     --quant_scheme w_mxfp4_a_mxfp4 \
     --group_size 32 \
-    --kv_cache_dtype fp8 \
     --num_calib_data 128 \
     --exclude_layers "lm_head" \
     --multi_device \
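
For context on the `w_mxfp4_a_mxfp4` scheme in the command above: MXFP4 is the OCP Microscaling 4-bit floating-point format, in which each group of 32 values (matching `--group_size 32`) shares one power-of-two scale and each element is stored as an FP4 E2M1 value. The sketch below is a minimal pure-Python illustration of the quantize–dequantize round trip; the function name is hypothetical, and Quark's real implementation operates on tensors with packed storage, not Python lists.

```python
import math

# FP4 E2M1 representable magnitudes (sign is handled separately)
FP4_E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def mxfp4_quant_dequant(values, group_size=32):
    """Round-trip a list of floats through a simplified MXFP4 encoding.

    Each group of `group_size` values shares one power-of-two scale,
    chosen so the largest magnitude fits under 6.0 (the FP4 E2M1 max);
    each element is then snapped to the nearest representable magnitude.
    """
    out = []
    for start in range(0, len(values), group_size):
        block = values[start:start + group_size]
        amax = max(abs(v) for v in block)
        if amax == 0.0:
            out.extend(0.0 for _ in block)
            continue
        # Shared power-of-two scale: smallest 2**e with amax / 2**e <= 6.0
        scale = 2.0 ** math.ceil(math.log2(amax / 6.0))
        for v in block:
            mag = min(FP4_E2M1, key=lambda c: abs(abs(v) / scale - c))
            out.append(math.copysign(mag * scale, v))
    return out
```

Values exactly representable at the chosen scale (e.g. a block of 3.0s) survive the round trip unchanged; everything else rounds to one of only 16 signed points per block, which is why calibration data and accuracy-enhancement passes matter at this bit width.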