Update README.md
README.md CHANGED

```diff
@@ -24,7 +24,7 @@ This model was built with deepseek-ai DeepSeek-R1 model by applying [AMD-Quark](
 
 # Model Quantization
 
-The model was quantized from [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1) using [AMD-Quark](https://quark.docs.amd.com/latest/index.html). Both weights and activations were quantized to MXFP4 format
+The model was quantized from [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1) using [AMD-Quark](https://quark.docs.amd.com/latest/index.html). Both weights and activations were quantized to MXFP4 format.
 
 **Preprocessing requirement:**

@@ -37,7 +37,6 @@ cd Quark/examples/torch/language_modeling/llm_ptq/
 python3 quantize_quark.py --model_dir $MODEL_DIR \
     --quant_scheme w_mxfp4_a_mxfp4 \
     --group_size 32 \
-    --kv_cache_dtype fp8 \
     --num_calib_data 128 \
     --exclude_layers "lm_head" \
     --multi_device \
```
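For context on what the `w_mxfp4_a_mxfp4` scheme with `--group_size 32` means, here is a minimal toy sketch of MXFP4-style quantization: each group of 32 values shares one power-of-two scale, and each element is rounded to the nearest FP4 (E2M1) representable value. This is an illustrative assumption-laden sketch, not AMD-Quark's actual implementation; the function names are hypothetical.

```python
import math

# Representable non-negative magnitudes of FP4 E2M1 (sign stored separately).
FP4_LEVELS = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_mxfp4_block(block):
    """Toy MXFP4 quantization of one group (e.g. 32 values):
    returns a shared power-of-two scale and FP4-rounded elements."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    # Pick a power-of-two scale so the largest magnitude fits in FP4's max (6.0).
    scale = 2.0 ** math.ceil(math.log2(amax / 6.0))
    quantized = []
    for x in block:
        # Round the scaled magnitude to the nearest FP4 level, then restore the sign.
        mag = min(FP4_LEVELS, key=lambda lvl: abs(abs(x) / scale - lvl))
        quantized.append(math.copysign(mag, x) if mag else 0.0)
    return scale, quantized

def dequantize_mxfp4_block(scale, quantized):
    """Reconstruct approximate values from the shared scale and FP4 elements."""
    return [scale * q for q in quantized]
```

With weights and activations both stored this way (per the `w_mxfp4_a_mxfp4` flag), each 32-element group costs 4 bits per element plus one shared 8-bit exponent.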