JunHowie committed on
Commit
bbe7387
·
verified ·
1 Parent(s): a7b7390

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +5 -5
README.md CHANGED
@@ -8,12 +8,12 @@ tags:
8
  - vLLM
9
  - AWQ
10
  base_model:
11
- - MiniMax/MiniMax-M2.5
12
  base_model_relation: quantized
13
 
14
  ---
15
  # MiniMax-M2.5-AWQ
16
- Base model: [MiniMax/MiniMax-M2.5](https://huggingface.co/MiniMax/MiniMax-M2.5)
17
 
18
  This repo quantizes the model using data-free quantization (no calibration dataset required).
19
 
@@ -44,7 +44,7 @@ export VLLM_USE_FLASHINFER_SAMPLER=0
44
  export OMP_NUM_THREADS=4
45
 
46
  vllm serve \
47
- __YOUR_PATH__/tclf90/MiniMax-M2.5-AWQ \
48
  --served-model-name MY_MODEL \
49
  --swap-space 16 \
50
  --max-num-seqs 32 \
@@ -73,8 +73,8 @@ vllm serve \
73
 
74
  ### 【Model Download】
75
  ```python
76
- from modelscope import snapshot_download
77
- snapshot_download('tclf90/MiniMax-M2.5-AWQ', cache_dir="your_local_path")
78
  ```
79
 
80
  ### 【Overview】
 
8
  - vLLM
9
  - AWQ
10
  base_model:
11
+ - MiniMaxAI/MiniMax-M2.5
12
  base_model_relation: quantized
13
 
14
  ---
15
  # MiniMax-M2.5-AWQ
16
+ Base model: [MiniMaxAI/MiniMax-M2.5](https://huggingface.co/MiniMaxAI/MiniMax-M2.5)
17
 
18
  This repo quantizes the model using data-free quantization (no calibration dataset required).
19
 
 
44
  export OMP_NUM_THREADS=4
45
 
46
  vllm serve \
47
+ __YOUR_PATH__/QuantTrio/MiniMax-M2.5-AWQ \
48
  --served-model-name MY_MODEL \
49
  --swap-space 16 \
50
  --max-num-seqs 32 \
 
73
 
74
  ### 【Model Download】
75
  ```python
76
+ from huggingface_hub import snapshot_download
77
+ snapshot_download('QuantTrio/MiniMax-M2.5-AWQ', cache_dir="your_local_path")
78
  ```
79
 
80
  ### 【Overview】