RedHatAI/Mistral-Small-3.1-24B-Instruct-2503-quantized.w8a8

alexmarques committed on Apr 15

Commit 22b5063 · verified · 1 Parent(s): a6e1a12

Update README.md

Files changed (1)

README.md +1 -1

@@ -187,7 +187,7 @@ vLLM also supports OpenAI-compatible serving. See the [documentation](https://do
   # Configure the quantization algorithm and scheme
   recipe = [
-      SmoothQuantModifier(),
+      SmoothQuantModifier(smoothing_strength=0.8),
       GPTQModifier(
           ignore=["language_model.lm_head", "re:vision_tower.*", "re:multi_modal_projector.*"]
           sequential_targets=["MistralDecoderLayer"]
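
The only change in this hunk is the explicit `smoothing_strength=0.8` argument. As a rough illustration of what that knob controls, the sketch below applies the per-channel scaling rule from the SmoothQuant paper, `s_j = max|X_j|**alpha / max|W_j|**(1 - alpha)`, on the assumption that `smoothing_strength` plays the role of `alpha`. The helper name `smoothing_scales` and the sample magnitudes are hypothetical and are not taken from the README or from llm-compressor.

```python
def smoothing_scales(act_absmax, weight_absmax, alpha):
    """Per-channel SmoothQuant-style scales: s_j = a_j**alpha / w_j**(1 - alpha).

    Assumption: `alpha` corresponds to `smoothing_strength` in the recipe.
    """
    return [a ** alpha / w ** (1.0 - alpha)
            for a, w in zip(act_absmax, weight_absmax)]


# Hypothetical per-channel absolute maxima: channel 0 has an activation outlier.
act_absmax = [16.0, 1.0]
weight_absmax = [1.0, 1.0]

results = {}
for alpha in (0.5, 0.8):
    scales = smoothing_scales(act_absmax, weight_absmax, alpha)
    # Activations are divided by the scale, weights are multiplied by it,
    # so the product X @ W is mathematically unchanged.
    smoothed_act = [a / s for a, s in zip(act_absmax, scales)]
    scaled_weight = [w * s for w, s in zip(weight_absmax, scales)]
    results[alpha] = (scales, smoothed_act, scaled_weight)
    print(f"alpha={alpha}: act range {smoothed_act}, weight range {scaled_weight}")
```

Under these assumptions, a larger strength moves more of the outlier's magnitude out of the activations and into the weights, which makes activations easier to quantize at the cost of a wider weight range. The value 0.5 is shown only as a balanced reference point for comparison.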