Added evaluation metrics
README.md CHANGED
@@ -19,6 +19,31 @@ Only the weights of the linear operators within `language_model` transformers blocks

The model checkpoint is saved in [compressed_tensors](https://github.com/neuralmagic/compressed-tensors) format.
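
A quick way to inspect that metadata locally, assuming the `huggingface_hub` CLI is installed, is to pull just the checkpoint's `config.json`, whose `quantization_config` block carries the compressed-tensors description of the scheme (4-bit weights, group size 128, per the model name):

```bash
# Download only config.json and print it; huggingface-cli echoes the local
# path of the downloaded file, which we pass to cat.
cat "$(huggingface-cli download ISTA-DASLab/gemma-3-27b-it-GPTQ-4b-128g config.json)"
```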

## Evaluation

This model was evaluated on the OpenLLM v1 benchmarks. Model outputs were generated with the `vLLM` engine.

| Model                      | ArcC   | GSM8k  | Hellaswag | MMLU   | TruthfulQA-mc2 | Winogrande | Average | Recovery |
|----------------------------|:------:|:------:|:---------:|:------:|:--------------:|:----------:|:-------:|:--------:|
| gemma-3-27b-it             | 0.7491 | 0.9181 | 0.8582    | 0.7742 | 0.6222         | 0.7908     | 0.7854  | 1.0000   |
| gemma-3-27b-it-INT4 (this) | 0.7415 | 0.9174 | 0.8496    | 0.7662 | 0.6160         | 0.7956     | 0.7810  | 0.9944   |
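
Recovery here is the quantized model's average score as a fraction of the baseline's: 0.7810 / 0.7854 ≈ 0.9944, i.e. the INT4 checkpoint retains roughly 99.4% of the unquantized model's average accuracy.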

## Reproduction

The results were obtained using the following commands:

```bash
MODEL=ISTA-DASLab/gemma-3-27b-it-GPTQ-4b-128g
MODEL_ARGS="pretrained=$MODEL,max_model_len=4096,tensor_parallel_size=1,dtype=auto,gpu_memory_utilization=0.80"

lm_eval \
  --model vllm \
  --model_args $MODEL_ARGS \
  --tasks openllm \
  --batch_size auto
```
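
These commands assume an environment with `lm-evaluation-harness` and its vLLM backend; the `openllm` task group bundles the six benchmarks reported in the table above. A minimal setup sketch (versions unpinned):

```bash
# Installing lm_eval with the vllm extra pulls in the vLLM engine used above.
pip install "lm_eval[vllm]"
```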

## Usage

* To use the model in `transformers`, update the package to a stable release with Gemma3 support:
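
  For example, upgrading to the latest stable release should be sufficient (a minimal sketch; pin a specific version if you need reproducibility):

  ```bash
  # Recent stable releases of transformers include the Gemma3 architecture.
  pip install -U transformers
  ```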