swayamsingal committed · Commit bbb8b3e · verified · 1 Parent(s): 933fa99

Add model card with deployment instructions

Files changed (1):
  1. README.md +68 -0
README.md ADDED

---
language: en
tags:
- llm
- compression
- nanoquant
- quantization
- pruning
license: apache-2.0
datasets: []
model-index: []
---

# NanoQuant Compressed Model

## Model Description

This is a compressed version of [tencent/Hunyuan-MT-7B](https://huggingface.co/tencent/Hunyuan-MT-7B), created using NanoQuant, an advanced LLM compression toolkit.

## Compression Details

- **Compression Level**: medium
- **Size Reduction**: 77.0%
- **Techniques Used** (a reproduction sketch follows this list):
  - Quantization: 8-bit
  - Pruning: magnitude
  - LoRA: `{'r': 32, 'alpha': 32, 'dropout': 0.1}`
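
The snippet below is a minimal sketch of how the quantization and LoRA settings above could be expressed with the `transformers` and `peft` libraries; it is not NanoQuant's actual pipeline. The `target_modules` list is an illustrative assumption, and the magnitude-pruning step is omitted for brevity.

```python
# Sketch only: 8-bit loading plus a LoRA adapter matching the
# r/alpha/dropout values listed above. Requires transformers,
# peft, and bitsandbytes; NanoQuant's internals may differ.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(load_in_8bit=True)  # 8-bit weight quantization

model = AutoModelForCausalLM.from_pretrained(
    "tencent/Hunyuan-MT-7B",        # original base model
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=32,                           # rank, matching the card above
    lora_alpha=32,                  # scaling factor
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],  # illustrative assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```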

## Deployment Options

### Option 1: Direct Usage with Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the compressed weights and tokenizer from the NanoQuant
# output directory (or the corresponding Hub repository id).
model = AutoModelForCausalLM.from_pretrained("tencent_Hunyuan-MT-7B_nanoquant_medium")
tokenizer = AutoTokenizer.from_pretrained("tencent_Hunyuan-MT-7B_nanoquant_medium")
```
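
As a quick smoke test, a short generation run confirms the weights load correctly. The prompt below is illustrative; consult the original model card for the exact translation prompt template Hunyuan-MT expects.

```python
# Assumes `model` and `tokenizer` from the snippet above.
inputs = tokenizer("Translate the following text to English: 你好,世界", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```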

### Option 2: Ollama Deployment

This model is also available for Ollama:

```bash
ollama pull nanoquant-tencent-Hunyuan-MT-7B:medium
```
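
Once pulled, the model can be started in an interactive session (assuming the tag above is available to your Ollama instance):

```bash
# Start an interactive chat session with the compressed model.
ollama run nanoquant-tencent-Hunyuan-MT-7B:medium
```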

## Performance Characteristics

Due to the compression, this model:

- Requires significantly less storage space
- Has faster loading times
- Uses less memory during inference (a quick check is sketched below)
- Maintains most of the original model's capabilities
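
To verify the memory claim on your own hardware, `transformers` provides a footprint helper; this is a simple check, not part of the NanoQuant toolchain.

```python
from transformers import AutoModelForCausalLM

# Load the compressed model as in Option 1, then report the in-memory
# size of its parameters and buffers.
model = AutoModelForCausalLM.from_pretrained("tencent_Hunyuan-MT-7B_nanoquant_medium")
print(f"Memory footprint: {model.get_memory_footprint() / 1024**3:.2f} GiB")
```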

## Original Model

For information about the original model, please visit [tencent/Hunyuan-MT-7B](https://huggingface.co/tencent/Hunyuan-MT-7B).

## License

This model is released under the Apache 2.0 license.

## NanoQuant

NanoQuant is an advanced model compression system that achieves up to 99.95% size reduction while maintaining model performance. Learn more in the [NanoQuant Documentation](https://github.com/nanoquant/nanoquant).