# allura-quants/allura-org_Tlacuilo-12B-EXL3
EXL3 quants of allura-org/Tlacuilo-12B, quantized with exllamav3.
## Quants
| Quant | BPW | Head Bits | Size (GB) |
|---|---|---|---|
| 2.5_H6 | 2.5 | 6 | 5.28 |
| 3.0_H6 | 3.0 | 6 | 5.96 |
| 3.5_H6 | 3.5 | 6 | 6.65 |
| 4.0_H6 | 4.0 | 6 | 7.33 |
| 4.5_H6 | 4.5 | 6 | 8.01 |
| 5.0_H6 | 5.0 | 6 | 8.69 |
| 6.0_H6 | 6.0 | 6 | 10.05 |
| 8.0_H8 | 8.0 | 8 | 12.95 |
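For choosing a quant to fit a VRAM budget, note that the H6 file sizes in the table grow linearly with bits per weight (BPW): roughly a fixed overhead (output head, embeddings, metadata) plus the quantized weights at BPW/8 bytes each. A small sketch that fits that line from two table rows and checks it against the rest:

```python
# H6 rows from the quant table above: BPW -> size in GB.
rows = {2.5: 5.28, 3.0: 5.96, 3.5: 6.65, 4.0: 7.33, 4.5: 8.01, 5.0: 8.69, 6.0: 10.05}

# Fit size = intercept + slope * bpw from the smallest and largest-but-one rows.
slope = (rows[5.0] - rows[2.5]) / (5.0 - 2.5)   # GB added per extra bit-per-weight
intercept = rows[2.5] - slope * 2.5             # fixed overhead (head, embeddings, metadata)

# The linear model reproduces every H6 row to within ~0.01 GB.
for bpw, size in rows.items():
    estimate = intercept + slope * bpw
    assert abs(estimate - size) < 0.02, (bpw, estimate, size)

# slope * 8 gives the implied number of weights stored at the chosen BPW.
print(f"~{slope * 8:.1f}B quantized weights, ~{intercept:.2f} GB fixed overhead")
```

So an unlisted BPW can be estimated the same way, e.g. roughly `intercept + slope * bpw` GB.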
## How to Download and Use Quants
You can download a specific quant by targeting its branch (revision) with the Hugging Face CLI.
1. Install the Hugging Face CLI:

   ```shell
   pip install -U "huggingface_hub[cli]"
   ```
2. Download a specific quant:

   ```shell
   huggingface-cli download allura-quants/allura-org_Tlacuilo-12B-EXL3 --revision "5.0bpw_H6" --local-dir ./
   ```
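The same download can be scripted from Python with `huggingface_hub.snapshot_download`, which fetches every file in the given revision. A minimal sketch; the revision name mirrors the CLI example above:

```python
from huggingface_hub import snapshot_download

def fetch_quant(revision: str, local_dir: str = "./") -> str:
    """Download one quant branch of the repo; returns the local path."""
    return snapshot_download(
        repo_id="allura-quants/allura-org_Tlacuilo-12B-EXL3",
        revision=revision,      # e.g. "5.0bpw_H6"
        local_dir=local_dir,
    )

if __name__ == "__main__":
    print(fetch_quant("5.0bpw_H6"))
```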
EXL3 quants can be run with any inference client that supports EXL3, such as TabbyAPI. Refer to the client's documentation for setup instructions.
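Once a server such as TabbyAPI is running with the quant loaded, it exposes an OpenAI-compatible HTTP API. A minimal stdlib-only sketch; the base URL, port, and auth header are assumptions based on a default local TabbyAPI setup, so check your own configuration:

```python
import json
import urllib.request

def complete(prompt: str, api_key: str,
             base_url: str = "http://127.0.0.1:5000") -> str:
    """Send a completion request to a local OpenAI-compatible endpoint."""
    payload = {"prompt": prompt, "max_tokens": 128}
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # key from your server config
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["choices"][0]["text"]
```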
## Model tree for allura-quants/allura-org_Tlacuilo-12B-EXL3

- Base model: mistralai/Mistral-Nemo-Base-2407
- Finetuned: LatitudeGames/Muse-12B
- Finetuned: allura-org/Tlacuilo-12B