GGUF quants of: https://huggingface.co/grimjim/gemma-3-12b-it-norm-preserved-biprojected-abliterated
Maximum usable context per quant on an RTX 3060 12GB (F16 k/v cache, no CPU offload); see the loading sketch after the table for enabling the q8_0 k/v cache:
| quant | max ctx | comment |
|---|---|---|
| Q2_K_S | 16k | |
| IQ3_S | 15k | |
| Q3_K_S | 15k | |
| IQ4_XS | 12k | |
| IQ4_NL | 10k | (16k with q8_0 k/v) |
| Q4_K_S | 10k | (16k with q8_0 k/v) |
| Q5_K_S | 8k | (14k with q8_0 k/v) |
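As a concrete example of the q8_0 k/v rows above, here is a minimal sketch of loading one of these quants with llama-cpp-python. The local file name is an assumption (use whichever quant you downloaded), and recent llama.cpp builds require flash attention to quantize the V cache:

```python
from llama_cpp import Llama
import llama_cpp

llm = Llama(
    # Hypothetical file name -- substitute the quant you actually downloaded.
    model_path="gemma-3-12b-it-norm-preserved-biprojected-abliterated-Q4_K_S.gguf",
    n_ctx=16384,                      # 16k fits Q4_K_S only with the q8_0 k/v cache
    n_gpu_layers=-1,                  # keep all layers on the GPU (no CPU offload)
    flash_attn=True,                  # needed before the V cache can be quantized
    type_k=llama_cpp.GGML_TYPE_Q8_0,  # q8_0 K cache
    type_v=llama_cpp.GGML_TYPE_Q8_0,  # q8_0 V cache
)

print(llm("Q: What is GGUF? A:", max_tokens=64)["choices"][0]["text"])
```

Quantizing the k/v cache to q8_0 roughly halves its VRAM footprint versus F16, which is why the 10k limits above stretch to 16k. The llama.cpp CLI equivalents are `-ctk q8_0 -ctv q8_0 -fa`.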
Model tree for RossAscends/grimjim-12B-Gemma-3-it-norm-preserved-biprojected-abliterated-GGUF:
- Base model: google/gemma-3-12b-pt
- Finetuned: google/gemma-3-12b-it