about:

Q8_0 and Q5_0_custom static quants for this merge, plus an overall 4.8bpw quant for IK_Llama.cpp and Croco.cpp, targeting 48GB VRAM users.
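As a rough sanity check on the 48GB VRAM target, the weight footprint of a quant is just parameter count times bits-per-weight. The sketch below is an illustrative estimate only (the helper name is made up, and it covers weights alone, not KV cache or runtime overhead); the 8.5 figure reflects Q8_0's per-block scale overhead in GGUF.

```python
# Rough size estimate for a quantized 71B-parameter model at a given
# bits-per-weight (bpw). Weights only; KV cache and activations add more.
def quant_size_gb(params_billions: float, bpw: float) -> float:
    bits = params_billions * 1e9 * bpw
    return bits / 8 / 1e9  # decimal gigabytes

# 4.8bpw overall quant from the card: ~42.6 GB, leaving headroom on 48GB.
print(round(quant_size_gb(71, 4.8), 1))
# Q8_0 effectively stores ~8.5 bpw (8-bit weights + fp16 block scales).
print(round(quant_size_gb(71, 8.5), 1))
```

This is why the 4.8bpw build fits a 48GB card with room for context, while the Q8_0 quant needs to be split across devices or offloaded.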

Downloads last month: 38
Format: GGUF
Model size: 71B params
Architecture: llama

Available quants: 5-bit, 8-bit


Model tree for NexesQuants/Llama-3.3-Nemotron-70B-Instruct-Abliterated-TA_v0.10-iMat-CQ-GGUF