Models: 40,249

Active filters: 4-bit

hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4

Text Generation • 2B • Updated Aug 7, 2024 • 8.81k • 40

kaitchup/Mistral-Nemo-Base-2407-AutoRound-GPTQ-sym-4bit

Text Generation • 3B • Updated Aug 26, 2024 • 9 • 1

Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int4

Image-Text-to-Text • 13B • Updated Sep 24, 2024 • 326 • 29

Qwen/Qwen2.5-32B-Instruct-AWQ

Text Generation • 6B • Updated Oct 9, 2024 • 1.23M • 89

Qwen/Qwen2.5-Coder-7B-Instruct-AWQ

Text Generation • 2B • Updated Nov 18, 2024 • 508k • 18

unsloth/Qwen2.5-Coder-7B-bnb-4bit

Text Generation • 4B • Updated Nov 12, 2024 • 14k • 12

unsloth/Llama-3.2-11B-Vision-Instruct-bnb-4bit

Image-to-Text • 6B • Updated Dec 10, 2024 • 512k • 80

SeanScripts/Llama-3.2-11B-Vision-Instruct-nf4

Image-Text-to-Text • 6B • Updated Sep 26, 2024 • 142 • 13

hugging-quants/gemma-2-9b-it-AWQ-INT4

Text Generation • 2B • Updated Oct 17, 2024 • 8.76k • 7

mlx-community/Ministral-8B-Instruct-2410-4bit

1B • Updated Oct 17, 2024 • 772 • 10

Orion-zhen/aya-expanse-8b-AWQ

Text Generation • 3B • Updated Oct 26, 2024 • 67 • 1

AIFunOver/glm-4-9b-chat-openvino-4bit

Updated Nov 9, 2024 • 8 • 1

shuyuej/Llama-3.3-70B-Instruct-GPTQ

11B • Updated Dec 22, 2024 • 1.15k • 6

unsloth/DeepSeek-R1-Distill-Llama-70B-bnb-4bit

Text Generation • 37B • Updated Feb 14 • 10.5k • 24

kaitchup/DeepSeek-R1-Distill-Qwen-14B-AutoRound-GPTQ-4bit

Text Generation • 3B • Updated Jan 27 • 77 • 7

wnma3mz/Janus-Pro-7B-4bit

Any-to-Any • Updated Feb 1 • 101 • 8

mlx-community/Saka-14B-4bit

Text Generation • 2B • Updated Feb 12 • 13 • 2

Qwen/Qwen2.5-VL-72B-Instruct-AWQ

Image-Text-to-Text • 13B • Updated Mar 7 • 54.4k • 69

Qwen/Qwen2.5-VL-7B-Instruct-AWQ

Image-Text-to-Text • 3B • Updated Apr 6 • 161k • 94

unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit

Text Generation • 2B • Updated Jul 31 • 9.07k • 21

mlx-community/Phi-4-mini-instruct-4bit

Text Generation • 0.6B • Updated Mar 5 • 1.69k • 1

ICEPVP8977/Uncensored_Qwen2.5_Coder_7B_4_bit_quantized_Seaftensors

8B • Updated Mar 19 • 6 • 1

gaunernst/gemma-3-27b-it-qat-autoawq

Image-Text-to-Text • 6B • Updated Apr 20 • 11.5k • 12

Qwen/Qwen3-14B-AWQ

Text Generation • 3B • Updated May 21 • 210k • 46

Qwen/Qwen2.5-Omni-7B-GPTQ-Int4

Any-to-Any • 5B • Updated May 15 • 229 • 12

Epitech/gemma3-1b-military_drone

0.7B • Updated May 18 • 2 • 1

Qwen/Qwen3-0.6B-MLX-4bit

Text Generation • 83.9M • Updated Jul 29 • 1.03k • 15

Qwen/Qwen3-4B-MLX-4bit

Text Generation • 0.6B • Updated Aug 29 • 73.5k • 19

Qwen/Qwen3-14B-MLX-4bit

Text Generation • 2B • Updated Jul 7 • 410 • 6

mlx-community/Qwen3-0.6B-4bit-DWQ-053125

Text Generation • 93.2M • Updated Jun 1 • 334 • 1