Models (40,251)

Active filters: 4-bit

NexaAI/Qwen3-1.7B-4bit-MLX

Text Generation • 0.3B • Updated Jul 22 • 16 • 1

unsloth/Devstral-Small-2507-unsloth-bnb-4bit

Text Generation • 24B • Updated Jul 10 • 6.95k • 5

TechxGenus/Qwen3-Coder-480B-A35B-Instruct-AWQ

Text Generation • 70B • Updated Aug 2 • 32 • 1

QuantTrio/GLM-4.5-Air-GPTQ-Int4-Int8Mix

Text Generation • 20B • Updated Sep 5 • 3.29k • 10

Intel/Qwen3-30B-A3B-Thinking-2507-int4-AutoRound

0.6B • Updated Sep 19 • 112 • 11

AlekseyCalvin/QWEN_IMAGE_fp4_w_AbliteratedTE_Diffusers

Text-to-Image • Updated Aug 6 • 51 • 7

mlx-community/gpt-oss-120b-4bit

Text Generation • 117B • Updated Aug 6 • 3.11k • 6

helizac/dots.ocr-4bit

Image-to-Text • 2B • Updated Aug 6 • 515 • 28

mlx-community/Qwen3-4B-Instruct-2507-4bit

Text Generation • 0.6B • Updated Aug 6 • 2.76k • 6

QuantTrio/GLM-4.5V-AWQ

Image-Text-to-Text • 17B • Updated Aug 25 • 2.21k • 19

melvindave/gemma270m-chess-ft

Text Generation • 0.2B • Updated Aug 21 • 7 • 1

QuantTrio/Seed-OSS-36B-Instruct-GPTQ-Int4

Text Generation • 6B • Updated Sep 15 • 95 • 5

unsloth/Qwen3-Next-80B-A3B-Instruct-bnb-4bit

Text Generation • Updated Sep 13 • 21k • 21

QuantTrio/DeepSeek-V3.2-Exp-AWQ-Lite

Text Generation • Updated Oct 1 • 69 • 3

QuantTrio/Qwen3-VL-30B-A3B-Instruct-AWQ

Text Generation • 31B • Updated Oct 8 • 127k • 33

Disty0/FLUX.1-dev-SDNQ-uint4-svd-r32

Text-to-Image • Updated 3 days ago • 216 • 2

unsloth/Qwen3-VL-8B-Instruct-unsloth-bnb-4bit

Image-Text-to-Text • 9B • Updated Oct 31 • 82.6k • 12

samdotci/Qwen3-Reranker-0.6B-mlx-4Bit

Text Ranking • 93.1M • Updated Oct 21 • 79 • 1

Disty0/Qwen-Image-Edit-SDNQ-uint4-svd-r32

Image-to-Image • Updated Nov 5 • 87 • 2

QuantTrio/MiniMax-M2-AWQ

Text Generation • 229B • Updated 4 days ago • 8.5k • 6

unsloth/granite-4.0-h-350m-bnb-4bit

Text Generation • 0.3B • Updated Oct 28 • 104 • 1

ModelCloud/MiniMax-M2-GPTQMODEL-W4A16

Text Generation • 229B • Updated Oct 28 • 629 • 3

sainikhiljuluri/foundation-sec-8b-cve-cybersecurity

Text Generation • 8B • Updated Nov 4 • 15 • 1

mlx-community/Jan-v2-VL-high-4bit-mlx

Updated 22 days ago • 71 • 1

QuantTrio/MiniMax-M2-REAP-162B-A10B-AWQ

Text Generation • 162B • Updated 4 days ago • 165 • 2

Datangtang/lora_lab2_model_1B

1B • Updated 13 days ago • 1

Disty0/FLUX.2-dev-SDNQ-uint4-svd-r32

Updated 3 days ago • 561 • 9

mlx-community/LLaDA2.0-mini-4bit

Text Generation • 16B • Updated 12 days ago • 82 • 1

kofdai/nullai-deepseek-r1-32b

Text Generation • 33B • Updated 12 days ago • 496 • 2

fabricioalmeida/BumbaLM-3B-Instruct-v0.1

Text Generation • 3B • Updated 11 days ago • 22 • 1