Models: 1,015

Active filters: fp8

nerdylive/Meta-Llama-3-8B-Instruct-FP8

Text Generation • 8B • Updated Jul 1, 2024 • 7

FlorianJc/google-gemma-2-9b-it-vllm-fp8

Text Generation • 9B • Updated Jul 17, 2024 • 12 • 1

tranhoangnguyen03/Gemma-2-9B-It-SPPO-Iter3_Q8

Text Generation • 9B • Updated Jul 7, 2024 • 5

Model-SafeTensors/mistralai-Mixtral-8x7B-v0.1

Text Generation • 47B • Updated Jul 7, 2024 • 4

Model-SafeTensors/mistralai-Mixtral-8x22B

Text Generation • 141B • Updated Jul 7, 2024 • 5

FlorianJc/Llama3-ChatQA-1.5-8B-v2-vllm-fp8

Text Generation • 8B • Updated Jul 17, 2024 • 5

FlorianJc/MegaBeam-Mistral-7B-300k-vllm-fp8

Text Generation • 7B • Updated Jul 17, 2024 • 9

RedHatAI/gemma-2-9b-it-FP8

Text Generation • 9B • Updated Sep 22 • 952 • 5

darthhexx/Phi-3-medium-128k-instruct-fp8

Text Generation • 14B • Updated Jul 22, 2024 • 20

muhtasham/TowerInstruct-7B-v0.1-FP8

Text Generation • 7B • Updated Aug 27, 2024 • 7

raja-nectar/Lumimaid-70B-fp8

Text Generation • 71B • Updated Jul 11, 2024 • 4

raja-nectar/Lumimaid-70B-FP8-OAS

Text Generation • 71B • Updated Jul 12, 2024 • 4

Ksgk-fy/maria-v2-fp8-dynamic

Text Generation • 8B • Updated Jul 12, 2024 • 1

Ksgk-fy/maria-v2-fp8-static

Text Generation • 8B • Updated Jul 12, 2024 • 4

Ksgk-fy/maria_v113-fp8-dynamic

Text Generation • 8B • Updated Jul 13, 2024 • 2

Ksgk-fy/maria_v114-fp8-dynamic

Text Generation • 8B • Updated Jul 13, 2024 • 4

Ksgk-fy/maria_v115-fp8-dynamic

Text Generation • 8B • Updated Jul 14, 2024 • 4

RedHatAI/Qwen2-57B-A14B-Instruct-FP8

Text Generation • 57B • Updated Jul 18, 2024 • 517 • 1

nm-testing/Meta-Llama-3-8B-Instruct-FP8-K-V

Text Generation • 8B • Updated Oct 9, 2024 • 7

RedHatAI/DeepSeek-Coder-V2-Lite-Instruct-FP8

Text Generation • 16B • Updated Jul 18, 2024 • 22.9k • 8

RedHatAI/DeepSeek-Coder-V2-Lite-Base-FP8

Text Generation • 16B • Updated Jul 18, 2024 • 21

Rallio67/llama3-70b-exab-fp8

Text Generation • Updated Jul 18, 2024 • 9

mgoin/Mistral-Nemo-Instruct-2407-FP8-Dynamic

Text Generation • 12B • Updated Jul 18, 2024 • 79

mgoin/Mistral-Nemo-Instruct-2407-FP8-KV

Text Generation • 12B • Updated Jul 18, 2024 • 5

RedHatAI/Mistral-Nemo-Instruct-2407-FP8

Text Generation • 12B • Updated Jul 19, 2024 • 3.54k • 18

obamaTeo/llama-finetune-8bit-wiki-252-ver2

Text Generation • 8B • Updated Jul 18, 2024 • 3

FlorianJc/Mistral-Nemo-Instruct-2407-vllm-fp8

Text Generation • 12B • Updated Jul 31, 2024 • 28 • 8

mgoin/Nemotron-4-340B-Instruct-FP8-Dynamic

Text Generation • 341B • Updated Jul 23, 2024 • 6

RedHatAI/DeepSeek-Coder-V2-Base-FP8

Text Generation • 236B • Updated Jul 22, 2024 • 22

RedHatAI/DeepSeek-Coder-V2-Instruct-FP8

Text Generation • 236B • Updated Jul 22, 2024 • 26 • 7