# INTELLECT-3-FP8 - GGUF
This is a GGUF conversion of PrimeIntellect/INTELLECT-3-FP8.
## Conversion Info
- Precision: F16 (half precision)
- Tool: llama.cpp `convert-hf-to-gguf.py`
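For reference, a conversion like this is typically run from a llama.cpp checkout (recent checkouts name the script `convert_hf_to_gguf.py`). The sketch below is a hypothetical reconstruction of such an invocation, not the exact command used here; all local paths are placeholders.

```python
import subprocess

# Hypothetical paths: point these at your llama.cpp checkout and a local
# download of PrimeIntellect/INTELLECT-3-FP8.
subprocess.run(
    [
        "python", "llama.cpp/convert_hf_to_gguf.py",
        "INTELLECT-3-FP8",                    # directory with the HF model files
        "--outfile", "INTELLECT-3-F16.gguf",  # placeholder output name
        "--outtype", "f16",                   # matches the F16 precision noted above
    ],
    check=True,
)
```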
## Usage
Download the GGUF file and load it with llama.cpp or any GGUF-compatible inference engine, as in the sketch below.
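A minimal Python sketch using the `huggingface_hub` and `llama-cpp-python` packages; the GGUF filename is a placeholder, so check the repository's file listing for the actual name.

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download the GGUF file from the Hub; the filename below is a placeholder.
model_path = hf_hub_download(
    repo_id="keypa/INTELLECT-3-FP8-gguf",
    filename="INTELLECT-3-F16.gguf",
)

# Load the model; n_gpu_layers=-1 offloads all layers to the GPU if available.
llm = Llama(model_path=model_path, n_ctx=4096, n_gpu_layers=-1)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```

Equivalently, llama.cpp's own `llama-cli` binary can load the file directly via its `-m <path-to-gguf>` flag.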
## Model tree for keypa/INTELLECT-3-FP8-gguf
- Base model: zai-org/GLM-4.5-Air-Base
- Quantized from base: PrimeIntellect/INTELLECT-3-FP8