iQ quants

#1
by natalie5 - opened

IQ quants are much more accurate below 4 bits; if possible, please include the IQ quants too πŸ₯°

How are they supposed to calibrate it to include imatrix data ¯\\\_(ツ)\_/¯


tbh I have no idea. https://huggingface.co/calcuis/hunyuanimage-gguf/tree/main — this guy has IQ quants for every model

I am sure they do not contain imatrix data; it is possible to make "fake" IQ quants without using any calibration dataset. Real IQ quants are calibrated, but that workflow is really meant for LLMs rather than image and video models. llama.cpp can't convert such models natively, so we've been using hacky patches.


Oh, I am not sure how he does it, but if what you are saying is true, then there's no point in IQ quants for diffusion models, I guess. I think https://huggingface.co/unsloth/LTX-2-GGUF is the best GGUF version available for now.

Unsloth Dynamic 2.0 is a kind of imatrix too, isn't it? πŸ€” It's claimed to be an improvement over the standard imatrix.

if what you are saying is true

For imatrix calibration of an LLM, just a random bunch of text is needed, but diffusion models would need a different calibration approach because they process different inputs.
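For context, this is roughly what imatrix calibration looks like for an LLM with llama.cpp's stock tools (the file names here are placeholders, not from this thread):

```shell
# Compute an importance matrix from a plain-text calibration file;
# for an LLM, calibration.txt can be any representative text corpus.
llama-imatrix -m model-f16.gguf -f calibration.txt -o imatrix.dat

# Quantize to an IQ type using the importance matrix (a "real" IQ quant).
llama-quantize --imatrix imatrix.dat model-f16.gguf model-IQ2_XS.gguf IQ2_XS

# Some IQ types also work without --imatrix (uncalibrated, lower quality),
# though llama-quantize refuses the smallest ones like IQ2_XS/IQ1_S without one.
llama-quantize model-f16.gguf model-IQ4_NL-nocal.gguf IQ4_NL
```

A diffusion model has no obvious text corpus to feed `llama-imatrix`, which is why a different calibration approach would be needed there.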

...then no point in iQ quants for diffusion models

No, they may still be useful: the NL, XS, and XXS quants are smaller than the S ones.
