W4A16 quant

#1
by timroethig - opened

Thanks for providing these quants. Are you by chance also working on a W4A16 quant? It could make a lot of sense for a sparse MoE model like this, no?

Red Hat AI org

Yes, other quant schemes (int4 and fp4) are coming very soon.

Thanks for your work.
Hopefully I can run it on two H100s :)

Maybe you need GGUF Q3 or AutoRound W2A16 🥲
