W4A16 quant
#1, opened by timroethig
Thanks for providing these quants. Are you by chance also working on a W4A16 quant? It could make a lot of sense for a sparse MoE model like this, no?
Yes, other quant schemes (int4 and fp4) are coming very soon.
Thanks for your work.
Hopefully I can run it on two H100s :)
Maybe you need GGUF Q3 or AutoRound W2A16 🥲
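For anyone wanting to check the two-H100 question themselves, here is a rough back-of-the-envelope sketch. It only counts weight storage at a given bit width. The parameter count is a hypothetical placeholder (the thread does not state the model size), the 80 GB per card assumes the H100 80GB variant, and the 10% headroom is an arbitrary assumption. Real formats also store scales and zero points, and inference needs room for the KV cache and activations, so treat the result as a lower bound.

```python
H100_MEMORY_GB = 80  # assumes the 80 GB H100 variant


def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Ignores quantization scales, zero points, and any layers kept
    in higher precision, so the real footprint is somewhat larger.
    """
    return params_billion * bits_per_weight / 8


def fits_on_gpus(params_billion: float, bits_per_weight: float,
                 num_gpus: int = 2, headroom: float = 0.9) -> bool:
    """True if the weights alone fit in `headroom` of total GPU memory.

    `headroom` is an assumed fraction left for weights; the rest is
    reserved for KV cache, activations, and runtime overhead.
    """
    budget_gb = num_gpus * H100_MEMORY_GB * headroom
    return weight_memory_gb(params_billion, bits_per_weight) <= budget_gb


if __name__ == "__main__":
    # Hypothetical size for illustration only; substitute the real count.
    hypothetical_params_billion = 400
    for label, bits in [("W4A16", 4), ("3-bit", 3), ("W2A16", 2)]:
        size = weight_memory_gb(hypothetical_params_billion, bits)
        ok = fits_on_gpus(hypothetical_params_billion, bits)
        print(f"{label}: ~{size:.0f} GB of weights, fits on 2x H100: {ok}")
```

Plugging in the actual parameter count shows quickly whether a 4-bit quant is enough or whether a lower bit width is needed.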