2.4-2.5 bpw request?

#1 opened by Michund10s9

Heya, any chance we can get a 2.4/2.5 bpw quant? That puts the model just past the sharp inflection point in KL divergence (at least for Devstral: https://huggingface.co/turboderp/Devstral-2-123B-Instruct-2512-exl3), and it's also about the right size to fit in 80 GB of VRAM.

[KL divergence vs. bpw plot from the linked Devstral-2 exl3 model card]
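For anyone sanity-checking the VRAM math, here's a rough back-of-the-envelope sketch. The 123B parameter count is just the Devstral example above, not this model, and it only counts the quantized weights (no KV cache, activations, or framework overhead):

```python
def weight_gib(n_params: float, bpw: float) -> float:
    """Approximate size of the quantized weights in GiB.

    Ignores KV cache, activations, and runtime overhead, so the real
    footprint will be somewhat higher.
    """
    return n_params * bpw / 8 / 1024**3

# Example: a 123B-parameter model at 2.5 bpw
print(f"{weight_gib(123e9, 2.5):.1f} GiB")  # ~35.8 GiB of weights alone
```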

I'm going to do some optimized versions shortly.

Optimized quants have been added. Let me know if there are any other sizes you'd like to see.

Thanks for putting this together, @MikeRoz :)
