2.4-2.5 bpw request?

#1 opened by Michund10s9

Heya, any chance we can get a 2.4/2.5 bpw quant? That puts the model just past the sharp inflection point in KL divergence (at least for Devstral: https://huggingface.co/turboderp/Devstral-2-123B-Instruct-2512-exl3), and it's also about the right size to fit in 80 GB of VRAM.

[KL divergence vs. bpw plot from the linked Devstral-2 exl3 model card]
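For anyone sanity-checking the VRAM math, here's a rough back-of-the-envelope sketch. The 123B parameter count is just the Devstral example above, not this model, and it only counts the quantized weights (no KV cache, activations, or framework overhead):

```python
def weight_gib(n_params: float, bpw: float) -> float:
    """Approximate size of the quantized weights in GiB.

    Ignores KV cache, activations, and runtime overhead, so the real
    footprint will be somewhat higher.
    """
    return n_params * bpw / 8 / 1024**3

# Example: a 123B-parameter model at 2.5 bpw
print(f"{weight_gib(123e9, 2.5):.1f} GiB")  # ~35.8 GiB of weights alone
```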

I'm going to do some optimized versions shortly.

Optimized quants have been added. Let me know if there are any other sizes you'd like to see.

Thanks for putting this together, @MikeRoz :)
