IQ2_KS works

#7
by coughmedicine - opened

but IMHO not as good as IQ1_KT of GLM 4.6 or 4.7.

Interesting, I'd imagine that for agentic use and tool calls MiniMax-M2.5 might be the better fit?

Though M2.5 does not have any shared experts (shexp) nor dense FFN layers; it is only attention plus routed experts. So with fewer active parameters, and a larger share of those active parameters sitting in the heavily quantized routed-expert tensors, it could be that M2.5 doesn't handle heavy quantization as well as GLM does.
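To make that argument concrete, here's a rough sketch of the effective bits-per-weight of the *active* forward path, assuming attention tensors are kept at higher precision than routed experts. The bit widths and the active-parameter split below are purely hypothetical placeholders, not the actual quant recipe or M2.5's real architecture numbers:

```python
def active_path_bpw(tensor_groups):
    """Weighted-average BPW over the parameters active per token.

    tensor_groups: dict mapping group name -> (active_params, bpw)
    """
    total_bits = sum(n * bpw for n, bpw in tensor_groups.values())
    total_params = sum(n for n, _ in tensor_groups.values())
    return total_bits / total_params

# Hypothetical split: 5B active attention params at 4 BPW,
# 10B active routed-expert params at 2 BPW (illustrative numbers only).
avg = active_path_bpw({"attn": (5e9, 4.0), "routed_exps": (10e9, 2.0)})
print(f"effective active-path BPW: {avg:.3f}")
```

The point being: with no shexp or dense layers to keep at higher precision, more of the active path inherits the aggressive expert quantization, dragging the effective BPW of what actually runs per token down further than the headline file BPW might suggest.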

And I guess with GLM-4.7 smol-IQ1_KT at 82.442 GiB (1.976 BPW) you can probably fit more context, since it uses MLA, so the KV cache is quite efficient.
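As a sanity check on those numbers, the file size and BPW imply the total parameter count (roughly, since some tensors are quantized at different bit widths and there is metadata overhead):

```python
GIB = 1024**3  # GiB in bytes

file_size_gib = 82.442  # smol-IQ1_KT file size from above
bpw = 1.976             # reported bits per weight

# total bits in the file / bits per weight ~= parameter count
params = file_size_gib * GIB * 8 / bpw
print(f"implied total params: {params / 1e9:.1f}B")
```

That lands in the ~358B range, which lines up with the GLM ~355B-class total parameter count, give or take the mixed-precision tensors.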

If only GLM-5 were not so chonky! oof...
