ISTA-DASLab/Kimi-K2-Thinking-GPTQ-2b-32g-experts • 170B • Updated • 55
WUSH: Near-Optimal Adaptive Transforms for LLM Quantization
CAGE: Curvature-Aware Gradient Estimation For Accurate Quantization-Aware Training