FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning • Paper • 2510.22543 • Published Oct 26 • 10
cerebras/Kimi-Linear-REAP-35B-A3B-Instruct • Text Generation • 35B • Updated about 1 month ago • 6.35k • 51