OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published 16 days ago • 320
Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models Paper • 2602.04649 • Published 17 days ago • 12
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger Paper • 2602.08222 • Published 12 days ago • 260
Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance Paper • 2506.06444 • Published Jun 6, 2025 • 73