Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6 • 491
Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents Paper • 2509.09265 • Published Sep 11 • 46
OpenMath Collection A collection of models and datasets introduced in "OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset" • 15 items • Updated 3 days ago • 45
Model Merging Collection Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 248