- Let LLMs Break Free from Overthinking via Self-Braking Tuning
  Paper • 2505.14604 • Published • 23
- AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios
  Paper • 2505.16944 • Published • 8
- Training Step-Level Reasoning Verifiers with Formal Verification Tools
  Paper • 2505.15960 • Published • 7
- The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning
  Paper • 2505.15134 • Published • 6
Collections including paper arxiv:2508.07407
- Traffic Light Control with Reinforcement Learning
  Paper • 2308.14295 • Published
- Code2Video: A Code-centric Paradigm for Educational Video Generation
  Paper • 2510.01174 • Published • 33
- A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
  Paper • 2508.07407 • Published • 98

- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 626
- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • Published • 301
- Group Sequence Policy Optimization
  Paper • 2507.18071 • Published • 313
- Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
  Paper • 2509.03867 • Published • 210

- SSRL: Self-Search Reinforcement Learning
  Paper • 2508.10874 • Published • 97
- FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction
  Paper • 2508.11987 • Published • 71
- Deep Think with Confidence
  Paper • 2508.15260 • Published • 88
- A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
  Paper • 2508.07407 • Published • 98

- ExGRPO: Learning to Reason from Experience
  Paper • 2510.02245 • Published • 80
- A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
  Paper • 2508.07407 • Published • 98
- rStar2-Agent: Agentic Reasoning Technical Report
  Paper • 2508.20722 • Published • 116
- Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning
  Paper • 2508.19828 • Published • 7

- A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
  Paper • 2508.07407 • Published • 98
- RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning
  Paper • 2504.20073 • Published • 12
- AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
  Paper • 2508.16153 • Published • 158

- Pruning the Unsurprising: Efficient Code Reasoning via First-Token Surprisal
  Paper • 2508.05988 • Published • 19
- A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
  Paper • 2508.07407 • Published • 98
- Compressing Chain-of-Thought in LLMs via Step Entropy
  Paper • 2508.03346 • Published • 7
- Reinforcement Learning in Vision: A Survey
  Paper • 2508.08189 • Published • 29

- aiXiv: A Next-Generation Open Access Ecosystem for Scientific Discovery Generated by AI Scientists
  Paper • 2508.15126 • Published • 20
- From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery
  Paper • 2508.14111 • Published • 33
- A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
  Paper • 2508.07407 • Published • 98