- Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
  Paper • 2506.01939 • Published • 187
- ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration
  Paper • 2511.21689 • Published • 94
- PretrainZero: Reinforcement Active Pretraining
  Paper • 2512.03442 • Published • 39
wenzel zhang
wenzel94
AI & ML interests
None yet
Recent Activity
Updated a collection 1 day ago: LLM RL
Updated a collection 2 days ago: LLM RL
Upvoted a paper 2 days ago: ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration