PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning Paper • 2601.05593 • Published 16 days ago • 79
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published 17 days ago • 205
LLaDA2.0: Scaling Up Diffusion Language Models to 100B Paper • 2512.15745 • Published Dec 10, 2025 • 80
Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction Paper • 2512.04987 • Published Dec 4, 2025 • 80
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 253