Imagine-then-Plan: Agent Learning from Adaptive Lookahead with World Models Paper • 2601.08955 • Published 8 days ago • 12
Flow Equivariant World Models: Memory for Partially Observed Dynamic Environments Paper • 2601.01075 • Published 19 days ago • 5
LSRIF: Logic-Structured Reinforcement Learning for Instruction Following Paper • 2601.06431 • Published 12 days ago • 10
ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback Paper • 2601.10156 • Published 6 days ago • 22
Transition Matching Distillation for Fast Video Generation Paper • 2601.09881 • Published 7 days ago • 31
Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding Paper • 2601.10611 • Published 6 days ago • 25
CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation Paper • 2601.10061 • Published 7 days ago • 29
DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset Paper • 2601.10305 • Published 6 days ago • 35
Beyond Static Tools: Test-Time Tool Evolution for Scientific Reasoning Paper • 2601.07641 • Published 9 days ago • 43
Urban Socio-Semantic Segmentation with Vision-Language Reasoning Paper • 2601.10477 • Published 6 days ago • 152
PhyRPR: Training-Free Physics-Constrained Video Generation Paper • 2601.09255 • Published 7 days ago • 2
AgencyBench: Benchmarking the Frontiers of Autonomous Agents in 1M-Token Real-World Contexts Paper • 2601.11044 • Published 5 days ago • 30
ACoT-VLA: Action Chain-of-Thought for Vision-Language-Action Models Paper • 2601.11404 • Published 5 days ago • 23
Future Optical Flow Prediction Improves Robot Control & Video Generation Paper • 2601.10781 • Published 6 days ago • 16