Depth Anything 3: Recovering the Visual Space from Any Views Paper • 2511.10647 • Published 24 days ago • 92
Time-to-Move: Training-Free Motion Controlled Video Generation via Dual-Clock Denoising Paper • 2511.08633 • Published 28 days ago • 53
CoMAS: Co-Evolving Multi-Agent Systems via Interaction Rewards Paper • 2510.08529 • Published Oct 9 • 18
DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder Paper • 2509.25182 • Published Sep 29 • 37
Sana Collection ⚡️Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer • 21 items • Updated Sep 13 • 97
FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games Paper • 2509.01052 • Published Sep 1 • 20
ELV-Halluc: Benchmarking Semantic Aggregation Hallucinations in Long Video Understanding Paper • 2508.21496 • Published Aug 29 • 54
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey Paper • 2509.02547 • Published Sep 2 • 225
FastMesh: Efficient Artistic Mesh Generation via Component Decoupling Paper • 2508.19188 • Published Aug 26 • 17
CineScale: Free Lunch in High-Resolution Cinematic Visual Generation Paper • 2508.15774 • Published Aug 21 • 20
T2I-ReasonBench: Benchmarking Reasoning-Informed Text-to-Image Generation Paper • 2508.17472 • Published Aug 24 • 26
TexVerse: A Universe of 3D Objects with High-Resolution Textures Paper • 2508.10868 • Published Aug 14 • 17
Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off Paper • 2508.04825 • Published Aug 6 • 58
Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving Paper • 2507.23726 • Published Jul 31 • 114