TimeBill: Time-Budgeted Inference for Large Language Models Paper • 2512.21859 • Published 4 days ago • 12
See Less, See Right: Bi-directional Perceptual Shaping For Multimodal Reasoning Paper • 2512.22120 • Published 3 days ago • 11
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion Paper • 2512.17504 • Published 11 days ago • 81
InSight-o3: Empowering Multimodal Foundation Models with Generalized Visual Search Paper • 2512.18745 • Published 9 days ago • 9
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion Paper • 2507.02813 • Published Jul 3 • 60