A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos Paper • 2512.16978 • Published 16 days ago • 4
Intern-S1: A Scientific Multimodal Foundation Model Paper • 2508.15763 • Published Aug 21, 2025 • 259
VideoMathQA: Benchmarking Mathematical Reasoning via Multimodal Understanding in Videos Paper • 2506.05349 • Published Jun 5, 2025 • 24
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM Paper • 2503.04724 • Published Mar 6, 2025 • 72