LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling Paper • 2511.20785 • Published 13 days ago • 148
Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward Paper • 2511.20561 • Published 13 days ago • 31
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe Paper • 2511.16334 • Published 18 days ago • 91
EditReward Collection EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing • 11 items • Updated Oct 13 • 4
A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports Paper • 2510.02190 • Published Oct 2 • 18
EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing Paper • 2509.26346 • Published Sep 30 • 18
VideoScore2: Think before You Score in Generative Video Evaluation Paper • 2509.22799 • Published Sep 26 • 25
PrismLayers: Open Data for High-Quality Multi-Layer Transparent Image Generative Models Paper • 2505.22523 • Published May 28 • 7
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation Paper • 2503.20672 • Published Mar 26 • 14