OpenGPT-4o-Image: A Comprehensive Dataset for Advanced Image Generation and Editing Paper • 2509.24900 • Published Sep 29 • 53
Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning Paper • 2509.23873 • Published Sep 28 • 67
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Paper • 2510.06308 • Published Oct 7 • 53
Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer Paper • 2510.06590 • Published Oct 8 • 72
Cache-to-Cache: Direct Semantic Communication Between Large Language Models Paper • 2510.03215 • Published Oct 3 • 97
Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense Paper • 2510.07242 • Published Oct 8 • 30
From What to Why: A Multi-Agent System for Evidence-based Chemical Reaction Condition Reasoning Paper • 2509.23768 • Published Sep 28 • 48
MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization Paper • 2510.08540 • Published Oct 9 • 109
DivMerge: A divergence-based model merging method for multi-tasking Paper • 2509.02108 • Published Sep 2 • 25
Reconstruction Alignment Improves Unified Multimodal Models Paper • 2509.07295 • Published Sep 8 • 40
Visual Representation Alignment for Multimodal Large Language Models Paper • 2509.07979 • Published Sep 9 • 83
Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search Paper • 2509.07969 • Published Sep 9 • 59
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning Paper • 2509.07980 • Published Sep 9 • 101
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning Paper • 2509.02479 • Published Sep 2 • 83