Thinking with Programming Vision: Towards a Unified View for Thinking with Images Paper • 2512.03746 • Published 4 days ago • 15
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published 5 days ago • 167
Flash-DMD: Towards High-Fidelity Few-Step Image Generation with Efficient Distillation and Joint Reinforcement Learning Paper • 2511.20549 • Published 11 days ago • 23
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published 13 days ago • 239
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published 9 days ago • 145
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning Paper • 2511.16043 • Published 17 days ago • 104
VisPlay: Self-Evolving Vision-Language Models from Images Paper • 2511.15661 • Published 17 days ago • 42
VideoSSR: Video Self-Supervised Reinforcement Learning Paper • 2511.06281 • Published 28 days ago • 24
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published about 1 month ago • 208
UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation Paper • 2510.18701 • Published Oct 21 • 66
VPPO Model Collection SOTA models for multimodal reasoning, fine-tuned with VPPO. Achieves superior performance by focusing on critical visual tokens. • 4 items • Updated 30 days ago • 4
Spotlight on Token Perception for Multimodal Reinforcement Learning Paper • 2510.09285 • Published Oct 10 • 36 • 3