Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation Paper • 2511.14993 • Published 19 days ago • 222
One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models Paper • 2511.10629 • Published 24 days ago • 122
PAN: A World Model for General, Interactable, and Long-Horizon World Simulation Paper • 2511.09057 • Published 26 days ago • 75
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale Paper • 2508.10711 • Published Aug 14 • 144
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization Paper • 2507.14683 • Published Jul 19 • 134
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 549
Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations Paper • 2506.18898 • Published Jun 23 • 33
June 2025 - Open works from the Chinese community Collection 29 items • Updated 17 days ago • 7
Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation Paper • 2506.09350 • Published Jun 11 • 48
SeedVR Collection A diffusion transformer model for high-resolution image and video restoration. • 9 items • Updated Aug 19 • 9
SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training Paper • 2506.05301 • Published Jun 5 • 56
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models Paper • 2505.10554 • Published May 15 • 120
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published Apr 11 • 130
FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation Paper • 2502.05179 • Published Feb 7 • 24
SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration Paper • 2501.01320 • Published Jan 2 • 12