OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation Paper • 2512.08294 • Published Dec 9, 2025 • 18
AlignVid: Training-Free Attention Scaling for Semantic Fidelity in Text-Guided Image-to-Video Generation Paper • 2512.01334 • Published Dec 1, 2025
OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation Paper • 2512.08294 • Published Dec 9, 2025 • 18
Emu3.5: Native Multimodal Models are World Learners Paper • 2510.26583 • Published Oct 30, 2025 • 109
Emu3.5: Native Multimodal Models are World Learners Paper • 2510.26583 • Published Oct 30, 2025 • 109
From Pixels to Words -- Towards Native Vision-Language Primitives at Scale Paper • 2510.14979 • Published Oct 16, 2025 • 67