John Pope

johndpope

AI & ML interests

None yet

Recent Activity

liked a Space 1 day ago

dream2589632147/Dream-wan2-2-faster-Pro

liked a Space 2 days ago

Tongyi-MAI/Z-Image-Turbo

upvoted a paper 2 days ago

Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield

upvoted 2 papers 2 days ago

Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield

Paper • 2511.22677 • Published 9 days ago • 18

RELIC: Interactive Video World Model with Long-Horizon Memory

Paper • 2512.04040 • Published 3 days ago • 20

upvoted a paper 3 days ago

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published 9 days ago • 145

upvoted a paper 4 days ago

STARFlow-V: End-to-End Video Generative Modeling with Normalizing Flow

Paper • 2511.20462 • Published 11 days ago • 29

upvoted 3 papers 16 days ago

MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation

Paper • 2511.09611 • Published 24 days ago • 68

PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image

Paper • 2511.13648 • Published 19 days ago • 52

Back to Basics: Let Denoising Generative Models Denoise

Paper • 2511.13720 • Published 19 days ago • 63

upvoted a paper 20 days ago

Depth Anything 3: Recovering the Visual Space from Any Views

Paper • 2511.10647 • Published 23 days ago • 92

upvoted a collection 25 days ago

Vision Language Models: 2025 Update

This collection includes all the models, datasets and Spaces mentioned in the blog Vision Language Models: 2025 Update • 67 items • Updated May 12 • 5

upvoted 7 papers about 1 month ago

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 491

MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency

Paper • 2510.25897 • Published Oct 29 • 16

TradingAgents: Multi-Agents LLM Financial Trading Framework

Paper • 2412.20138 • Published Dec 28, 2024 • 14

OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation

Paper • 2410.17799 • Published Oct 23, 2024 • 5

Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing

Paper • 2510.19808 • Published Oct 22 • 28

RAP: Real-time Audio-driven Portrait Animation with Video Diffusion Transformer

Paper • 2508.05115 • Published Aug 7 • 3

Wan-S2V: Audio-Driven Cinematic Video Generation

Paper • 2508.18621 • Published Aug 26 • 19

upvoted a paper about 2 months ago

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Paper • 2404.10667 • Published Apr 16, 2024 • 23

upvoted 2 papers 2 months ago

Dawn of the transformer era in speech emotion recognition: closing the valence gap

Paper • 2203.07378 • Published Mar 14, 2022 • 2

Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation

Paper • 2509.19296 • Published Sep 23 • 23

upvoted a paper 3 months ago

HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning

Paper • 2509.08519 • Published Sep 10 • 128