hecui102 committed on
Commit 05774c0 · verified · 1 Parent(s): 97e5f4e

Update README.md

Files changed (1)
  1. README.md +10 -0
README.md CHANGED
@@ -1,6 +1,16 @@
  ---
  license: apache-2.0
  ---
+ <p align="center"><h1 align="center">
+ Bridging the Last Mile: Deploying Hummingbird-XT for Efficient Video Generation on AMD Consumer-Grade Platforms
+ </h1>
+ </p>
+
+ <p align="center">
+ <h3 align="center"><a href="https://rocm.blogs.amd.com/artificial-intelligence/hummingbirdxt/README.html">Blog</a> | <a href="https://github.com/AMD-AGI/HummingbirdXT">Code</a></h3>
+ </p>
+
+
  In this work, we present **AMD Hummingbird-XT**, an efficient **DiT-based** video generative model with **5B parameters**, designed for high-quality video generation on client-grade GPUs.
 
  Hummingbird-XT is trained from Wan2.2-5B-TI2V using **DMD step distillation** with carefully designed **data curation**, enabling **3-step generation** while preserving high visual fidelity and motion quality. To reduce the computational overhead of high-resolution video decoding in 3D convolution-based VAE decoders, we introduce a **lightweight and efficient VAE decoder** that replaces part of the 3D convolutions with depthwise separable convolutions. Additionally, to extend the length of generated videos, we introduce **Hummingbird-XTX**, an efficient **autoregressive model** for **long-video generation** based on Wan-2.1-1.3B.
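
To give a rough sense of why swapping 3D convolutions for depthwise separable ones lightens the VAE decoder, the sketch below compares weight counts for a single layer. It is only illustrative arithmetic: the README does not say which layers are replaced or how they are factorized, so the channel counts, the kernel size, and both helper functions (`conv3d_params`, `depthwise_separable_conv3d_params`) are assumptions made here, not part of the Hummingbird-XT code.

```python
def conv3d_params(c_in, c_out, k):
    """Weight count of a standard 3D convolution with a k x k x k kernel (bias ignored)."""
    return c_in * c_out * k ** 3


def depthwise_separable_conv3d_params(c_in, c_out, k):
    """Weight count of a depthwise k x k x k conv followed by a 1 x 1 x 1 pointwise conv."""
    depthwise = c_in * k ** 3   # one k^3 filter per input channel
    pointwise = c_in * c_out    # 1x1x1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape, chosen only for illustration.
c_in, c_out, k = 256, 256, 3
standard = conv3d_params(c_in, c_out, k)
separable = depthwise_separable_conv3d_params(c_in, c_out, k)
print(f"standard: {standard}, separable: {separable}, ratio: {standard / separable:.1f}x")
```

Under these assumed sizes the separable layer needs far fewer weights than the dense 3D convolution. The actual savings in the released decoder depend on which convolutions were replaced.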