update README
- README.md +50 -14
- README_CN.md +46 -13
README.md
CHANGED
@@ -40,10 +40,10 @@ HunyuanVideo-1.5 is a video generation model that delivers top-tier quality with…
 <a href="https://hunyuan.tencent.com/video/zh?tabIndex=0" target="_blank"><img src=https://img.shields.io/badge/Official%20Site-333399.svg?logo=homepage height=22px></a>
 <a href=https://huggingface.co/tencent/HunyuanVideo-1.5 target="_blank"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Models-d96902.svg height=22px></a>
 <a href=https://github.com/Tencent-Hunyuan/HunyuanVideo-1.5 target="_blank"><img src=https://img.shields.io/badge/Page-bb8a2e.svg?logo=github height=22px></a>
-<a href="https://github.com/Tencent-Hunyuan/HunyuanVideo-1.5/blob/…
+<a href="https://github.com/Tencent-Hunyuan/HunyuanVideo-1.5/blob/report/HunyuanVideo_1_5.pdf" target="_blank"><img src=https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv height=22px></a>
 <a href=https://x.com/TencentHunyuan target="_blank"><img src=https://img.shields.io/badge/Hunyuan-black.svg?logo=x height=22px></a>
 <a href="https://doc.weixin.qq.com/doc/w3_AXcAcwZSAGgCNACVygLxeQjyn4FYS?scode=AJEAIQdfAAoSfXnTj0AAkA-gaeACk" target="_blank"><img src=https://img.shields.io/badge/📚-PromptHandBook-blue.svg?logo=book height=22px></a> <br/>
-<a href="…
+<a href="./ComfyUI/README.md" target="_blank"><img src=https://img.shields.io/badge/ComfyUI-blue.svg?logo=book height=22px></a>
 <a href="https://github.com/ModelTC/LightX2V" target="_blank"><img src=https://img.shields.io/badge/LightX2V-yellow.svg?logo=book height=22px></a>

 </div>
@@ -67,7 +67,9 @@ HunyuanVideo-1.5 is a video generation model that delivers top-tier quality with…

 If you develop/use HunyuanVideo-1.5 in your projects, you're welcome to let us know.

-- **ComfyUI** - [ComfyUI](https://github.com/comfyanonymous/ComfyUI): A powerful and modular diffusion model GUI with a graph/nodes interface. ComfyUI supports HunyuanVideo-1.5 with various engineering optimizations for fast inference.
+- **ComfyUI** - [ComfyUI](https://github.com/comfyanonymous/ComfyUI): A powerful and modular diffusion model GUI with a graph/nodes interface. ComfyUI supports HunyuanVideo-1.5 with various engineering optimizations for fast inference. We provide a [ComfyUI Usage Guide](./ComfyUI/README.md) for HunyuanVideo-1.5.
+
+- **Community-implemented ComfyUI Plugin** - [comfyui_hunyuanvideo_1.5_plugin](https://github.com/yuanyuan-spec/comfyui_hunyuanvideo_1.5_plugin): A community-implemented ComfyUI plugin for HunyuanVideo-1.5, offering both simplified and complete node sets for quick usage or deep workflow customization, with built-in automatic model download support.

 - **LightX2V** - [LightX2V](https://github.com/ModelTC/LightX2V): A lightweight and efficient video generation framework that integrates HunyuanVideo-1.5, supporting multiple engineering acceleration techniques for fast inference.
@@ -95,6 +97,7 @@ If you develop/use HunyuanVideo-1.5 in your projects, you're welcome to let us know.
 - [Text to Video](#text-to-video)
 - [Image to Video](#image-to-video)
 - [Command Line Arguments](#command-line-arguments)
+- [Optimal Inference Configurations](#optimal-inference-configurations)
 - [🧱 Model Cards](#-model-cards)
 - [🎬 More Examples](#-more-examples)
 - [📊 Evaluation](#-evaluation)
@@ -157,8 +160,8 @@ pip install -i https://mirrors.tencent.com/pypi/simple/ --upgrade tencentcloud-s…
 ### Step 3: Install Attention Libraries

 * Flash Attention:
-
-Detailed installation instructions are available at [Flash Attention](https://github.com/Dao-AILab/flash-attention).
+Install Flash Attention for faster inference and reduced GPU memory consumption.
+Detailed installation instructions are available at [Flash Attention](https://github.com/Dao-AILab/flash-attention).

 * Flex-Block-Attention:
 flex-block-attn is only required for sparse attention to achieve faster inference and can be installed with the following command:
@@ -169,6 +172,8 @@ Detailed installation instructions are available at [Flash Attention](https://gi…
 ```

 * SageAttention:
+To enable SageAttention for faster inference, install it with the following command:
+> **Note**: Enabling SageAttention will automatically disable Flex-Block-Attention.
 ```bash
 git clone https://github.com/cooper1637/SageAttention.git
 cd SageAttention
@@ -223,10 +228,11 @@ OUTPUT_PATH=./outputs/output.mp4
 # Configuration
 N_INFERENCE_GPU=8 # Parallel inference GPU count
 CFG_DISTILLED=true # Inference with CFG distilled model, 2x speedup
-SPARSE_ATTN=…
+SPARSE_ATTN=false # Inference with sparse attention
 SAGE_ATTN=false # Inference with SageAttention
 MODEL_PATH=ckpts # Path to pretrained model
 REWRITE=true # Enable prompt rewriting
+OVERLAP_GROUP_OFFLOADING=true # Only valid when group offloading is enabled; significantly increases CPU memory usage but speeds up inference

 torchrun --nproc_per_node=$N_INFERENCE_GPU generate.py \
     --prompt "$PROMPT" \
@@ -239,10 +245,18 @@ torchrun --nproc_per_node=$N_INFERENCE_GPU generate.py \
     --use_sageattn $SAGE_ATTN \
     --rewrite $REWRITE \
     --output_path $OUTPUT_PATH \
+    --overlap_group_offloading $OVERLAP_GROUP_OFFLOADING \
     --save_pre_sr_video \
     --model_path $MODEL_PATH
 ```

+> **Tips:** If your GPU memory is > 14GB but you encounter OOM (Out of Memory) errors during generation, you can try setting the following environment variable before running:
+> ```bash
+> export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True,max_split_size_mb:128
+> ```
+
 ### Command Line Arguments

 | Argument | Type | Required | Default | Description |
@@ -264,6 +278,7 @@ torchrun --nproc_per_node=$N_INFERENCE_GPU generate.py \
 | `--sparse_attn` | bool | No | `false` | Enable sparse attention for faster inference (~1.5-2x speedup, requires H-series GPUs, auto-enables CFG distilled, use `--sparse_attn` or `--sparse_attn true` to enable) |
 | `--offloading` | bool | No | `true` | Enable CPU offloading (use `--offloading false` or `--offloading 0` to disable for faster inference if GPU memory allows) |
 | `--group_offloading` | bool | No | `None` | Enable group offloading (default: None, automatically enabled if offloading is enabled. Use `--group_offloading` or `--group_offloading true/1` to enable, `--group_offloading false/0` to disable) |
+| `--overlap_group_offloading` | bool | No | `true` | Enable overlap group offloading (default: true). Significantly increases CPU memory usage but speeds up inference. Use `--overlap_group_offloading` or `--overlap_group_offloading true/1` to enable, `--overlap_group_offloading false/0` to disable |
 | `--dtype` | str | No | `bf16` | Data type for transformer: `bf16` (faster, lower memory) or `fp32` (better quality, slower, higher memory) |
 | `--use_sageattn` | bool | No | `false` | Enable SageAttention (use `--use_sageattn` or `--use_sageattn true/1` to enable, `--use_sageattn false/0` to disable) |
 | `--sage_blocks_range` | str | No | `0-53` | SageAttention blocks range (e.g., `0-5` or `0,1,2,3,4,5`) |
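As a worked combination of the offloading flags documented above, here is a minimal sketch of a memory-constrained single-GPU run; the prompt and paths are placeholders, only flags from this table are used, and other required arguments from the full example earlier may also apply.

```bash
# Minimal sketch of a memory-constrained single-GPU run (placeholder prompt/paths).
# Keeps CPU offloading on, disables overlap offloading to save CPU RAM,
# and repeats the allocator setting from the OOM tip above.
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True,max_split_size_mb:128
torchrun --nproc_per_node=1 generate.py \
    --prompt "A corgi surfing on a sunset beach" \
    --offloading true \
    --group_offloading true \
    --overlap_group_offloading false \
    --output_path ./outputs/output.mp4 \
    --model_path ckpts
```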
@@ -271,22 +286,43 @@ torchrun --nproc_per_node=$N_INFERENCE_GPU generate.py \

 **Note:** Use `--nproc_per_node` to specify the number of GPUs. For example, `--nproc_per_node=8` uses 8 GPUs.

+### Optimal Inference Configurations
+
+The following table provides the optimal inference configurations (CFG scale, embedded CFG scale, flow shift, and inference steps) for each model to achieve the best generation quality:
+
+| Model | CFG Scale | Embedded CFG Scale | Flow Shift | Inference Steps |
+|-------|-----------|--------------------|------------|-----------------|
+| 480p T2V | 6 | None | 5 | 50 |
+| 480p I2V | 6 | None | 5 | 50 |
+| 720p T2V | 6 | None | 9 | 50 |
+| 720p I2V | 6 | None | 7 | 50 |
+| 480p T2V CFG Distilled | 1 | None | 5 | 50 |
+| 480p I2V CFG Distilled | 1 | None | 5 | 50 |
+| 720p T2V CFG Distilled | 1 | None | 9 | 50 |
+| 720p I2V CFG Distilled | 1 | None | 7 | 50 |
+| 720p T2V CFG Distilled Sparse | 1 | None | 7 | 50 |
+| 720p I2V CFG Distilled Sparse | 1 | None | 9 | 50 |
+| 480→720 SR Step Distilled | 1 | None | 2 | 6 |
+| 720→1080 SR Step Distilled | 1 | None | 2 | 8 |
+
+**Please note that the CFG distilled models we provide must use 50 inference steps to generate correct results.**
+
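To connect this table to the CLI, a hypothetical sketch follows. This README never names the flags that set the CFG scale, flow shift, or step count, so `--cfg_scale`, `--flow_shift`, and `--num_inference_steps` below are assumed names only; check the actual argument names in `generate.py` before using them.

```bash
# Hypothetical sketch applying the 720p T2V row (CFG scale 6, flow shift 9, 50 steps).
# The three tuning flags below are assumed names, not confirmed by this README.
torchrun --nproc_per_node=8 generate.py \
    --prompt "$PROMPT" \
    --cfg_scale 6 \
    --flow_shift 9 \
    --num_inference_steps 50 \
    --output_path ./outputs/output.mp4 \
    --model_path ckpts
```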
 ## 🧱 Model Cards
 |ModelName| Download |
 |-|---------------------------|
 |HunyuanVideo-1.5-480P-T2V|[480P-T2V](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_t2v) |
 |HunyuanVideo-1.5-480P-I2V |[480P-I2V](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_i2v) |
-|HunyuanVideo-1.5-480P-T2V-distill | [480P-T2V-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_t2v_distilled) |
-|HunyuanVideo-1.5-480P-I2V-distill |[480P-I2V-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_i2v_distilled) |
+|HunyuanVideo-1.5-480P-T2V-cfg-distill | [480P-T2V-cfg-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_t2v_distilled) |
+|HunyuanVideo-1.5-480P-I2V-cfg-distill |[480P-I2V-cfg-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_i2v_distilled) |
 |HunyuanVideo-1.5-720P-T2V|[720P-T2V](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_t2v) |
 |HunyuanVideo-1.5-720P-I2V |[720P-I2V](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_i2v) |
-|HunyuanVideo-1.5-720P-T2V-…
-|HunyuanVideo-1.5-720P-I2V-…
-|HunyuanVideo-1.5-720P-T2V-sparse-…
-|HunyuanVideo-1.5-720P-I2V-sparse-…
-|HunyuanVideo-1.5-720P-sr |[720P-sr](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_sr_distilled) |
-|HunyuanVideo-1.5-1080P-sr |[1080P-sr](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/1080p_sr_distilled) |
+|HunyuanVideo-1.5-720P-T2V-cfg-distill| Coming soon |
+|HunyuanVideo-1.5-720P-I2V-cfg-distill |[720P-I2V-cfg-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_i2v_distilled) |
+|HunyuanVideo-1.5-720P-T2V-sparse-cfg-distill| Coming soon |
+|HunyuanVideo-1.5-720P-I2V-sparse-cfg-distill |[720P-I2V-sparse-cfg-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_i2v_distilled_sparse) |
+|HunyuanVideo-1.5-720P-sr-step-distill |[720P-sr](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_sr_distilled) |
+|HunyuanVideo-1.5-1080P-sr-step-distill |[1080P-sr](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/1080p_sr_distilled) |
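To fetch one of these checkpoints locally, a sketch using the `huggingface-cli` tool follows; the `--include` pattern mirrors the repository paths linked above, and the `ckpts` target matches the `MODEL_PATH` default used earlier. Whether `generate.py` expects exactly this directory layout is an assumption here.

```bash
# Sketch: download only the 480p T2V transformer weights from the model repo.
# Assumes generate.py reads weights from this layout under ckpts/ (unverified).
pip install -U "huggingface_hub[cli]"
huggingface-cli download tencent/HunyuanVideo-1.5 \
    --include "transformer/480p_t2v/*" \
    --local-dir ckpts
```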
README_CN.md
CHANGED
@@ -24,10 +24,10 @@ As a lightweight video generation model, HunyuanVideo-1.5 needs only 8.3B parameters to…
 <a href="https://hunyuan.tencent.com/video/zh?tabIndex=0" target="_blank"><img src=https://img.shields.io/badge/Official%20Site-333399.svg?logo=homepage height=22px></a>
 <a href=https://huggingface.co/tencent/HunyuanVideo-1.5 target="_blank"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Models-d96902.svg height=22px></a>
 <a href=https://github.com/Tencent-Hunyuan/HunyuanVideo-1.5 target="_blank"><img src=https://img.shields.io/badge/Page-bb8a2e.svg?logo=github height=22px></a>
-<a href="https://github.com/Tencent-Hunyuan/HunyuanVideo-1.5/blob/…
+<a href="https://github.com/Tencent-Hunyuan/HunyuanVideo-1.5/blob/report/HunyuanVideo_1_5.pdf" target="_blank"><img src=https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv height=22px></a>
 <a href=https://x.com/TencentHunyuan target="_blank"><img src=https://img.shields.io/badge/Hunyuan-black.svg?logo=x height=22px></a>
 <a href="https://doc.weixin.qq.com/doc/w3_AXcAcwZSAGgCNACVygLxeQjyn4FYS?scode=AJEAIQdfAAoSfXnTj0AAkA-gaeACk" target="_blank"><img src=https://img.shields.io/badge/📚-PromptHandBook-blue.svg?logo=book height=22px></a> <br/>
-<a href="…
+<a href="./ComfyUI/README.md" target="_blank"><img src=https://img.shields.io/badge/ComfyUI-blue.svg?logo=book height=22px></a>
 <a href="https://github.com/ModelTC/LightX2V" target="_blank"><img src=https://img.shields.io/badge/LightX2V-yellow.svg?logo=book height=22px></a>

 </div>
@@ -51,6 +51,8 @@ As a lightweight video generation model, HunyuanVideo-1.5 needs only 8.3B parameters to…
 If you use or develop HunyuanVideo-1.5 in your projects, you're welcome to let us know.

 - **ComfyUI** - [ComfyUI](https://github.com/comfyanonymous/ComfyUI): A powerful and modular diffusion model GUI with a node-based workflow. ComfyUI supports HunyuanVideo-1.5 and provides various engineering optimizations for fast inference.
+We provide a [ComfyUI Usage Guide](./ComfyUI/README.md) for HunyuanVideo-1.5.
+- **Community-implemented ComfyUI Plugin** - [comfyui_hunyuanvideo_1.5_plugin](https://github.com/yuanyuan-spec/comfyui_hunyuanvideo_1.5_plugin): A community-implemented ComfyUI plugin for HunyuanVideo-1.5 that offers both simplified and complete node sets for quick use or deep workflow customization, with built-in automatic model download.

 - **LightX2V** - [LightX2V](https://github.com/ModelTC/LightX2V): A lightweight and efficient video generation framework that integrates HunyuanVideo-1.5 and supports multiple engineering acceleration techniques for fast inference.
@@ -77,6 +79,7 @@ As a lightweight video generation model, HunyuanVideo-1.5 needs only 8.3B parameters to…
 - [🔑 Usage](#-usage)
 - [Video Generation](#video-generation)
 - [Command Line Arguments](#command-line-arguments)
+- [Optimal Inference Configurations](#optimal-inference-configurations)
 - [🧱 Model Cards](#-model-cards)
 - [🎬 More Examples](#-more-examples)
 - [📊 Evaluation](#-evaluation)
@@ -140,7 +143,7 @@ pip install -i https://mirrors.tencent.com/pypi/simple/ --upgrade tencentcloud-s…
 ### Step 3: Install Attention Libraries

 * Flash Attention:
-
+Install Flash Attention for faster inference and lower GPU memory consumption.
 Detailed installation instructions are available at [Flash Attention](https://github.com/Dao-AILab/flash-attention).

 * Flex-Block-Attention:
@@ -152,7 +155,8 @@ pip install -i https://mirrors.tencent.com/pypi/simple/ --upgrade tencentcloud-s…
 ```

 * SageAttention:
-
+To enable SageAttention for faster inference, install it with the following command:
+> **Note**: Enabling SageAttention will automatically disable Flex-Block-Attention.
 ```bash
 git clone https://github.com/cooper1637/SageAttention.git
 cd SageAttention
@@ -211,10 +215,11 @@ OUTPUT_PATH=./outputs/output.mp4
 # Configuration
 N_INFERENCE_GPU=8 # Number of GPUs for parallel inference
 CFG_DISTILLED=true # Inference with the CFG distilled model, 2x speedup
-SPARSE_ATTN=…
+SPARSE_ATTN=false # Inference with sparse attention
 SAGE_ATTN=false # Inference with SageAttention
 MODEL_PATH=ckpts # Path to the pretrained model
 REWRITE=true # Enable prompt rewriting
+OVERLAP_GROUP_OFFLOADING=true # Only valid when group offloading is enabled; significantly increases CPU memory usage but speeds up inference

 torchrun --nproc_per_node=$N_INFERENCE_GPU generate.py \
     --prompt "$PROMPT" \
@@ -227,10 +232,16 @@ torchrun --nproc_per_node=$N_INFERENCE_GPU generate.py \
     --use_sageattn $SAGE_ATTN \
     --rewrite $REWRITE \
     --output_path $OUTPUT_PATH \
+    --overlap_group_offloading $OVERLAP_GROUP_OFFLOADING \
     --save_pre_sr_video \
     --model_path $MODEL_PATH
 ```

+> **Tips:** If your GPU memory is > 14GB but you encounter OOM (Out of Memory) errors during generation, you can try setting the following environment variable before running:
+> ```bash
+> export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True,max_split_size_mb:128
+> ```
+
 ### Command Line Arguments

 | Argument | Type | Required | Default | Description |
@@ -252,6 +263,7 @@ torchrun --nproc_per_node=$N_INFERENCE_GPU generate.py \
 | `--sparse_attn` | bool | No | `false` | Enable sparse attention for faster inference (~1.5-2x speedup, requires H-series GPUs, auto-enables CFG distillation; use `--sparse_attn` or `--sparse_attn true` to enable) |
 | `--offloading` | bool | No | `true` | Enable CPU offloading (use `--offloading false` or `--offloading 0` to disable; faster if GPU memory allows) |
 | `--group_offloading` | bool | No | `None` | Enable group offloading (default: None; automatically enabled if offloading is enabled. Use `--group_offloading` or `--group_offloading true/1` to enable, `--group_offloading false/0` to disable) |
+| `--overlap_group_offloading` | bool | No | `true` | Enable overlapped group offloading (default: true). Significantly increases CPU memory usage but speeds up inference. Use `--overlap_group_offloading` or `--overlap_group_offloading true/1` to enable, `--overlap_group_offloading false/0` to disable |
 | `--dtype` | str | No | `bf16` | Data type for the transformer: `bf16` (faster, lower memory) or `fp32` (better quality, slower, higher memory) |
 | `--use_sageattn` | bool | No | `false` | Enable SageAttention (use `--use_sageattn` or `--use_sageattn true/1` to enable, `--use_sageattn false/0` to disable) |
 | `--sage_blocks_range` | str | No | `0-53` | SageAttention block range (e.g., `0-5` or `0,1,2,3,4,5`) |
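As a usage note for the two SageAttention flags documented above, the sketch below restricts SageAttention to roughly the first half of the 54 transformer blocks implied by the `0-53` default; the prompt and paths are placeholders, and other required arguments from the full example earlier may also apply.

```bash
# Sketch: enable SageAttention on blocks 0-26 only (placeholder prompt/paths).
torchrun --nproc_per_node=8 generate.py \
    --prompt "A corgi surfing on a sunset beach" \
    --use_sageattn true \
    --sage_blocks_range 0-26 \
    --output_path ./outputs/output.mp4 \
    --model_path ckpts
```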
@@ -259,22 +271,43 @@ torchrun --nproc_per_node=$N_INFERENCE_GPU generate.py \

 **Note:** Use `--nproc_per_node` to specify the number of GPUs. For example, `--nproc_per_node=8` uses 8 GPUs.

+### Optimal Inference Configurations
+
+The following table provides the optimal inference configurations (CFG scale, embedded CFG scale, flow shift, and inference steps) for each model to achieve the best generation quality:
+
+| Model | CFG Scale | Embedded CFG Scale | Flow Shift | Inference Steps |
+|-------|-----------|--------------------|------------|-----------------|
+| 480p T2V | 6 | None | 5 | 50 |
+| 480p I2V | 6 | None | 5 | 50 |
+| 720p T2V | 6 | None | 9 | 50 |
+| 720p I2V | 6 | None | 7 | 50 |
+| 480p T2V CFG Distilled | 1 | None | 5 | 50 |
+| 480p I2V CFG Distilled | 1 | None | 5 | 50 |
+| 720p T2V CFG Distilled | 1 | None | 9 | 50 |
+| 720p I2V CFG Distilled | 1 | None | 7 | 50 |
+| 720p T2V CFG Distilled Sparse | 1 | None | 7 | 50 |
+| 720p I2V CFG Distilled Sparse | 1 | None | 9 | 50 |
+| 480→720 SR Step Distilled | 1 | None | 2 | 6 |
+| 720→1080 SR Step Distilled | 1 | None | 2 | 8 |
+
+**Please note that the CFG distilled models we provide require 50 inference steps to produce correct results.**
+
 ## 🧱 Model Cards
 |ModelName| Download |
 |-|---------------------------|
 |HunyuanVideo-1.5-480P-T2V|[480P-T2V](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_t2v) |
 |HunyuanVideo-1.5-480P-I2V |[480P-I2V](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_i2v) |
-|HunyuanVideo-1.5-480P-T2V-distill | [480P-T2V-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_t2v_distilled) |
-|HunyuanVideo-1.5-480P-I2V-distill |[480P-I2V-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_i2v_distilled) |
+|HunyuanVideo-1.5-480P-T2V-cfg-distill | [480P-T2V-cfg-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_t2v_distilled) |
+|HunyuanVideo-1.5-480P-I2V-cfg-distill |[480P-I2V-cfg-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_i2v_distilled) |
 |HunyuanVideo-1.5-720P-T2V|[720P-T2V](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_t2v) |
 |HunyuanVideo-1.5-720P-I2V |[720P-I2V](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_i2v) |
-|HunyuanVideo-1.5-720P-T2V-…
-|HunyuanVideo-1.5-720P-I2V-…
-|HunyuanVideo-1.5-720P-T2V-sparse-…
-|HunyuanVideo-1.5-720P-I2V-sparse-…
-|HunyuanVideo-1.5-720P-sr |[720P-sr](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_sr_distilled) |
-|HunyuanVideo-1.5-1080P-sr |[1080P-sr](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/1080p_sr_distilled) |
+|HunyuanVideo-1.5-720P-T2V-cfg-distill| Coming soon |
+|HunyuanVideo-1.5-720P-I2V-cfg-distill |[720P-I2V-cfg-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_i2v_distilled) |
+|HunyuanVideo-1.5-720P-T2V-sparse-cfg-distill| Coming soon |
+|HunyuanVideo-1.5-720P-I2V-sparse-cfg-distill |[720P-I2V-sparse-cfg-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_i2v_distilled_sparse) |
+|HunyuanVideo-1.5-720P-sr-step-distill |[720P-sr](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_sr_distilled) |
+|HunyuanVideo-1.5-1080P-sr-step-distill |[1080P-sr](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/1080p_sr_distilled) |