KevinNg99 committed
Commit f75dc22 · 1 Parent(s): 5bd8781

update README

Files changed (2)
  1. README.md +50 -14
  2. README_CN.md +46 -13
README.md CHANGED
@@ -40,10 +40,10 @@ HunyuanVideo-1.5 is a video generation model that delivers top-tier quality with
 <a href="https://hunyuan.tencent.com/video/zh?tabIndex=0" target="_blank"><img src=https://img.shields.io/badge/Official%20Site-333399.svg?logo=homepage height=22px></a>
 <a href=https://huggingface.co/tencent/HunyuanVideo-1.5 target="_blank"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Models-d96902.svg height=22px></a>
 <a href=https://github.com/Tencent-Hunyuan/HunyuanVideo-1.5 target="_blank"><img src=https://img.shields.io/badge/Page-bb8a2e.svg?logo=github height=22px></a>
- <a href="https://github.com/Tencent-Hunyuan/HunyuanVideo-1.5/blob/main/assets/HunyuanVideo_1_5.pdf" target="_blank"><img src=https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv height=22px></a>
+ <a href="https://github.com/Tencent-Hunyuan/HunyuanVideo-1.5/blob/report/HunyuanVideo_1_5.pdf" target="_blank"><img src=https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv height=22px></a>
 <a href=https://x.com/TencentHunyuan target="_blank"><img src=https://img.shields.io/badge/Hunyuan-black.svg?logo=x height=22px></a>
 <a href="https://doc.weixin.qq.com/doc/w3_AXcAcwZSAGgCNACVygLxeQjyn4FYS?scode=AJEAIQdfAAoSfXnTj0AAkA-gaeACk" target="_blank"><img src=https://img.shields.io/badge/📚-PromptHandBook-blue.svg?logo=book height=22px></a> <br/>
- <a href="https://github.com/comfyanonymous/ComfyUI" target="_blank"><img src=https://img.shields.io/badge/ComfyUI-blue.svg?logo=book height=22px></a>
+ <a href="./ComfyUI/README.md" target="_blank"><img src=https://img.shields.io/badge/ComfyUI-blue.svg?logo=book height=22px></a>
 <a href="https://github.com/ModelTC/LightX2V" target="_blank"><img src=https://img.shields.io/badge/LightX2V-yellow.svg?logo=book height=22px></a>
 
 </div>

@@ -67,7 +67,9 @@ HunyuanVideo-1.5 is a video generation model that delivers top-tier quality with
 
 If you develop/use HunyuanVideo-1.5 in your projects, welcome to let us know.
 
- - **ComfyUI** - [ComfyUI](https://github.com/comfyanonymous/ComfyUI): A powerful and modular diffusion model GUI with a graph/nodes interface. ComfyUI supports HunyuanVideo-1.5 with various engineering optimizations for fast inference.
+ - **ComfyUI** - [ComfyUI](https://github.com/comfyanonymous/ComfyUI): A powerful and modular diffusion model GUI with a graph/nodes interface. ComfyUI supports HunyuanVideo-1.5 with various engineering optimizations for fast inference. We provide a [ComfyUI Usage Guide](./ComfyUI/README.md) for HunyuanVideo-1.5.
+
+ - **Community-implemented ComfyUI Plugin** - [comfyui_hunyuanvideo_1.5_plugin](https://github.com/yuanyuan-spec/comfyui_hunyuanvideo_1.5_plugin): A community-implemented ComfyUI plugin for HunyuanVideo-1.5, offering both simplified and complete node sets for quick usage or deep workflow customization, with built-in automatic model download support.
 
 - **LightX2V** - [LightX2V](https://github.com/ModelTC/LightX2V): A lightweight and efficient video generation framework that integrates HunyuanVideo-1.5, supporting multiple engineering acceleration techniques for fast inference.
 
@@ -95,6 +97,7 @@ If you develop/use HunyuanVideo-1.5 in your projects, welcome to let us know.
 - [Text to Video](#text-to-video)
 - [Image to Video](#image-to-video)
 - [Command Line Arguments](#command-line-arguments)
+ - [Optimal Inference Configurations](#optimal-inference-configurations)
 - [🧱 Models Cards](#-models-cards)
 - [🎬 More Examples](#-more-examples)
 - [📊 Evaluation](#-evaluation)

@@ -157,8 +160,8 @@ pip install -i https://mirrors.tencent.com/pypi/simple/ --upgrade tencentcloud-s
 ### Step 3: Install Attention Libraries
 
 * Flash Attention:
- It's recommended to install Flash Attention for faster inference and reduced GPU memory consumption.
- Detailed installation instructions are available at [Flash Attention](https://github.com/Dao-AILab/flash-attention).
+ Install Flash Attention for faster inference and reduced GPU memory consumption.
+ Detailed installation instructions are available at [Flash Attention](https://github.com/Dao-AILab/flash-attention).
 
 * Flex-Block-Attention:
 flex-block-attn is only required for sparse attention to achieve faster inference and can be installed by the following command:

@@ -169,6 +172,8 @@ Detailed installation instructions are available at [Flash Attention](https://gi
 ```
 
 * SageAttention:
+ To enable SageAttention for faster inference, install it with the following commands:
+ > **Note**: Enabling SageAttention will automatically disable Flex-Block-Attention.
 ```bash
 git clone https://github.com/cooper1637/SageAttention.git
 cd SageAttention
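The SageAttention hunk above is truncated after the clone step, so the actual build/install command is not shown here. Once installation finishes, a quick import check like the sketch below can confirm the package is usable; the `sageattention` module name is an assumption carried over from the upstream project, not something this diff confirms:

```bash
# Hypothetical post-install check. Assumes the cooper1637 fork installs the
# same `sageattention` Python package as upstream SageAttention.
python -c "import sageattention; print('SageAttention import OK')"
```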
@@ -223,10 +228,11 @@ OUTPUT_PATH=./outputs/output.mp4
 # Configuration
 N_INFERENCE_GPU=8 # Parallel inference GPU count
 CFG_DISTILLED=true # Inference with CFG distilled model, 2x speedup
- SPARSE_ATTN=true # Inference with sparse attention
+ SPARSE_ATTN=false # Inference with sparse attention
 SAGE_ATTN=false # Inference with SageAttention
 MODEL_PATH=ckpts # Path to pretrained model
 REWRITE=true # Enable prompt rewriting
+ OVERLAP_GROUP_OFFLOADING=true # Only valid when group offloading is enabled; significantly increases CPU memory usage but speeds up inference
 
 torchrun --nproc_per_node=$N_INFERENCE_GPU generate.py \
 --prompt "$PROMPT" \

@@ -239,10 +245,18 @@ torchrun --nproc_per_node=$N_INFERENCE_GPU generate.py \
 --use_sageattn $SAGE_ATTN \
 --rewrite $REWRITE \
 --output_path $OUTPUT_PATH \
+ --overlap_group_offloading $OVERLAP_GROUP_OFFLOADING \
 --save_pre_sr_video \
 --model_path $MODEL_PATH
 ```
 
+ > **Tips:** If your GPU memory is > 14GB but you still encounter OOM (Out of Memory) errors during generation, try setting the following environment variable before running:
+ > ```bash
+ > export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True,max_split_size_mb:128
+ > ```
+
 ### Command Line Arguments
 
 | Argument | Type | Required | Default | Description |

@@ -264,6 +278,7 @@ torchrun --nproc_per_node=$N_INFERENCE_GPU generate.py \
 | `--sparse_attn` | bool | No | `false` | Enable sparse attention for faster inference (~1.5-2x speedup, requires H-series GPUs, auto-enables CFG distilled; use `--sparse_attn` or `--sparse_attn true` to enable) |
 | `--offloading` | bool | No | `true` | Enable CPU offloading (use `--offloading false` or `--offloading 0` to disable for faster inference if GPU memory allows) |
 | `--group_offloading` | bool | No | `None` | Enable group offloading (default: None; automatically enabled if offloading is enabled. Use `--group_offloading` or `--group_offloading true/1` to enable, `--group_offloading false/0` to disable) |
+ | `--overlap_group_offloading` | bool | No | `true` | Enable overlap group offloading (default: true). Significantly increases CPU memory usage but speeds up inference. Use `--overlap_group_offloading` or `--overlap_group_offloading true/1` to enable, `--overlap_group_offloading false/0` to disable |
 | `--dtype` | str | No | `bf16` | Data type for transformer: `bf16` (faster, lower memory) or `fp32` (better quality, slower, higher memory) |
 | `--use_sageattn` | bool | No | `false` | Enable SageAttention (use `--use_sageattn` or `--use_sageattn true/1` to enable, `--use_sageattn false/0` to disable) |
 | `--sage_blocks_range` | str | No | `0-53` | SageAttention blocks range (e.g., `0-5` or `0,1,2,3,4,5`) |
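As a worked example of the arguments in the table above, a quick single-GPU test run can drop most of the script's configuration. The sketch below uses only flags that appear in this diff and leaves every documented default (offloading, `bf16`, and so on) in place; the prompt and output path are placeholders:

```bash
# Minimal single-GPU sketch: one process, documented defaults everywhere else.
torchrun --nproc_per_node=1 generate.py \
    --prompt "A corgi surfing at sunset" \
    --output_path ./outputs/test.mp4 \
    --model_path ckpts
```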
@@ -271,22 +286,43 @@ torchrun --nproc_per_node=$N_INFERENCE_GPU generate.py \
 
 **Note:** Use `--nproc_per_node` to specify the number of GPUs. For example, `--nproc_per_node=8` uses 8 GPUs.
 
+ ### Optimal Inference Configurations
+
+ The following table provides the optimal inference configurations (CFG scale, embedded CFG scale, flow shift, and inference steps) for each model to achieve the best generation quality:
+
+ | Model | CFG Scale | Embedded CFG Scale | Flow Shift | Inference Steps |
+ |-------|-----------|--------------------|------------|-----------------|
+ | 480p T2V | 6 | None | 5 | 50 |
+ | 480p I2V | 6 | None | 5 | 50 |
+ | 720p T2V | 6 | None | 9 | 50 |
+ | 720p I2V | 6 | None | 7 | 50 |
+ | 480p T2V CFG Distilled | 1 | None | 5 | 50 |
+ | 480p I2V CFG Distilled | 1 | None | 5 | 50 |
+ | 720p T2V CFG Distilled | 1 | None | 9 | 50 |
+ | 720p I2V CFG Distilled | 1 | None | 7 | 50 |
+ | 720p T2V CFG Distilled Sparse | 1 | None | 7 | 50 |
+ | 720p I2V CFG Distilled Sparse | 1 | None | 9 | 50 |
+ | 480→720 SR Step Distilled | 1 | None | 2 | 6 |
+ | 720→1080 SR Step Distilled | 1 | None | 2 | 8 |
+
+ **Please note that the CFG distilled models we provide must use 50 inference steps to generate correct results.**
+
 
 ## 🧱 Models Cards
 |ModelName| Download |
 |-|---------------------------|
 |HunyuanVideo-1.5-480P-T2V|[480P-T2V](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_t2v) |
 |HunyuanVideo-1.5-480P-I2V |[480P-I2V](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_i2v) |
- |HunyuanVideo-1.5-480P-T2V-distill | [480P-T2V-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_t2v_distilled) |
- |HunyuanVideo-1.5-480P-I2V-distill |[480P-I2V-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_i2v_distilled) |
+ |HunyuanVideo-1.5-480P-T2V-cfg-distill | [480P-T2V-cfg-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_t2v_distilled) |
+ |HunyuanVideo-1.5-480P-I2V-cfg-distill |[480P-I2V-cfg-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_i2v_distilled) |
 |HunyuanVideo-1.5-720P-T2V|[720P-T2V](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_t2v) |
 |HunyuanVideo-1.5-720P-I2V |[720P-I2V](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_i2v) |
- |HunyuanVideo-1.5-720P-T2V-distiill| Comming soon |
- |HunyuanVideo-1.5-720P-I2V-distiill |[720P-I2V-distiill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_i2v_distilled) |
- |HunyuanVideo-1.5-720P-T2V-sparse-distiill| Comming soon |
- |HunyuanVideo-1.5-720P-I2V-sparse-distiill |[720P-I2V-sparse-distiill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_i2v_distilled_sparse) |
- |HunyuanVideo-1.5-720P-sr |[720P-sr](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_sr_distilled) |
- |HunyuanVideo-1.5-1080P-sr |[1080P-sr](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/1080p_sr_distilled) |
+ |HunyuanVideo-1.5-720P-T2V-cfg-distill| Coming soon |
+ |HunyuanVideo-1.5-720P-I2V-cfg-distill |[720P-I2V-cfg-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_i2v_distilled) |
+ |HunyuanVideo-1.5-720P-T2V-sparse-cfg-distill| Coming soon |
+ |HunyuanVideo-1.5-720P-I2V-sparse-cfg-distill |[720P-I2V-sparse-cfg-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_i2v_distilled_sparse) |
+ |HunyuanVideo-1.5-720P-sr-step-distill |[720P-sr](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_sr_distilled) |
+ |HunyuanVideo-1.5-1080P-sr-step-distill |[1080P-sr](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/1080p_sr_distilled) |
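The new Optimal Inference Configurations table pins down sampler settings per model, but the diff does not show which `generate.py` flags those columns map to. Purely as an illustration, the 720p T2V row could be wired into the launch script as in the sketch below; the flag names `--cfg_scale`, `--flow_shift`, and `--num_inference_steps` are hypothetical placeholders, not flags confirmed anywhere in this diff (check `generate.py --help` for the real names):

```bash
# Hypothetical mapping of the 720p T2V row (CFG scale 6, flow shift 9, 50 steps).
# The three sampler flag names below are placeholders, not verified arguments.
torchrun --nproc_per_node=8 generate.py \
    --prompt "$PROMPT" \
    --cfg_scale 6 \
    --flow_shift 9 \
    --num_inference_steps 50 \
    --output_path $OUTPUT_PATH \
    --model_path ckpts
```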
 
 
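Every entry in the Models Cards table points at a subfolder of the same Hugging Face repository, so a single variant can be fetched without downloading the whole repo. A sketch using the standard `huggingface-cli` tool, with the subfolder pattern taken from the 480P-T2V row and an arbitrary local directory:

```bash
# Fetch only the 480P-T2V transformer weights into ./ckpts.
huggingface-cli download tencent/HunyuanVideo-1.5 \
    --include "transformer/480p_t2v/*" \
    --local-dir ckpts
```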
 
README_CN.md CHANGED
@@ -24,10 +24,10 @@ As a lightweight video generation model, HunyuanVideo-1.5 needs only 8.3B parameters to
 <a href="https://hunyuan.tencent.com/video/zh?tabIndex=0" target="_blank"><img src=https://img.shields.io/badge/Official%20Site-333399.svg?logo=homepage height=22px></a>
 <a href=https://huggingface.co/tencent/HunyuanVideo-1.5 target="_blank"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Models-d96902.svg height=22px></a>
 <a href=https://github.com/Tencent-Hunyuan/HunyuanVideo-1.5 target="_blank"><img src=https://img.shields.io/badge/Page-bb8a2e.svg?logo=github height=22px></a>
- <a href="https://github.com/Tencent-Hunyuan/HunyuanVideo-1.5/blob/main/assets/HunyuanVideo_1_5.pdf" target="_blank"><img src=https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv height=22px></a>
+ <a href="https://github.com/Tencent-Hunyuan/HunyuanVideo-1.5/blob/report/HunyuanVideo_1_5.pdf" target="_blank"><img src=https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv height=22px></a>
 <a href=https://x.com/TencentHunyuan target="_blank"><img src=https://img.shields.io/badge/Hunyuan-black.svg?logo=x height=22px></a>
 <a href="https://doc.weixin.qq.com/doc/w3_AXcAcwZSAGgCNACVygLxeQjyn4FYS?scode=AJEAIQdfAAoSfXnTj0AAkA-gaeACk" target="_blank"><img src=https://img.shields.io/badge/📚-PromptHandBook-blue.svg?logo=book height=22px></a> <br/>
- <a href="https://github.com/comfyanonymous/ComfyUI" target="_blank"><img src=https://img.shields.io/badge/ComfyUI-blue.svg?logo=book height=22px></a>
+ <a href="./ComfyUI/README.md" target="_blank"><img src=https://img.shields.io/badge/ComfyUI-blue.svg?logo=book height=22px></a>
 <a href="https://github.com/ModelTC/LightX2V" target="_blank"><img src=https://img.shields.io/badge/LightX2V-yellow.svg?logo=book height=22px></a>
 
 </div>

@@ -51,6 +51,8 @@ As a lightweight video generation model, HunyuanVideo-1.5 needs only 8.3B parameters to
 If you use or develop HunyuanVideo-1.5 in your projects, feel free to let us know.
 
 - **ComfyUI** - [ComfyUI](https://github.com/comfyanonymous/ComfyUI): A powerful and modular diffusion model GUI with a node-based workflow. ComfyUI supports HunyuanVideo-1.5 and provides various engineering optimizations for fast inference.
+ We provide a [ComfyUI Usage Guide](./ComfyUI/README.md) for HunyuanVideo-1.5.
+ - **Community-implemented ComfyUI Plugin** - [comfyui_hunyuanvideo_1.5_plugin](https://github.com/yuanyuan-spec/comfyui_hunyuanvideo_1.5_plugin): A community-implemented ComfyUI plugin for HunyuanVideo-1.5, offering both simplified and complete node sets for quick usage or deep workflow customization, with built-in automatic model download support.
 
 - **LightX2V** - [LightX2V](https://github.com/ModelTC/LightX2V): A lightweight and efficient video generation framework that integrates HunyuanVideo-1.5 and supports multiple engineering acceleration techniques for fast inference.
 
@@ -77,6 +79,7 @@ As a lightweight video generation model, HunyuanVideo-1.5 needs only 8.3B parameters to
 - [🔑 Usage](#-使用方法)
 - [Video Generation](#视频生成)
 - [Command Line Arguments](#命令行参数)
+ - [Optimal Inference Configurations](#最优推理配置)
 - [🧱 Models Cards](#-模型卡片)
 - [🎬 More Examples](#-更多示例)
 - [📊 Evaluation](#-性能评估)

@@ -140,7 +143,7 @@ pip install -i https://mirrors.tencent.com/pypi/simple/ --upgrade tencentcloud-s
 ### Step 3: Install Attention Libraries
 
 * Flash Attention:
- It is recommended to install Flash Attention for faster inference and lower GPU memory consumption.
+ Install Flash Attention for faster inference and lower GPU memory consumption.
 Detailed installation instructions are available at [Flash Attention](https://github.com/Dao-AILab/flash-attention).
 
 * Flex-Block-Attention:

@@ -152,7 +155,8 @@ pip install -i https://mirrors.tencent.com/pypi/simple/ --upgrade tencentcloud-s
 ```
 
 * SageAttention:
-
+ To enable SageAttention for faster inference, install it with the following commands:
+ > **Note**: Enabling SageAttention will automatically disable Flex-Block-Attention.
 ```bash
 git clone https://github.com/cooper1637/SageAttention.git
 cd SageAttention
@@ -211,10 +215,11 @@ OUTPUT_PATH=./outputs/output.mp4
 # Configuration
 N_INFERENCE_GPU=8 # Number of GPUs for parallel inference
 CFG_DISTILLED=true # Inference with the CFG distilled model, 2x speedup
- SPARSE_ATTN=true # Inference with sparse attention
+ SPARSE_ATTN=false # Inference with sparse attention
 SAGE_ATTN=false # Inference with SageAttention
 MODEL_PATH=ckpts # Path to the pretrained model
 REWRITE=true # Enable prompt rewriting
+ OVERLAP_GROUP_OFFLOADING=true # Only valid when group offloading is enabled; significantly increases CPU memory usage but speeds up inference
 
 torchrun --nproc_per_node=$N_INFERENCE_GPU generate.py \
 --prompt "$PROMPT" \

@@ -227,10 +232,16 @@ torchrun --nproc_per_node=$N_INFERENCE_GPU generate.py \
 --use_sageattn $SAGE_ATTN \
 --rewrite $REWRITE \
 --output_path $OUTPUT_PATH \
+ --overlap_group_offloading $OVERLAP_GROUP_OFFLOADING \
 --save_pre_sr_video \
 --model_path $MODEL_PATH
 ```
 
+ > **Tips:** If your GPU memory is > 14GB but you still encounter OOM (Out of Memory) errors during generation, try setting the following environment variable before running:
+ > ```bash
+ > export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True,max_split_size_mb:128
+ > ```
+
 ### Command Line Arguments
 
 | Argument | Type | Required | Default | Description |

@@ -252,6 +263,7 @@ torchrun --nproc_per_node=$N_INFERENCE_GPU generate.py \
 | `--sparse_attn` | bool | No | `false` | Enable sparse attention for faster inference (~1.5-2x speedup, requires H-series GPUs, automatically enables CFG distillation; use `--sparse_attn` or `--sparse_attn true` to enable) |
 | `--offloading` | bool | No | `true` | Enable CPU offloading (use `--offloading false` or `--offloading 0` to disable; faster if GPU memory allows) |
 | `--group_offloading` | bool | No | `None` | Enable group offloading (default: None; automatically enabled when offloading is enabled. Use `--group_offloading` or `--group_offloading true/1` to enable, `--group_offloading false/0` to disable) |
+ | `--overlap_group_offloading` | bool | No | `true` | Enable overlap group offloading (default: true). Significantly increases CPU memory usage but speeds up inference. Use `--overlap_group_offloading` or `--overlap_group_offloading true/1` to enable, `--overlap_group_offloading false/0` to disable |
 | `--dtype` | str | No | `bf16` | Data type for the transformer: `bf16` (faster, lower memory) or `fp32` (better quality, slower, higher memory) |
 | `--use_sageattn` | bool | No | `false` | Enable SageAttention (use `--use_sageattn` or `--use_sageattn true/1` to enable, `--use_sageattn false/0` to disable) |
 | `--sage_blocks_range` | str | No | `0-53` | SageAttention block range (e.g., `0-5` or `0,1,2,3,4,5`) |
@@ -259,22 +271,43 @@ torchrun --nproc_per_node=$N_INFERENCE_GPU generate.py \
 
 **Note:** Use `--nproc_per_node` to specify the number of GPUs. For example, `--nproc_per_node=8` uses 8 GPUs.
 
+ ### Optimal Inference Configurations
+
+ The following table provides the optimal inference configuration (CFG scale, embedded CFG scale, flow shift, and inference steps) for each model to achieve the best generation quality:
+
+ | Model | CFG Scale | Embedded CFG Scale | Flow Shift | Inference Steps |
+ |-------|-----------|--------------------|------------|-----------------|
+ | 480p T2V | 6 | None | 5 | 50 |
+ | 480p I2V | 6 | None | 5 | 50 |
+ | 720p T2V | 6 | None | 9 | 50 |
+ | 720p I2V | 6 | None | 7 | 50 |
+ | 480p T2V CFG Distilled | 1 | None | 5 | 50 |
+ | 480p I2V CFG Distilled | 1 | None | 5 | 50 |
+ | 720p T2V CFG Distilled | 1 | None | 9 | 50 |
+ | 720p I2V CFG Distilled | 1 | None | 7 | 50 |
+ | 720p T2V CFG Distilled Sparse | 1 | None | 7 | 50 |
+ | 720p I2V CFG Distilled Sparse | 1 | None | 9 | 50 |
+ | 480→720 SR Step Distilled | 1 | None | 2 | 6 |
+ | 720→1080 SR Step Distilled | 1 | None | 2 | 8 |
+
+ **Please note that the CFG distilled models we provide require 50 inference steps to produce correct results.**
+
 
 ## 🧱 Models Cards
 |Model Name| Download |
 |-|---------------------------|
 |HunyuanVideo-1.5-480P-T2V|[480P-T2V](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_t2v) |
 |HunyuanVideo-1.5-480P-I2V |[480P-I2V](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_i2v) |
- |HunyuanVideo-1.5-480P-T2V-distill | [480P-T2V-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_t2v_distilled) |
- |HunyuanVideo-1.5-480P-I2V-distill |[480P-I2V-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_i2v_distilled) |
+ |HunyuanVideo-1.5-480P-T2V-cfg-distill | [480P-T2V-cfg-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_t2v_distilled) |
+ |HunyuanVideo-1.5-480P-I2V-cfg-distill |[480P-I2V-cfg-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_i2v_distilled) |
 |HunyuanVideo-1.5-720P-T2V|[720P-T2V](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_t2v) |
 |HunyuanVideo-1.5-720P-I2V |[720P-I2V](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_i2v) |
- |HunyuanVideo-1.5-720P-T2V-distiill| Comming soon |
- |HunyuanVideo-1.5-720P-I2V-distiill |[720P-I2V-distiill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_i2v_distilled) |
- |HunyuanVideo-1.5-720P-T2V-sparse-distiill| Comming soon |
- |HunyuanVideo-1.5-720P-I2V-sparse-distiill |[720P-I2V-sparse-distiill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_i2v_distilled_sparse) |
- |HunyuanVideo-1.5-720P-sr |[720P-sr](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_sr_distilled) |
- |HunyuanVideo-1.5-1080P-sr |[1080P-sr](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/1080p_sr_distilled) |
+ |HunyuanVideo-1.5-720P-T2V-cfg-distill| Coming soon |
+ |HunyuanVideo-1.5-720P-I2V-cfg-distill |[720P-I2V-cfg-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_i2v_distilled) |
+ |HunyuanVideo-1.5-720P-T2V-sparse-cfg-distill| Coming soon |
+ |HunyuanVideo-1.5-720P-I2V-sparse-cfg-distill |[720P-I2V-sparse-cfg-distill](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_i2v_distilled_sparse) |
+ |HunyuanVideo-1.5-720P-sr-step-distill |[720P-sr](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/720p_sr_distilled) |
+ |HunyuanVideo-1.5-1080P-sr-step-distill |[1080P-sr](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/1080p_sr_distilled) |
 
313