---
license: other
tags:
- text-to-video
---

HunyuanVideo 1.5 uses attention masks with variable-length sequences, so for best performance we recommend an attention backend that handles padding efficiently. Install [kernels](https://github.com/huggingface/kernels) (`pip install kernels`) to access prebuilt attention kernels, and see our [documentation](https://huggingface.co/docs/diffusers/main/en/optimization/attention_backends) to learn more about all the attention backends we support.

```py
import torch

from diffusers import HunyuanVideo15Pipeline, attention_backend
from diffusers.utils import export_to_video

dtype = torch.bfloat16
device = "cuda:0"

# Example prompt and seed; replace with your own.
prompt = "A cat walks on the grass, realistic style."
seed = 42

pipe = HunyuanVideo15Pipeline.from_pretrained(
    "hunyuanvideo-community/HunyuanVideo-1.5-Diffusers-720p_t2v", torch_dtype=dtype
)
pipe.enable_model_cpu_offload()
pipe.vae.enable_tiling()

generator = torch.Generator(device=device).manual_seed(seed)

with attention_backend("_flash_3_hub"):  # or "flash_hub" if you are not on H100/H800
    video = pipe(
        prompt=prompt,
        generator=generator,
        num_frames=121,
        num_inference_steps=50,
    ).frames[0]

export_to_video(video, "output.mp4", fps=24)
```

To run inference with the default attention backend:

```py
import torch

from diffusers import HunyuanVideo15Pipeline
from diffusers.utils import export_to_video

dtype = torch.bfloat16
device = "cuda:0"

# Example prompt and seed; replace with your own.
prompt = "A cat walks on the grass, realistic style."
seed = 42

pipe = HunyuanVideo15Pipeline.from_pretrained(
    "hunyuanvideo-community/HunyuanVideo-1.5-Diffusers-720p_t2v", torch_dtype=dtype
)
pipe.enable_model_cpu_offload()
pipe.vae.enable_tiling()

generator = torch.Generator(device=device).manual_seed(seed)

video = pipe(
    prompt=prompt,
    generator=generator,
    num_frames=121,
    num_inference_steps=50,
).frames[0]

export_to_video(video, "output.mp4", fps=24)
```
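
The `_flash_3_hub` kernel targets Hopper GPUs (H100/H800). If your script may run on different hardware, you can pick the backend at runtime instead of hardcoding it. Below is a minimal sketch assuming a CUDA device is available; the helper name `pick_attention_backend` is our own illustration, not a diffusers API.

```py
import torch

def pick_attention_backend() -> str:
    # Hypothetical helper, not part of diffusers: use the FlashAttention-3 hub
    # kernel on Hopper GPUs (compute capability 9.x, e.g. H100/H800) and fall
    # back to the generic FlashAttention hub kernel elsewhere.
    major, _ = torch.cuda.get_device_capability(0)
    return "_flash_3_hub" if major == 9 else "flash_hub"

# Usage with the pipeline above:
# with attention_backend(pick_attention_backend()):
#     video = pipe(prompt=prompt, generator=generator, ...).frames[0]
```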