---
license: apache-2.0
tags:
- diffusion-single-file
- comfyui
- distillation
- lora
- video
- video generation
base_model:
- Wan-AI/Wan2.1-T2V-14B
- Wan-AI/Wan2.1-I2V-14B-480P
- Wan-AI/Wan2.1-I2V-14B-720P
library_name: diffusers
---
# 🎬 Wan2.1 Distilled Models

### ⚡ High-Performance Video Generation with 4-Step Inference

*Distillation-accelerated versions of Wan2.1: dramatically faster while maintaining exceptional quality*

![image/png](https://cdn-uploads.huggingface.co/production/uploads/680de13385293771bc57400b/gXhUuWyuJpxOwGf5GQ49r.png)

---

[![🤗 HuggingFace](https://img.shields.io/badge/🤗-HuggingFace-yellow)](https://huggingface.co/lightx2v/Wan2.1-Distill-Models)
[![GitHub](https://img.shields.io/badge/GitHub-LightX2V-blue?logo=github)](https://github.com/ModelTC/LightX2V)
[![License](https://img.shields.io/badge/License-Apache%202.0-green.svg)](LICENSE)
---

## 🌟 What's Special?
### ⚡ Ultra-Fast Generation

- **4-step inference** (vs. traditional 50+ steps)
- Up to **2x faster** than ComfyUI
- Real-time video generation capability

### 🎯 Flexible Options

- Multiple resolutions (480P/720P)
- Various precision formats (BF16/FP8/INT8)
- I2V and T2V support
### 💾 Memory Efficient

- FP8/INT8: **~50% size reduction**
- CPU offload support
- Optimized for consumer GPUs

### 🔧 Easy Integration

- Compatible with the LightX2V framework
- ComfyUI support available
- Simple configuration files
---

## 📦 Model Catalog

### 🎥 Model Types
#### 🖼️ **Image-to-Video (I2V)**

Transform still images into dynamic videos.

- 📺 480P resolution
- 🎬 720P resolution

#### 📝 **Text-to-Video (T2V)**

Generate videos from text descriptions.

- 🚀 14B parameters
- 🎨 High-quality synthesis
### 🎯 Precision Variants

| Precision | Model Identifier | Model Size | Framework | Quality vs. Speed |
|:---------:|:-----------------|:----------:|:---------:|:------------------|
| 🏆 **BF16** | `lightx2v_4step` | ~28-32 GB | LightX2V | ⭐⭐⭐⭐⭐ Highest quality |
| ⚡ **FP8** | `scaled_fp8_e4m3_lightx2v_4step` | ~15-17 GB | LightX2V | ⭐⭐⭐⭐ Excellent balance |
| 🎯 **INT8** | `int8_lightx2v_4step` | ~15-17 GB | LightX2V | ⭐⭐⭐⭐ Fast & efficient |
| 🔷 **FP8 ComfyUI** | `scaled_fp8_e4m3_lightx2v_4step_comfyui` | ~15-17 GB | ComfyUI | ⭐⭐⭐ ComfyUI ready |

### 📝 Naming Convention

```bash
# Pattern: wan2.1_{task}_{resolution}_{precision}.safetensors

# Examples:
wan2.1_i2v_720p_lightx2v_4step.safetensors                         # 720P I2V - BF16
wan2.1_i2v_720p_scaled_fp8_e4m3_lightx2v_4step.safetensors         # 720P I2V - FP8
wan2.1_i2v_480p_int8_lightx2v_4step.safetensors                    # 480P I2V - INT8
wan2.1_t2v_14b_scaled_fp8_e4m3_lightx2v_4step_comfyui.safetensors  # T2V - FP8 ComfyUI
```

> 💡 **Explore all models**: [Browse the full model collection →](https://huggingface.co/lightx2v/Wan2.1-Distill-Models/tree/main)

## 🚀 Usage

**LightX2V is a high-performance inference framework optimized for these models. It is approximately 2x faster than ComfyUI and offers better quantization accuracy. Highly recommended!**

#### Quick Start

1. Download a model (720P I2V FP8 example):

   ```bash
   huggingface-cli download lightx2v/Wan2.1-Distill-Models \
       --local-dir ./models/wan2.1_i2v_720p \
       --include "wan2.1_i2v_720p_scaled_fp8_e4m3_lightx2v_4step.safetensors"
   ```

2. Clone the LightX2V repository:

   ```bash
   git clone https://github.com/ModelTC/LightX2V.git
   cd LightX2V
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

   Alternatively, refer to the [Quick Start Documentation](https://github.com/ModelTC/LightX2V/blob/main/docs/EN/source/getting_started/quickstart.md) to use Docker.

4. Select and modify a configuration file. Choose the appropriate configuration for your GPU memory (see the quick check below if you are unsure):

   **For 80GB+ GPUs (A100/H100)**
   - I2V: [wan_i2v_distill_4step_cfg.json](https://github.com/ModelTC/LightX2V/blob/main/configs/distill/wan_i2v_distill_4step_cfg.json)
   - T2V: [wan_t2v_distill_4step_cfg.json](https://github.com/ModelTC/LightX2V/blob/main/configs/distill/wan_t2v_distill_4step_cfg.json)

   **For 24GB+ GPUs (RTX 4090)**
   - I2V: [wan_i2v_distill_4step_cfg_4090.json](https://github.com/ModelTC/LightX2V/blob/main/configs/distill/wan_i2v_distill_4step_cfg_4090.json)
   - T2V: [wan_t2v_distill_4step_cfg_4090.json](https://github.com/ModelTC/LightX2V/blob/main/configs/distill/wan_t2v_distill_4step_cfg_4090.json)
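   If you are unsure which memory tier your GPU falls into, a quick check with the standard `nvidia-smi` tool settles it (any equivalent memory query works just as well):

   ```bash
   # List each GPU with its total memory to choose between the
   # 80GB-class and 24GB-class configuration files above.
   nvidia-smi --query-gpu=name,memory.total --format=csv
   ```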
5. Run inference:

   ```bash
   cd scripts
   bash wan/run_wan_i2v_distill_4step_cfg.sh
   ```

#### Documentation

- **Quick Start Guide**: [LightX2V Quick Start](https://github.com/ModelTC/LightX2V/blob/main/docs/EN/source/getting_started/quickstart.md)
- **Complete Usage Guide**: [LightX2V Model Structure Documentation](https://github.com/ModelTC/LightX2V/blob/main/docs/EN/source/getting_started/model_structure.md)
- **Configuration Guide**: [Configuration Files](https://github.com/ModelTC/LightX2V/tree/main/configs/distill)
- **Quantization Usage**: [Quantization Documentation](https://github.com/ModelTC/LightX2V/blob/main/docs/EN/source/method_tutorials/quantization.md)
- **Parameter Offload**: [Offload Documentation](https://github.com/ModelTC/LightX2V/blob/main/docs/EN/source/method_tutorials/offload.md)

#### Performance Advantages

- ⚡ **Fast**: Approximately **2x faster** than ComfyUI
- 🎯 **Optimized**: Deeply optimized for distilled models
- 💾 **Memory Efficient**: Supports CPU offload and other memory optimization techniques
- 🛠️ **Flexible**: Supports multiple quantization formats and configuration options

### Community

- **Issues**: https://github.com/ModelTC/LightX2V/issues

## ⚠️ Important Notes

1. **Additional Components**: These models contain only the DiT weights. You also need:
   - T5 text encoder
   - CLIP vision encoder
   - VAE encoder/decoder
   - Tokenizers

   Refer to the [LightX2V Documentation](https://github.com/ModelTC/LightX2V/blob/main/docs/EN/source/getting_started/model_structure.md) for how to organize the complete model directory; a download sketch follows below.
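As an illustrative sketch only (not an official recipe): the components above ship with the original Wan-AI base checkpoints listed in this card's metadata, so one way to assemble a complete directory is to download a base repository and place the distilled DiT weights alongside it. The exact layout should follow the model structure documentation linked above.

```bash
# Illustrative sketch only: adjust paths to match the layout described
# in the LightX2V model structure documentation.

# 1. Fetch the original base checkpoint, which bundles the T5 text
#    encoder, CLIP vision encoder, VAE, and tokenizers.
huggingface-cli download Wan-AI/Wan2.1-I2V-14B-720P \
    --local-dir ./models/wan2.1_i2v_720p

# 2. Add the distilled DiT weights to the same directory.
huggingface-cli download lightx2v/Wan2.1-Distill-Models \
    --local-dir ./models/wan2.1_i2v_720p \
    --include "wan2.1_i2v_720p_scaled_fp8_e4m3_lightx2v_4step.safetensors"
```

If you find this project helpful, please give us a ⭐ on [GitHub](https://github.com/ModelTC/LightX2V)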