denk committed
Commit 664218a · 1 Parent(s): e9b63a1
Files changed (2):
  1. README.md +174 -0
  2. pytorch_lora_weights.safetensors +3 -0
README.md ADDED
---
license: apache-2.0
language:
- en
tags:
- video
- video-generation
- video-to-video
- diffusers
- wan2.2
---
# Wan2.2 Video Continuation (Demo)
#### *This project is still in development.*
This repository contains code for video continuation inference with [Wan2.2](https://github.com/Wan-Video/Wan2.2).
The main idea is taken from [LongCat-Video](https://huggingface.co/meituan-longcat/LongCat-Video).

## Description
This is a simple LoRA for the Wan2.2 TI2V transformer.
First test: rank = 64, alpha = 128.
It was trained on around 10k videos, with 16-64 input frames and 41-81 output frames per sample.
The main change for this approach is in the attention processor.

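For illustration, a rank-64 / alpha-128 LoRA over the attention projections could be described with `peft` roughly as follows. This is only a sketch of such a configuration, not the actual training setup of this repo; the target module names follow common diffusers attention naming and are an assumption.

```python
from peft import LoraConfig

# Illustrative only: a LoRA of rank 64 with alpha 128 applied to the
# attention projection layers of the transformer blocks. The module names
# ("to_q", "to_k", "to_v", "to_out.0") follow common diffusers naming and
# are assumed here, not taken from this repo's training code.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=["to_q", "to_k", "to_v", "to_out.0"],
    init_lora_weights="gaussian",
)
```
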
### Models
| Model | Best input frame counts | Best output frame counts | Resolution | Huggingface Link |
|-------|:-----------:|:------------------:|:------------------:|:------------------:|
| TI2V-5B | 24-32-40 | 49-61-81 | 704x1280 | [Link](https://huggingface.co/TheDenk/wan2.2-video-continuation) |

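If you prefer to fetch the LoRA weights file manually (for example, for offline use), it can be downloaded with `huggingface_hub`. This is an optional convenience; the inference examples below resolve the repo id directly.

```python
from huggingface_hub import hf_hub_download

# Download pytorch_lora_weights.safetensors from the Hub (optional step;
# the scripts below accept the repo id "TheDenk/wan2.2-video-continuation").
lora_file = hf_hub_download(
    repo_id="TheDenk/wan2.2-video-continuation",
    filename="pytorch_lora_weights.safetensors",
)
print(lora_file)
```
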
### How to
Clone the repo
```bash
git clone https://github.com/TheDenk/wan2.2-video-continuation
cd wan2.2-video-continuation
```

Create and activate a venv
```bash
python -m venv venv
source venv/bin/activate
```

Install the requirements
```bash
pip install git+https://github.com/huggingface/diffusers.git
pip install -r requirements.txt
```

### Inference examples
#### Simple inference with cli
```bash
python -m inference.cli_demo \
    --video_path "resources/ship.mp4" \
    --num_input_frames 24 \
    --num_output_frames 81 \
    --prompt "Watercolor style, the wet suminagashi inks slowly spread into the shape of an island on the paper, with the edges continuously blending into delicate textural variations. A tiny paper boat floats in the direction of the water flow towards the still-wet areas, creating subtle ripples around it. Centered composition with soft natural light pouring in from the side, revealing subtle color gradations and a sense of movement." \
    --base_model_path Wan-AI/Wan2.2-TI2V-5B-Diffusers \
    --lora_path TheDenk/wan2.2-video-continuation
```

#### Gradio inference
```bash
python -m inference.gradio_web_demo \
    --base_model_path Wan-AI/Wan2.2-TI2V-5B-Diffusers \
    --lora_path TheDenk/wan2.2-video-continuation
```

#### Detailed Inference
```bash
python -m inference.cli_demo \
    --video_path "resources/ship.mp4" \
    --num_input_frames 24 \
    --num_output_frames 81 \
    --prompt "Watercolor style, the wet suminagashi inks slowly spread into the shape of an island on the paper, with the edges continuously blending into delicate textural variations. A tiny paper boat floats in the direction of the water flow towards the still-wet areas, creating subtle ripples around it. Centered composition with soft natural light pouring in from the side, revealing subtle color gradations and a sense of movement." \
    --base_model_path Wan-AI/Wan2.2-TI2V-5B-Diffusers \
    --lora_path TheDenk/wan2.2-video-continuation \
    --num_inference_steps 50 \
    --guidance_scale 5.0 \
    --video_height 480 \
    --video_width 832 \
    --negative_prompt "bad quality, low quality" \
    --seed 42 \
    --out_fps 24 \
    --output_path "result.mp4" \
    --teacache_treshold 0.5
```

#### Minimal code example
```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = "0"
os.environ["TOKENIZERS_PARALLELISM"] = "false"

import torch
from diffusers.utils import load_video, export_to_video
from diffusers import AutoencoderKLWan, UniPCMultistepScheduler

from wan_continuous_transformer import WanTransformer3DModel
from wan_continuous_pipeline import WanContinuousVideoPipeline

base_model_path = "Wan-AI/Wan2.2-TI2V-5B-Diffusers"
lora_path = "TheDenk/wan2.2-video-continuation"
vae = AutoencoderKLWan.from_pretrained(base_model_path, subfolder="vae", torch_dtype=torch.float32)
transformer = WanTransformer3DModel.from_pretrained(base_model_path, subfolder="transformer", torch_dtype=torch.bfloat16)

pipe = WanContinuousVideoPipeline.from_pretrained(
    pretrained_model_name_or_path=base_model_path,
    transformer=transformer,
    vae=vae,
    torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()

# Load the video-continuation LoRA into the transformer.
pipe.transformer.load_lora_adapter(
    lora_path,
    weight_name="pytorch_lora_weights.safetensors",
    adapter_name="video_continuation",
    prefix=None,
)
pipe.set_adapters("video_continuation", adapter_weights=1.0)

img_h = 480  # e.g. 704, 512 or 480
img_w = 832  # e.g. 1280, 832 or 768

num_input_frames = 24   # e.g. 16, 24 or 32
num_output_frames = 81  # e.g. 49 or 81

# Condition on the last `num_input_frames` frames of the source video.
video_path = 'ship.mp4'
previous_video = load_video(video_path)[-num_input_frames:]

prompt = "Watercolor style, the wet suminagashi inks slowly spread into the shape of an island on the paper, with the edges continuously blending into delicate textural variations. A tiny paper boat floats in the direction of the water flow towards the still-wet areas, creating subtle ripples around it. Centered composition with soft natural light pouring in from the side, revealing subtle color gradations and a sense of movement."
negative_prompt = "bad quality, low quality"

output = pipe(
    previous_video=previous_video,
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=img_h,
    width=img_w,
    num_frames=num_output_frames,
    guidance_scale=5,
    generator=torch.Generator(device="cuda").manual_seed(42),
    output_type="pil",
    teacache_treshold=0.4,
).frames[0]

export_to_video(output, "output.mp4", fps=16)
```

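Since the pipeline returns plain PIL frames, the output can in principle be fed back in as the next `previous_video` to stitch longer clips together segment by segment. The loop below is only a usage sketch under that assumption (reusing `pipe`, `prompt`, and the size variables from the example above), not an official recipe from this repo.

```python
# Hypothetical chaining loop: condition each new segment on the last frames
# of everything generated so far.
all_frames = list(previous_video)

for segment in range(2):
    new_frames = pipe(
        previous_video=all_frames[-num_input_frames:],
        prompt=prompt,
        negative_prompt=negative_prompt,
        height=img_h,
        width=img_w,
        num_frames=num_output_frames,
        guidance_scale=5,
        generator=torch.Generator(device="cuda").manual_seed(42 + segment),
        output_type="pil",
        teacache_treshold=0.4,
    ).frames[0]
    # If the pipeline also returns the conditioning frames at the start of
    # its output, drop the first `num_input_frames` here before extending.
    all_frames.extend(new_frames)

export_to_video(all_frames, "long_output.mp4", fps=16)
```
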
## Acknowledgements
Original code and models: [Wan2.2](https://github.com/Wan-Video/Wan2.2).
Video continuation approach: [LongCat-Video](https://huggingface.co/meituan-longcat/LongCat-Video).
Inference speed-up: [TeaCache](https://github.com/ali-vilab/TeaCache).

## Citations
```
@misc{TheDenk,
    title={Wan2.2 Video Continuation},
    author={Karachev Denis},
    url={https://github.com/TheDenk/wan2.2-video-continuation},
    publisher={Github},
    year={2025}
}
```

## Contacts
<p>Issues should be raised directly in the repository. For professional support and recommendations, please contact <a href="mailto:welcomedenk@gmail.com">welcomedenk@gmail.com</a>.</p>

pytorch_lora_weights.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:b5a85002cd021e54d4398dc908316d260236a1caf3d086580d67852cd416a095
size 377540272