Z-Image-Fun-Controlnet-Union-2.1

Model Card
| Name | Description |
|--|--|
| Z-Image-Fun-Controlnet-Union-2.1.safetensors | ControlNet weights for Z-Image. The model supports multiple control conditions such as Canny, Depth, Pose, MLSD, Scribble, HED and Gray. The ControlNet is applied to 15 layer blocks and 2 refiner layer blocks. |
| Z-Image-Fun-Controlnet-Union-2.1-lite.safetensors | Compared to the full version, control is applied to fewer layers, which weakens the control signal. This makes the model suitable for larger control_context_scale values and gives more natural-looking results. It is also suitable for lower-spec machines. |
| Z-Image-Fun-Controlnet-Tile-2.1.safetensors | A Tile model trained on high-definition datasets (up to 2048×2048) for super-resolution. |
| Z-Image-Fun-Controlnet-Tile-2.1-lite.safetensors | Control latents are applied to fewer layers, which weakens the control signal. This allows larger control_context_scale values with more natural results, and the model is also better suited to lower-spec machines. |
Model Features
- The ControlNet is applied to 15 layer blocks and 2 refiner layer blocks (the Lite models apply it to 3 layer blocks and 2 refiner blocks). It supports multiple control conditions, including Canny, Depth, Pose, MLSD, Scribble, HED and Gray, and can be used like a standard ControlNet.
- Inpainting mode is also supported. When using inpaint mode, please use a larger control_context_scale, as this will result in better image continuity.
- You can adjust control_context_scale for stronger control and better detail preservation. For better stability, we highly recommend using a detailed prompt. The optimal range for control_context_scale is from 0.65 to 1.00.
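The recommended range can be checked before a run with a small guard. This is a minimal sketch and not part of VideoX-Fun: the helper name `check_control_context_scale` and its warn-only behaviour are hypothetical, and only the 0.65 to 1.00 range comes from this card. It warns rather than raises because the Lite models and inpaint mode are described as working with larger values.

```python
import warnings

# Recommended range stated in this model card.
RECOMMENDED_MIN = 0.65
RECOMMENDED_MAX = 1.00


def check_control_context_scale(scale: float) -> float:
    """Return `scale` unchanged, warning if it is outside the recommended range.

    Hypothetical helper for illustration; it is not part of VideoX-Fun.
    """
    if not RECOMMENDED_MIN <= scale <= RECOMMENDED_MAX:
        warnings.warn(
            f"control_context_scale={scale} is outside the recommended "
            f"range [{RECOMMENDED_MIN}, {RECOMMENDED_MAX}]",
            stacklevel=2,
        )
    return scale


# Example: a value inside the range passes through silently.
control_context_scale = check_control_context_scale(0.75)
```

The returned value can then be assigned to the `control_context_scale` setting used for inference.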
Results
[Figure: control image and generated output pairs for Pose (three examples), Canny, HED, Depth and Gray, plus one Low Resolution / High Resolution pair.]
Inference
Go to the VideoX-Fun repository for more details.
Please clone the VideoX-Fun repository and create the required directories:
git clone https://github.com/aigc-apps/VideoX-Fun.git
cd VideoX-Fun
mkdir -p models/Diffusion_Transformer
mkdir -p models/Personalized_Model
Then download the weights: place the base Z-Image model in models/Diffusion_Transformer and the ControlNet weights in models/Personalized_Model.
📦 models/
├── 📂 Diffusion_Transformer/
│   └── 📂 Z-Image/
├── 📂 Personalized_Model/
│   ├── 📦 Z-Image-Fun-Controlnet-Union-2.1.safetensors
│   └── 📦 Z-Image-Fun-Controlnet-Union-2.1-lite.safetensors
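Before running the examples, it can help to confirm that the layout above is in place. The following is a minimal sketch using only the Python standard library. The function name `missing_model_paths` is hypothetical, and the list of expected entries is copied from the tree above, so trim it to the ControlNet file you actually plan to use.

```python
from pathlib import Path

# Entries taken from the directory tree in this card, relative to models/.
EXPECTED = [
    "Diffusion_Transformer/Z-Image",
    "Personalized_Model/Z-Image-Fun-Controlnet-Union-2.1.safetensors",
    "Personalized_Model/Z-Image-Fun-Controlnet-Union-2.1-lite.safetensors",
]


def missing_model_paths(models_dir="models", expected=EXPECTED):
    """Return the expected entries that do not exist under `models_dir`."""
    root = Path(models_dir)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Run from the VideoX-Fun checkout so that "models" resolves correctly.
    missing = missing_model_paths()
    if missing:
        print("Missing under models/:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All expected model files are in place.")
```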
Then run examples/z_image_fun/predict_t2i_control_2.1.py for controlled text-to-image generation, or examples/z_image_fun/predict_i2i_inpaint_2.1.py for inpainting.