File size: 4,626 Bytes

---
pipeline_tag: any-to-any
library_name: transformers
tags:
- text-to-image
- image-editing
- image-understanding
- vision-language
- multimodal
- unified-model
- teacher-model
- diffusion
license: mit
---

## 🌌 UniPic3-Teacher-Model
<div align="center">   
  <img src="logo.png" alt="Skywork Logo" width="500"> 
</div>

<p align="center">
  <a href="https://github.com/SkyworkAI/UniPic">
    <img src="https://img.shields.io/badge/GitHub-UniPic-blue?logo=github" alt="GitHub Repo">
  </a>
  <a href="https://github.com/SkyworkAI/UniPic/stargazers">
    <img src="https://img.shields.io/github/stars/SkyworkAI/UniPic?style=social" alt="GitHub Stars">
  </a>
  <a href="https://github.com/SkyworkAI/UniPic/network/members">
    <img src="https://img.shields.io/github/forks/SkyworkAI/UniPic?style=social" alt="GitHub Forks">
  </a>
</p>

## 📖 Introduction
<div align="center"> <img src="unipic3.png" alt="Model Teaser" width="720"> </div>

**UniPic3-Teacher-Model** is the **high-quality teacher diffusion model** used in the UniPic 3.0 framework.
It is trained with **full multi-step diffusion sampling** and optimized for **maximum perceptual quality, semantic consistency, and realism**.

This model serves as the **teacher backbone** for:
- **Distribution Matching Distillation (DMD)**
- **Consistency / trajectory distillation**
- **Few-step student model training**

Rather than being optimized for fast inference, the teacher model prioritizes **generation fidelity and stability**, providing a strong and reliable supervision signal for downstream distilled models.

---

## 🧠 Model Characteristics

- **Role**: Teacher model (not a distilled student)
- **Sampling**: Multi-step diffusion (high-fidelity)
- **Architecture**: Unified UniPic3 Transformer
- **Tasks Supported**:
  - Single-image editing
  - Multi-image composition (2–6 images)
  - Human–Object Interaction (HOI)
- **Resolution**: Flexible, within pixel budget constraints
- **Training Objective**:
  - Flow Matching / Diffusion loss
  - Used as teacher for DMD & consistency training

---

## 📊 Benchmarks
<div align="center"> <img src="unipic3_eval.png" alt="Model Teaser" width="720"> </div>

This teacher model achieves **state-of-the-art performance** on:
- Image editing benchmarks
- Multi-image composition benchmarks

It provides **high-quality supervision targets** for distilled UniPic3 student models.

---

## ⚠️ Important Note

> **This repository hosts the teacher model.**  
> It is **not optimized for few-step inference**.

If you are looking for:
- ⚡ **4–8 step fast inference**
- 🚀 **Deployment-friendly distilled models**

please refer to the **UniPic3-DMD / distilled checkpoints** instead.

---

## 🧠 Usage (Teacher Model)

### 1. Clone the Repository
```bash
git clone https://github.com/SkyworkAI/UniPic
cd UniPic-3
```

### 2. Set Up the Environment
```bash
conda create -n unipic python=3.10
conda activate unipic3
pip install -r requirements.txt
```


### 3.Batch Inference
```bash
transformer_path = "Skywork/Unipic3"

python -m torch.distributed.launch --nproc_per_node=1 --master_port 29501 --use_env \
    qwen_image_edit_fast/batch_inference.py \
    --jsonl_path data/val.jsonl \
    --output_dir work_dirs/output \
    --distributed \
    --num_inference_steps 50 \
    --true_cfg_scale 4.0 \
    --transformer transformer_path \
    --skip_existing
```

## 📄 License
This model is released under the MIT License.

## Citation
If you use Skywork-UniPic in your research, please cite:
```
@article{wang2025skywork,
  title={Skywork unipic: Unified autoregressive modeling for visual understanding and generation},
  author={Wang, Peiyu and Peng, Yi and Gan, Yimeng and Hu, Liang and Xie, Tianyidan and Wang, Xiaokun and Wei, Yichen and Tang, Chuanxin and Zhu, Bo and Li, Changshi and others},
  journal={arXiv preprint arXiv:2508.03320},
  year={2025}
}
```

```
@article{wei2025skywork,
  title={Skywork unipic 2.0: Building kontext model with online rl for unified multimodal model},
  author={Wei, Hongyang and Xu, Baixin and Liu, Hongbo and Wu, Cyrus and Liu, Jie and Peng, Yi and Wang, Peiyu and Liu, Zexiang and He, Jingwen and Xietian, Yidan and others},
  journal={arXiv preprint arXiv:2509.04548},
  year={2025}
}
```

```
@article{wei2026skywork,
  title={Skywork UniPic 3.0: Unified Multi-Image Composition via Sequence Modeling},
  author={Wei, Hongyang and Liu, Hongbo and Wang, Zidong and Peng, Yi and Xu, Baixin and Wu, Size and Zhang, Xuying and He, Xianglong and Liu, Zexiang and Wang, Peiyu and others},
  journal={arXiv preprint arXiv:2601.15664},
  year={2026}
}
```