
	---
	frameworks:
	- Pytorch
	license: Apache License 2.0
	tags: []
	tasks:
	- image-to-image
	base_model:
	- Qwen/Qwen-Image-Edit
	base_model_relation: adapter
	---

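	# Qwen-Image-Edit Face-to-Image Generation Model
	## Model Introduction

	This model is a face-controlled image generation model built on [Qwen-Image-Edit](https://www.modelscope.cn/models/Qwen/Qwen-Image-Edit). Given a cropped face image as input, it generates portrait images of that person.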

	## Results

	|Face|Generated image 1|Generated image 2|Generated image 3|Generated image 4|
	|-|-|-|-|-|
	|![](./assets/qwen_woman_face_crop.png)|![](./assets/qwen_woman_0.jpg)|![](./assets/qwen_woman_1.jpg)|![](./assets/qwen_woman_2.jpg)|![](./assets/qwen_woman_3.jpg)|



	## Inference Code
	```shell
	git clone https://github.com/modelscope/DiffSynth-Studio.git
	cd DiffSynth-Studio
	pip install -e .
	```

	```python
	from diffsynth.pipelines.qwen_image import QwenImagePipeline, ModelConfig
	import torch
	from modelscope import snapshot_download, dataset_snapshot_download
	from PIL import Image

	# Build the pipeline: transformer and processor from Qwen-Image-Edit,
	# text encoder and VAE from Qwen-Image.
	pipe = QwenImagePipeline.from_pretrained(
	    torch_dtype=torch.bfloat16,
	    device="cuda",
	    model_configs=[
	        ModelConfig(model_id="Qwen/Qwen-Image-Edit", origin_file_pattern="transformer/diffusion_pytorch_model*.safetensors"),
	        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="text_encoder/model*.safetensors"),
	        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="vae/diffusion_pytorch_model.safetensors"),
	    ],
	    tokenizer_config=None,
	    processor_config=ModelConfig(model_id="Qwen/Qwen-Image-Edit", origin_file_pattern="processor/"),
	)

	# Download the F2P weights and load them into the DiT as a LoRA.
	snapshot_download("DiffSynth-Studio/Qwen-Image-Edit-F2P", local_dir="models/DiffSynth-Studio/Qwen-Image-Edit-F2P", allow_file_pattern="model.safetensors")
	pipe.load_lora(pipe.dit, "models/DiffSynth-Studio/Qwen-Image-Edit-F2P/model.safetensors")

	# Download the example face crop and open it as an RGB image.
	dataset_snapshot_download(
	    dataset_id="DiffSynth-Studio/example_image_dataset",
	    local_dir="./data/example_image_dataset",
	    allow_file_pattern="f2p/qwen_woman_face_crop.png"
	)
	face_image = Image.open("data/example_image_dataset/f2p/qwen_woman_face_crop.png").convert("RGB")

	# Prompt in English: "Photography. A young woman in a yellow dress stands in a flower field,
	# with colorful flowers and green grass in the background."
	prompt = "摄影。一个年轻女性穿着黄色连衣裙，站在花田中，背景是五颜六色的花朵和绿色的草地。"
	image = pipe(prompt, edit_image=face_image, seed=42, num_inference_steps=40, height=1152, width=864)
	image.save("image.jpg")
	```
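The results table shows several portraits generated from a single face crop. One way such variations might be produced is to repeat the pipeline call with different seeds; the source does not say how the showcased images were actually made. In the sketch below, the helper name `generate_variants` and the choice of seeds are hypothetical. The only thing assumed about `pipe` is the call signature already used in the inference script (`prompt`, `edit_image`, `seed`, `num_inference_steps`, `height`, `width`).

```python
def generate_variants(pipe, prompt, face_image, seeds,
                      num_inference_steps=40, height=1152, width=864):
    """Call the pipeline once per seed and return the results in seed order.

    `pipe` is any callable with the signature used in the inference script.
    This helper and its name are illustrative, not part of DiffSynth-Studio.
    """
    images = []
    for seed in seeds:
        image = pipe(
            prompt,
            edit_image=face_image,
            seed=seed,
            num_inference_steps=num_inference_steps,
            height=height,
            width=width,
        )
        images.append(image)
    return images
```

With `pipe`, `prompt`, and `face_image` defined as in the inference script, `generate_variants(pipe, prompt, face_image, seeds=[0, 1, 2, 3])` would return four images, each of which can be written out with `image.save(...)`.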
	Automatic face cropping:
	```python
	import torch
	from PIL import Image
	import numpy as np
	from insightface.app import FaceAnalysis
	import cv2


	class FaceDetector(torch.nn.Module):
	    """Detects faces with InsightFace and crops the largest one."""

	    def __init__(self):
	        super().__init__()
	        providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
	        provider_options = [{"device_id": 0}, {}]
	        # One detector per detection size; the smaller sizes serve as fallbacks.
	        self.app_640 = FaceAnalysis(name='antelopev2', providers=providers, provider_options=provider_options)
	        self.app_640.prepare(ctx_id=0, det_size=(640, 640))
	        self.app_320 = FaceAnalysis(name='antelopev2', providers=providers, provider_options=provider_options)
	        self.app_320.prepare(ctx_id=0, det_size=(320, 320))
	        self.app_160 = FaceAnalysis(name='antelopev2', providers=providers, provider_options=provider_options)
	        self.app_160.prepare(ctx_id=0, det_size=(160, 160))

	    def _detect_face(self, id_image_cv2):
	        # Try each detection size in turn and return the first non-empty result.
	        face_info = self.app_640.get(id_image_cv2)
	        if len(face_info) > 0:
	            return face_info
	        face_info = self.app_320.get(id_image_cv2)
	        if len(face_info) > 0:
	            return face_info
	        face_info = self.app_160.get(id_image_cv2)
	        return face_info

	    def crop_face(self, id_image):
	        # Convert the RGB PIL image to the BGR array layout used by OpenCV.
	        face_info = self._detect_face(cv2.cvtColor(np.array(id_image), cv2.COLOR_RGB2BGR))
	        if len(face_info) == 0:
	            return None
	        # Keep the face with the largest bounding-box area.
	        bbox = max(face_info, key=lambda x: (x['bbox'][2] - x['bbox'][0]) * (x['bbox'][3] - x['bbox'][1]))['bbox']
	        return id_image.crop(tuple(map(int, bbox)))


	face_detector = FaceDetector()
	# Force three channels so that the RGB-to-BGR conversion also works for RGBA or grayscale files.
	face_image = face_detector.crop_face(Image.open("image_2.jpg").convert("RGB"))
	if face_image is None:
	    raise ValueError("No face detected in image_2.jpg")
	face_image.save("face_crop.jpg")
	```
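The cropping script relies on two small pieces of logic: it tries the detector at progressively smaller detection sizes (640, 320, 160) until one returns a face, and it then keeps the face with the largest bounding-box area. The dependency-free sketch below isolates both steps so they can be checked without InsightFace or a GPU. `first_non_empty`, `bbox_area`, and `largest_bbox` are hypothetical helper names, and the detectors are assumed to be plain callables that return a list of dicts with a `bbox` entry of the form `[x0, y0, x1, y1]`.

```python
def first_non_empty(detectors, image):
    """Run the detectors in order and return the first non-empty result."""
    result = []
    for detect in detectors:
        result = detect(image)
        if len(result) > 0:
            return result
    return result  # empty list when every detector found nothing


def bbox_area(bbox):
    """Area of a box given as [x0, y0, x1, y1]."""
    x0, y0, x1, y1 = bbox
    return (x1 - x0) * (y1 - y0)


def largest_bbox(faces):
    """Return the integer bbox of the largest face, or None if there are no faces."""
    if len(faces) == 0:
        return None
    best = max(faces, key=lambda face: bbox_area(face["bbox"]))
    return [int(v) for v in best["bbox"]]


# Stand-in detectors: the first finds nothing, the second finds two faces.
detectors = [
    lambda img: [],
    lambda img: [{"bbox": [0.0, 0.0, 10.5, 10.5]}, {"bbox": [5.0, 5.0, 50.2, 60.9]}],
]
faces = first_non_empty(detectors, image=None)
print(largest_bbox(faces))  # [5, 5, 50, 60]
```

Returning `None` for an empty detection list mirrors `crop_face`, so callers should check for it before using the result.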