This code is used for editing vector sketches with text prompts.

<img src='docs/figures/teaser3.gif'>
## Outline

- [Installation](#installation)
- [Quick Start](#quick-start)
- [Citation](#citation)
## Installation

1. Please follow the instructions in [ximinng/DiffSketcher](https://github.com/ximinng/DiffSketcher?tab=readme-ov-file#step-by-step) for step-by-step environment preparation.
2. Download the [CompVis/stable-diffusion-v1-4](https://huggingface.co/CompVis/stable-diffusion-v1-4/tree/main) models and place them locally, following the file structure shown [here](https://github.com/MarkMoHR/DiffSketchEdit/tree/main/StableDiffusionModels/CompVis/stable-diffusion-v1-4).
3. Finally, set the directory path of your downloaded models in `huggingface_model_dict["sd14"]` ([Line 11](https://github.com/MarkMoHR/DiffSketchEdit/blob/main/methods/diffusers_warp/__init__.py#L11) of `./methods/diffusers_warp/__init__.py`).
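For reference, the edit in step 3 amounts to pointing the `sd14` entry at your local weights. A minimal sketch, assuming the dictionary simply maps model tags to directories (the real file may contain more entries):

```python
# Hypothetical sketch of the mapping edited in step 3; the actual dictionary
# in ./methods/diffusers_warp/__init__.py may hold additional entries.
huggingface_model_dict = {
    # Replace with the directory where you placed the downloaded weights.
    "sd14": "./StableDiffusionModels/CompVis/stable-diffusion-v1-4",
}

model_path = huggingface_model_dict["sd14"]
```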
## Quick Start

Open `run_painterly_render.py` and scroll to [Line 81](https://github.com/MarkMoHR/DiffSketchEdit/blob/main/run_painterly_render.py#L81). Then modify the code according to the following instructions:

1. Set one or more seeds, or choose random ones.
2. Choose the editing type: `replace`, `refine` and `reweight` stand for the editing modes Word Swap, Prompt Refinement and Attention Re-weighting, respectively.
3. Set the prompt information.
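The `PromptInfo` objects used in the examples below bundle the prompts with the editing metadata. A rough sketch of the fields they carry, written as a hypothetical dataclass (the actual class in the repository may differ in definition and defaults):

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical stand-in for the repository's PromptInfo; field names follow
# the usage in run_painterly_render.py, but the defaults here are guesses.
@dataclass
class PromptInfo:
    prompts: List[str]                      # one prompt per stage; the first is the original
    token_ind: int                          # token whose cross-attention maps initialize strokes
    changing_region_words: List[List[str]]  # [old_word, new_word] pair per stage
    reweight_word: List[str] = field(default_factory=list)      # "reweight" mode only
    reweight_weight: List[float] = field(default_factory=list)  # "reweight" mode only

info = PromptInfo(prompts=["An evening dress", "An evening dress with sleeves"],
                  token_ind=3,
                  changing_region_words=[["", ""], ["", "sleeves"]])
```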
### Examples

(a) Word Swap (`replace`)

```
seeds_list = [25760]
args.edit_type = "replace"

PromptInfo(prompts=["A painting of a squirrel eating a burger",
                    "A painting of a rabbit eating a burger",
                    "A painting of a rabbit eating a pumpkin",
                    "A painting of a owl eating a pumpkin"],
           token_ind=5,
           changing_region_words=[["", ""], ["squirrel", "rabbit"], ["burger", "pumpkin"], ["rabbit", "owl"]])
```

- `token_ind`: the index of the prompt token whose cross-attention maps are used to initialize the strokes.
- `changing_region_words`: controls local editing. Give a pair of words marking the region that changes at each edit; use empty strings for the first (unedited) prompt.
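As a sanity check on the example above, each pair in `changing_region_words` is simply the word that differs between consecutive prompts. A toy helper (not part of the repository) that recovers it:

```python
def word_swap_pair(src_prompt: str, tgt_prompt: str):
    """Toy helper (not in the repository): return the single differing word
    between two equal-length prompts, i.e. the pair that Word Swap edits."""
    src, tgt = src_prompt.split(), tgt_prompt.split()
    assert len(src) == len(tgt), "Word Swap assumes prompts of equal length"
    diffs = [(a, b) for a, b in zip(src, tgt) if a != b]
    return diffs[0] if diffs else ("", "")

pair = word_swap_pair("A painting of a squirrel eating a burger",
                      "A painting of a rabbit eating a burger")
# → ("squirrel", "rabbit")
```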
| Original image and sketch | Edited image and sketch 1 | Edited image and sketch 2 | Edited image and sketch 3 |
|:-------------:|:-------------------:|:----------------------:|:--------:|
| <img src="docs/figures/replace/ldm_generated_image0.png" style="width: 200px"> | <img src="docs/figures/replace/ldm_generated_image1.png" style="width: 200px"> | <img src="docs/figures/replace/ldm_generated_image2.png" style="width: 200px"> | <img src="docs/figures/replace/ldm_generated_image3.png" style="width: 200px"> |
| <img src="docs/figures/replace/visual_best-rendered0.png" style="width: 200px"> | <img src="docs/figures/replace/visual_best-rendered1.png" style="width: 200px"> | <img src="docs/figures/replace/visual_best-rendered2.png" style="width: 200px"> | <img src="docs/figures/replace/visual_best-rendered3.png" style="width: 200px"> |
(b) Prompt Refinement (`refine`)

```
seeds_list = [53487]
args.edit_type = "refine"

PromptInfo(prompts=["An evening dress",
                    "An evening dress with sleeves",
                    "An evening dress with sleeves and a belt"],
           token_ind=3,
           changing_region_words=[["", ""], ["", "sleeves"], ["", "belt"]])
```

- `changing_region_words`: leave the first word of each pair empty, since refinement only adds new words.
| Original image and sketch | Edited image and sketch 1 | Edited image and sketch 2 |
|:-------------:|:-------------------:|:----------------------:|
| <img src="docs/figures/refine/ldm_generated_image0.png" style="width: 200px"> | <img src="docs/figures/refine/ldm_generated_image1.png" style="width: 200px"> | <img src="docs/figures/refine/ldm_generated_image2.png" style="width: 200px"> |
| <img src="docs/figures/refine/visual_best-rendered0.png" style="width: 200px"> | <img src="docs/figures/refine/visual_best-rendered1.png" style="width: 200px"> | <img src="docs/figures/refine/visual_best-rendered2.png" style="width: 200px"> |
(c) Attention Re-weighting (`reweight`)

```
seeds_list = [35491]
args.edit_type = "reweight"

PromptInfo(prompts=["An emoji face with moustache and smile"] * 3,
           token_ind=3,
           changing_region_words=[["", ""], ["moustache", "moustache"], ["smile", "smile"]],
           reweight_word=["moustache", "smile"],
           reweight_weight=[-1.0, 3.0])
```

- `changing_region_words`: use the same word for both entries of each pair, since the prompt itself does not change.
- `reweight_word` / `reweight_weight`: the word to re-weight and its target weight at each edit (here `moustache` is weakened with -1.0, then `smile` is strengthened with 3.0).
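Conceptually, Attention Re-weighting scales the cross-attention response of the chosen token by the given weight, in the spirit of prompt-to-prompt. A toy illustration of that scaling (the repository's actual attention controller is more involved):

```python
def reweight_token(attn_row, token_idx, weight):
    """Toy illustration (not the repository's code): scale the cross-attention
    value of one text token by `weight`, leaving the other tokens untouched."""
    return [v * weight if i == token_idx else v
            for i, v in enumerate(attn_row)]

# Attention of one pixel over 4 prompt tokens; amplify token 3 by 3.0.
row = [1.0, 1.0, 1.0, 1.0]
scaled = reweight_token(row, token_idx=3, weight=3.0)
# → [1.0, 1.0, 1.0, 3.0]
```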
| Original image and sketch | Edited image and sketch 1 | Edited image and sketch 2 |
|:-------------:|:-------------------:|:----------------------:|
| <img src="docs/figures/reweight/ldm_generated_image0.png" style="width: 200px"> | <img src="docs/figures/reweight/ldm_generated_image1.png" style="width: 200px"> | <img src="docs/figures/reweight/ldm_generated_image2.png" style="width: 200px"> |
| <img src="docs/figures/reweight/visual_best-rendered0.png" style="width: 200px"> | <img src="docs/figures/reweight/visual_best-rendered1.png" style="width: 200px"> | <img src="docs/figures/reweight/visual_best-rendered2.png" style="width: 200px"> |
## Acknowledgement

This project is built upon [ximinng/DiffSketcher](https://github.com/ximinng/DiffSketcher) and [google/prompt-to-prompt](https://github.com/google/prompt-to-prompt). We thank all the authors for their efforts.
## Citation

If you use this code, please cite:

```
@inproceedings{mo2024text,
  title={Text-based Vector Sketch Editing with Image Editing Diffusion Prior},
  author={Mo, Haoran and Gao, Chengying and Wang, Ruomei},
  booktitle={2024 IEEE International Conference on Multimedia and Expo (ICME)},
  pages={1--6},
  year={2024},
  organization={IEEE}
}
```