Gerold Meisinger committed · Commit d3bbaf6
Parent(s): 0bb179b
control-edgedrawing-cv480edpf-fastdup-fp16-checkpoint-45000

README.md CHANGED
@@ -8,18 +8,20 @@ tags:
- controlnet
---

-Controls image generation by edge maps generated with [Edge Drawing](https://github.com/CihanTopal/ED_Lib). Edge Drawing comes in different flavors: original (_ed_), parameter free (_edpf_), color (_edcolor_).

* Based on my monologs at [github.com - Edge Drawing](https://github.com/lllyasviel/ControlNet/discussions/318)
* For usage see the model page on [civitai.com - Model](https://civitai.com/models/149740).
-* To generate edpf maps you can use
-* For evaluation see the corresponding .zip with images
-* To run your own evaluations you can use

**Edge Drawing Parameter Free**



**Example**

sampler=UniPC steps=20 cfg=7.5 seed=0 batch=9 model: v1-5-pruned-emaonly.safetensors cherry-picked: 1/9
@@ -28,8 +30,6 @@ prompt: _a detailed high-quality professional photo of swedish woman standing in



-_Clear and pristine! Wooow!_
-
**Canny Edge for comparison (default in Automatic1111)**


@@ -66,10 +66,10 @@ accelerate launch train_controlnet.py ^

To evaluate the model it makes sense to compare it with the original Canny model. Original evaluations and comparisons are available at [ControlNet 1.0 repo](https://github.com/lllyasviel/ControlNet), [ControlNet 1.1 repo](https://github.com/lllyasviel/ControlNet-v1-1-nightly), [ControlNet paper v1](https://arxiv.org/abs/2302.05543v1), [ControlNet paper v2](https://arxiv.org/abs/2302.05543) and the [Diffusers implementation](https://huggingface.co/takuma104/controlnet_dev/tree/main). Some points we have to keep in mind when comparing canny with edpf in order not to compare apples with oranges:
* the canny 1.0 model was trained on 3M images with fp32, the canny 1.1 model on even more, while the edpf model so far is only trained on 180k-360k images with fp16.
* the canny edge detector requires parameter tuning while edpf is parameter free.
-*
-* Would the canny model actually benefit from an edpf pre-processor, so that we might not even require an edpf model? (2023-09-25: see `eval_canny_edpf.zip`, but it seems as if it doesn't work and the edpf model may be justified)
* When evaluating human images we need to be aware of Stable Diffusion's inherent limits, like deformed faces and hands, and not attribute them to the ControlNet.
-* When evaluating style we need to be aware of the bias from the image dataset (`laion2b-en-aesthetics65`), which might tend to

# Versions
@@ -129,16 +129,50 @@ see experiment 3.0. restarted from 0 with `--proportion_empty_prompts=0.5` => re

**Experiment 4.1 - 2023-09-26 - control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-90000**

-resumed from 45000 steps with left-right flipped images => results are still not good, 50% is probably too much for

# Ideas

-*
-* cleanup image dataset (l65)
-* uncropped mod64 images
* integrate edcolor
-*
-*
* re-train with fp32

# Question and answers
- controlnet
---

+Controls image generation by edge maps generated with [Edge Drawing](https://github.com/CihanTopal/ED_Lib). Note that Edge Drawing comes in different flavors: original (_ed_), parameter free (_edpf_), color (_edcolor_).

* Based on my monologs at [github.com - Edge Drawing](https://github.com/lllyasviel/ControlNet/discussions/318)
* For usage see the model page on [civitai.com - Model](https://civitai.com/models/149740).
+* To generate edpf maps you can use [this space](https://huggingface.co/spaces/GeroldMeisinger/edpf) or [this script at gitlab.com](https://gitlab.com/-/snippets/3601881).
+* For evaluation see the corresponding .zip files with images in this repository.
+* To run your own evaluations you can use [this script at gitlab.com](https://gitlab.com/-/snippets/3602096).

**Edge Drawing Parameter Free**



+_Clear and pristine! Wooow!_
+
**Example**

sampler=UniPC steps=20 cfg=7.5 seed=0 batch=9 model: v1-5-pruned-emaonly.safetensors cherry-picked: 1/9



**Canny Edge for comparison (default in Automatic1111)**


To evaluate the model it makes sense to compare it with the original Canny model. Original evaluations and comparisons are available at [ControlNet 1.0 repo](https://github.com/lllyasviel/ControlNet), [ControlNet 1.1 repo](https://github.com/lllyasviel/ControlNet-v1-1-nightly), [ControlNet paper v1](https://arxiv.org/abs/2302.05543v1), [ControlNet paper v2](https://arxiv.org/abs/2302.05543) and the [Diffusers implementation](https://huggingface.co/takuma104/controlnet_dev/tree/main). Some points we have to keep in mind when comparing canny with edpf in order not to compare apples with oranges:
* the canny 1.0 model was trained on 3M images with fp32, the canny 1.1 model on even more, while the edpf model so far is only trained on 180k-360k images with fp16.
* the canny edge detector requires parameter tuning while edpf is parameter free.
+* Should we manually fine-tune canny to find the perfect input image or do we leave it at its defaults? We could argue that "no fine-tuning required" is the USP of edpf and we want to compare in the default setting, whereas canny fine-tuning is subjective.
+* Would the canny model actually benefit from an edpf pre-processor, so that we might not even require a specialized edpf model? (2023-09-25: see `eval_canny_edpf.zip`, but it seems as if it doesn't work and the edpf model may be justified)
* When evaluating human images we need to be aware of Stable Diffusion's inherent limits, like deformed faces and hands, and not attribute them to the ControlNet.
+* When evaluating style we need to be aware of the bias from the image dataset (`laion2b-en-aesthetics65`), which might tend towards generating "aesthetic" images rather than actually working "intrinsically better".

# Versions


**Experiment 4.1 - 2023-09-26 - control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-90000**

+resumed from 45000 steps with left-right flipped images until 90000 steps => results are still not good, 50% is probably also too much for 90k steps. guessmode still doesn't work and tends to produce humans. aborting.
+
+**Experiment 5.0 - 2023-09-28 - control-edgedrawing-cv480edpf-fastdup-fp16-checkpoint-45000**
+
+see experiment 3. cleaned original images following the [fastdup introduction](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/cleaning-image-dataset.ipynb) resulting in:
+```
+180210 images in total
+67854 duplicates
+644 outliers
+26 too dark
+321 too bright
+57 blurry
+68621 unique removed (that's 38%!)
+```
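As a quick sanity check on the numbers above (the per-category counts overlap, so only the "unique removed" count is compared against the total):

```python
total = 180210            # images in total
unique_removed = 68621    # unique images removed by the cleaning pass

pct = unique_removed / total * 100
print(f"removed {pct:.0f}% of the dataset")
```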
+
+restarted from 0 with left-right flipped images and `--mixed-precision="no"` to create a master release, and converted it to fp16 afterwards.
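The fp16 conversion step can be sketched with PyTorch (a sketch; the actual conversion script used for this release is not included in the repo):

```python
import torch

def convert_to_fp16(state: dict) -> dict:
    """Cast every floating-point tensor in a state dict to fp16;
    non-float tensors (e.g. step counters) are left untouched."""
    return {k: v.half() if v.is_floating_point() else v
            for k, v in state.items()}
```

For a `.safetensors` checkpoint, the state dict can be read and written with `safetensors.torch.load_file` and `save_file`.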
+
+**Experiment 6.0 - control-edgedrawing-cv480edpf-rect-fp16-checkpoint-XXXXX**
+
+see experiment 5.0.
+* included images with aspect ratio > 2
+* resized images with short side to 512 which gives us rectangular images instead of 512x512 squares
+* center-cropped images to 512x(n*64) (to make them SD compatible) and max long side 1024
+* sorted duplicates by `similarity` value from `laion2b-en-aesthetics65` to get the best `text` of all duplicates
+
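My reading of the resize-and-crop rule above as a small helper (a sketch; the actual preprocessing script is not shown in this README):

```python
def target_size(w: int, h: int, short: int = 512, long_max: int = 1024) -> tuple[int, int]:
    """Scale so the short side becomes 512, then crop the long side down
    to the nearest multiple of 64 (SD-compatible), capped at 1024."""
    if w <= h:
        w, h = short, round(h * short / w)
        h = min(h // 64 * 64, long_max)  # crop long side
    else:
        w, h = round(w * short / h), short
        w = min(w // 64 * 64, long_max)  # crop long side
    return w, h

print(target_size(1920, 1080))  # e.g. a 16:9 image becomes (896, 512)
```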
+```
+183410 images in total
+75686 duplicates
+381 outliers
+50 too dark
+436 too bright
+31 blurry
+76288 unique removed (that's 42%!)
+```
+
+restarted from 0 with `--mixed-precision="fp16"`.

# Ideas

+* make conceptual captions for laion
* integrate edcolor
+* try to fine-tune from canny
+* image dataset with better captions (cc3m)
+* remove images by semantics (use only photos, paintings etc. for edge detection)
* re-train with fp32

# Question and answers
control-edgedrawing-cv480edpf-fastdup-fp16-checkpoint-45000.safetensors ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ea1d8bf1f7e7b5dbb501aeb3c294cf60120c6b56575c16173f43ffacc68f4a8d
+size 722598616