---
license: apache-2.0
base_model:
- timm/tf_efficientnetv2_s.in21k_ft_in1k
- Ultralytics/YOLO11
tags:
- comfyui
- object-detection
- face-detection
- face-segmentation
- pytorch
- image-segmentation
---

<div align="center">
<img src="images/header.webp" width="800px" />

Custom-trained models for face detection and segmentation across realistic, anime, and NSFW content.

Made for the **Forbidden Vision** ComfyUI custom nodes

<a href="https://github.com/luxdelux7/ComfyUI-Forbidden-Vision">GitHub Repository</a>  
<a href="https://ko-fi.com/luxdelux" target="_blank">
  <img src="https://ko-fi.com/img/githubbutton_sm.svg" alt="Support me on Ko-fi">
</a>
</div>

---

## 🎯 Why These Models Exist

Traditional face models fail where it matters most for AI art workflows:

| **Problem** | **Why It Matters** |
|-------------|-------------------|
| 🎨 **Domain-locked** | Existing models excel at *either* anime *or* realistic content, never both |
| 🔞 **NSFW blindness** | Models trained only on SFW data tend to break on adult content |
| 👁️‍🗨️ **Detail blindness** | Most models miss fine details such as anime eyebrows and real eyelashes |
| 🎲 **Generation artifacts** | Standard datasets don't include diffusion model quirks and failures |

**These models solve all 4.**

<div align="center">
<img src="./images/masks.webp" alt="Mask Example" style="border-radius: 6px; box-shadow: 0 0 12px rgba(0,0,0,0.1);">
<p><em>The segmentation model predicts face masks that include stylized eyebrows, eyelashes, and similar details.</em></p>
</div>

---

## 📊 Training Foundation

### The Dataset Difference

Built from **14,000+ manually annotated images** across the domains that actually matter for AI generation:

<table>
<tr>
<td width="50%">

**🎨 Multi-Domain Coverage**
- SDXL, SD1.5, Pony, Illustrious outputs
- Curated Danbooru (anime styles)
- Real photography
- Full NSFW inclusion (no filtering)

</td>
<td width="50%">

**💎 Edge Case Priority**
- ✓ Extreme angles & occlusions
- ✓ Failed/broken generations
- ✓ Low-quality artifacts
- ✓ Unusual expressions & poses
- ✓ Everything other models ignore

</td>
</tr>
</table>

### What This Means For You

```
Traditional models: Trained on clean celebrity faces

    Fail on real workflows

These models: Trained on what you actually generate

    Work when you need them
```

**One model family. Every domain. Zero compromises.**

## Model Details

### Face Detection (YOLOv11-Small)

**Purpose:** Primary face detection with high recall

**Training Approach:**
- After each training run, I ran the model on a new mixed dataset, hard-mined its failures, and added them to the training set, repeating until performance was acceptable
- Trained at 640px resolution (inference should use the same resolution)
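
The hard-mining step above can be sketched in plain Python. This is an illustration only, not the actual training tooling: it assumes reference boxes exist for the new batch (for example after manual review), and the names `iou`, `count_errors`, `mine_hard_examples`, and the 0.5 IoU threshold are hypothetical. An image is flagged when the detector misses a face or reports one that is not there.

```python
from typing import Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_errors(
    gt: Sequence[Box], pred: Sequence[Box], iou_thr: float = 0.5
) -> Tuple[int, int]:
    """Return (missed faces, spurious detections) via greedy one-to-one matching."""
    unmatched = list(pred)
    missed = 0
    for g in gt:
        # Pick the still-unmatched prediction that overlaps this face the most.
        best = max(unmatched, key=lambda p: iou(g, p), default=None)
        if best is not None and iou(g, best) >= iou_thr:
            unmatched.remove(best)
        else:
            missed += 1
    return missed, len(unmatched)


def mine_hard_examples(
    gt_by_image: Dict[str, Sequence[Box]],
    pred_by_image: Dict[str, Sequence[Box]],
    iou_thr: float = 0.5,
) -> List[str]:
    """List the images whose predictions disagree with the reference boxes."""
    hard = []
    for name, gt in gt_by_image.items():
        missed, spurious = count_errors(gt, pred_by_image.get(name, []), iou_thr)
        if missed or spurious:
            hard.append(name)
    return hard
```

The flagged images are the ones that would be corrected and added to the training set before the next run.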

**Why YOLOv11-Small instead of nano?**  
More reliable detection across mixed realistic/anime domains, with an acceptable speed tradeoff.

---


### Segmentation (EfficientNet-v2)

**Purpose:** Precise face mask generation

**Training Approach:**
- Dataset prepared using the Forbidden Vision YOLO model at 512px resolution
- Iterative hardmine training in multiple phases:
  - Train on the initial 700 samples
  - Evaluate on remaining images to find failure cases
  - Correct failed masks and add them to the dataset
  - Retrain with the expanded dataset
  - Repeat until failure cases drop to near-zero  
    (final dataset: 4k+ images) 
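
The loop above can be sketched as follows. This is a minimal sketch under stated assumptions, not the training code used for these models: `train` and `reference_mask` are hypothetical callables standing in for the real training run and the manually corrected masks, masks are modeled as sets of pixel coordinates, and the 0.9 IoU threshold is illustrative.

```python
from typing import Callable, List, Set, Tuple

Pixel = Tuple[int, int]
Mask = Set[Pixel]


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two masks given as sets of pixels."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def hardmine(
    seed: List[str],
    pool: List[str],
    train: Callable[[List[str]], Callable[[str], Mask]],
    reference_mask: Callable[[str], Mask],
    iou_thr: float = 0.9,
    max_rounds: int = 10,
) -> List[str]:
    """Grow the training set with failure cases until none remain."""
    dataset = list(seed)
    in_seed = set(seed)
    remaining = [s for s in pool if s not in in_seed]
    for _ in range(max_rounds):
        predict = train(dataset)  # hypothetical: returns a sample -> mask predictor
        # A sample fails when its predicted mask is too far from the corrected mask.
        failures = [
            s for s in remaining
            if mask_iou(predict(s), reference_mask(s)) < iou_thr
        ]
        if not failures:
            break  # no failure cases left on the held-out images
        dataset.extend(failures)
        failed = set(failures)
        remaining = [s for s in remaining if s not in failed]
    return dataset
```

Only the failing samples are added each round, so the dataset grows toward the cases the model finds hard rather than toward more of what it already handles.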

**Features:**
- Detects and includes facial features that other models ignore, such as protruding anime eyebrows and realistic eyelashes that extend beyond the face
- Glasses and similar accessories are treated as part of the face, even when they extend outside the face shape
- NSFW-friendly across anime, realistic, and 3D domains

---

## Usage

These models are automatically downloaded and used by the **Fixer** node in ComfyUI Forbidden Vision.

## License

Apache 2.0

---

## Contact

- GitHub: [ComfyUI-Forbidden-Vision](https://github.com/luxdelux7/ComfyUI-Forbidden-Vision)
- Issues: [GitHub Issues](https://github.com/luxdelux7/ComfyUI-Forbidden-Vision/issues)
- Support: [Ko-fi](https://ko-fi.com/luxdelux)