---
license: creativeml-openrail-m
base_model: runwayml/stable-diffusion-v1-5
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
- lora
- lcm
- latent-consistency-model
datasets:
- Mercity/laion-subset
inference: true
widget:
- text: "a futuristic cyberpunk city at night with neon lights and rain reflections"
  parameters:
    num_inference_steps: 6
    guidance_scale: 1.0
- text: "a portrait of a cat wearing a detective hat, film noir style"
  parameters:
    num_inference_steps: 6
    guidance_scale: 1.0
- text: "a majestic lion standing on a rock, overlooking the african savannah at sunset"
  parameters:
    num_inference_steps: 6
    guidance_scale: 1.0
---

# LCM-LoRA SD1.5 - Checkpoint 1600

**Author:** Juhi Singh | [HuggingFace](https://huggingface.co/juhirats)

## Final Training - Mature

<div align="center">
<img src="https://huggingface.co/Mercity/lcm-lora-sd1.5-1600/resolve/main/comparison_grid.png" alt="Checkpoint 1600 Comparison Grid">
</div>

---

## Part of Checkpoint Series

This is **Checkpoint 1600** in our LCM-LoRA training series. Each checkpoint has different characteristics:

[Checkpoint 400](https://huggingface.co/Mercity/lcm-lora-sd1.5-400) • [Checkpoint 800](https://huggingface.co/Mercity/lcm-lora-sd1.5-800) • [Checkpoint 1200](https://huggingface.co/Mercity/lcm-lora-sd1.5-1200) • **Checkpoint 1600** (current)

---

## Model Description

This checkpoint captures the LCM-LoRA after **1600 training steps** on Stable Diffusion v1.5. It is the final checkpoint in the series.

**Characteristics:**
- Final training checkpoint with mature, consistent outputs. Well-balanced and reliable across all prompts.
- **Best for:** Users who want the most-trained checkpoint, consistent results, production use
- **Quality:** Excellent consistency, balanced outputs, reliable

**Key Features:**
- **~8-12x Faster**: Generate images in 4-6 steps instead of 50 (see Speed Comparison)
- **LoRA Adapter**: Only ~100MB, intended to work with SD1.5-based models
- **Easy Integration**: Loads into a standard diffusers pipeline with `load_lora_weights`
- **Quality**: See the comparison grid above

---

## Checkpoint Comparison

This checkpoint is part of a training series. Compare with other checkpoints:

| Steps | Model | Characteristics |
|-------|-------|-----------------|
| 400 | [lcm-lora-sd1.5-400](https://huggingface.co/Mercity/lcm-lora-sd1.5-400) | Early training checkpoint showing foundational LCM capabilities. Provides decent... |
| 800 | [lcm-lora-sd1.5-800](https://huggingface.co/Mercity/lcm-lora-sd1.5-800) | Mid-training checkpoint with vibrant, artistic outputs. Strong visual impact wit... |
| 1200 | [lcm-lora-sd1.5-1200](https://huggingface.co/Mercity/lcm-lora-sd1.5-1200) | Higher training with more refined outputs. Some prompts may show signs of overfi... |
| **1600** | [lcm-lora-sd1.5-1600](https://huggingface.co/Mercity/lcm-lora-sd1.5-1600) | Final training checkpoint with mature, consistent outputs. Well-balanced and rel... **← This checkpoint** |

---

## Performance Metrics Across Series

Compare training progression and characteristics:

| Steps | Model Link | Style | Speed (RTX 3090) | Best For |
|-------|------------|-------|------------------|----------|
| 400 | [lcm-lora-sd1.5-400](https://huggingface.co/Mercity/lcm-lora-sd1.5-400) | Soft, baseline | 2-3s @ 6 steps | Fast experimentation, understanding earl... |
| 800 | [lcm-lora-sd1.5-800](https://huggingface.co/Mercity/lcm-lora-sd1.5-800) | Vibrant, saturated | 2-3s @ 6 steps | Artistic applications, vibrant aesthetic... |
| 1200 | [lcm-lora-sd1.5-1200](https://huggingface.co/Mercity/lcm-lora-sd1.5-1200) | Balanced, natural | 2-3s @ 6 steps | Balanced colors, natural tones, specific... |
| **1600** | [lcm-lora-sd1.5-1600](https://huggingface.co/Mercity/lcm-lora-sd1.5-1600) | Mature, consistent | 2-3s @ 6 steps | Most training, consistent results, produ... **← Current** |

---

## Visual Comparison Across All Checkpoints

See how outputs evolve across the training series. Each grid shows baseline SD1.5 (50 steps) next to LCM-LoRA at 2, 4, and 6 steps.

### [Checkpoint 400](https://huggingface.co/Mercity/lcm-lora-sd1.5-400)

![Checkpoint 400](https://huggingface.co/Mercity/lcm-lora-sd1.5-400/resolve/main/comparison_grid.png)

### [Checkpoint 800](https://huggingface.co/Mercity/lcm-lora-sd1.5-800)

![Checkpoint 800](https://huggingface.co/Mercity/lcm-lora-sd1.5-800/resolve/main/comparison_grid.png)

### [Checkpoint 1200](https://huggingface.co/Mercity/lcm-lora-sd1.5-1200)

![Checkpoint 1200](https://huggingface.co/Mercity/lcm-lora-sd1.5-1200/resolve/main/comparison_grid.png)

### Checkpoint 1600 (This Checkpoint)

![Checkpoint 1600](https://huggingface.co/Mercity/lcm-lora-sd1.5-1600/resolve/main/comparison_grid.png)

---

## Usage

### Installation

```bash
pip install --upgrade diffusers transformers accelerate
```

### Basic Usage

```python
import torch
from diffusers import StableDiffusionPipeline, LCMScheduler

# Load base SD1.5 model
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16
)
pipe.to("cuda")

# Load this LCM-LoRA checkpoint
pipe.load_lora_weights("Mercity/lcm-lora-sd1.5-1600")

# IMPORTANT: Use LCM scheduler
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

# Generate with just 4-6 steps!
prompt = "a portrait of a cat wearing a detective hat, film noir style"
image = pipe(
    prompt=prompt,
    num_inference_steps=6,
    guidance_scale=1.0
).images[0]

image.save("output.png")
```

### Recommended Settings

```python
num_inference_steps = 6  # Optimal for this checkpoint
guidance_scale = 1.0     # Required for LCM
```

---

## Training Details

| Parameter | Value |
|-----------|-------|
| **Checkpoint** | 1600 |
| **Base Model** | runwayml/stable-diffusion-v1-5 |
| **Training Steps** | 1600 |
| **Dataset** | Mercity/laion-subset |
| **LoRA Rank** | 96 |
| **LoRA Alpha** | 96 |
| **Resolution** | 512×512 |
| **Batch Size** | 64 |
| **Learning Rate** | 1e-4 |
| **Optimizer** | AdamW |
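
To make the rank and alpha values in the table concrete, the sketch below computes the LoRA scaling factor and the number of trainable parameters one adapted layer would add. It assumes the common LoRA convention in which the low-rank update is scaled by `alpha / rank`; the training script for this checkpoint is not shown here, so that convention is an assumption. The 320×320 layer size and both helper names are hypothetical, chosen only for illustration.

```python
# Values taken from the Training Details table.
LORA_RANK = 96
LORA_ALPHA = 96


def lora_scale(alpha: int, rank: int) -> float:
    """Scaling applied to the low-rank update, assuming the alpha / rank convention."""
    return alpha / rank


def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """Trainable parameters for one adapted linear layer.

    LoRA adds two matrices: A (rank x in_features) and B (out_features x rank).
    """
    return rank * (in_features + out_features)


if __name__ == "__main__":
    # Hypothetical 320 -> 320 projection, used only as an example size.
    print("scale:", lora_scale(LORA_ALPHA, LORA_RANK))
    print("LoRA params:", lora_param_count(320, 320, LORA_RANK))
    print("full layer params:", 320 * 320)
```

With rank equal to alpha, the scaling factor under this convention is 1.0, so the adapter's update is applied at its learned magnitude.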

---

## Sample Outputs

The comparison grid above shows outputs from this checkpoint at 2, 4, and 6 inference steps, compared to standard SD1.5 at 50 steps.

**Prompts included:**
1. Futuristic cyberpunk city with neon lights and rain reflections
2. Portrait of a cat wearing a detective hat, film noir style
3. Cozy coffee shop interior with warm lighting and plants
4. Ancient Japanese temple in misty mountain landscape at sunrise
5. Majestic lion on rock overlooking African savannah at sunset
6. Magical forest with glowing blue mushrooms and fireflies
7. Vintage red steam locomotive crossing stone viaduct over canyon

<details>
<summary>View individual samples</summary>

All sample images for this checkpoint are available in the `samples/` directory.

</details>

### Out-of-Distribution (OOD) Validation Images

To test generalization beyond the training distribution, we generated images for 5 OOD prompts that are deliberately different from the training prompts:

1. **Underwater Scene**
   - *"underwater coral reef with colorful fish and sea anemones, crystal clear water, natural sunlight filtering through"*
   - Tests: Water effects, marine life, underwater lighting (not in training)

2. **Space/Astronomy**
   - *"astronaut floating in space with earth in background, stars and galaxies, cinematic lighting, 4k"*
   - Tests: Zero gravity, cosmic environment, space rendering (not in training)

3. **Food Photography**
   - *"gourmet chocolate cake with berries on elegant plate, professional food photography, soft studio lighting"*
   - Tests: Food textures, studio lighting, product photography (not in training)

4. **Human Portrait**
   - *"close-up portrait of elderly man with weathered face and kind eyes, dramatic side lighting, black and white"*
   - Tests: Human facial features, skin texture, B&W conversion (training had a cat portrait, not a human close-up)

5. **Abstract Art**
   - *"abstract watercolor painting with flowing colors, pink and blue gradient, artistic ethereal style"*
   - Tests: Non-representational art, color blending (training was all representational)

**Why OOD Validation?** These prompts test whether the model learned general concepts rather than memorizing the training prompts. Good OOD performance indicates robust generalization.

All validation images can be found in the `validation/` directory. See [`validation/prompts.txt`](validation/prompts.txt) for the complete list of prompts used.

---

## Performance

### Speed Comparison

| Method | Steps | Time (A100) | Time (RTX 3090) |
|--------|-------|-------------|-----------------|
| SD1.5 Default | 50 | ~15s | ~25s |
| SD1.5 Fast | 25 | ~8s | ~13s |
| **LCM-LoRA (this)** | **6** | **~2s** | **~3s** |
| **LCM-LoRA (this)** | **4** | **~1.5s** | **~2s** |
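
The speedups implied by the table can be checked with a few lines of Python. The timings below are the approximate values from the table; the dictionary layout and the `speedup` helper are illustrative names, not part of any library.

```python
# Approximate timings in seconds, copied from the Speed Comparison table.
TIMINGS = {
    "SD1.5 Default (50 steps)": {"A100": 15.0, "RTX 3090": 25.0},
    "SD1.5 Fast (25 steps)": {"A100": 8.0, "RTX 3090": 13.0},
    "LCM-LoRA (6 steps)": {"A100": 2.0, "RTX 3090": 3.0},
    "LCM-LoRA (4 steps)": {"A100": 1.5, "RTX 3090": 2.0},
}


def speedup(baseline: str, method: str, gpu: str) -> float:
    """Return how many times faster `method` is than `baseline` on `gpu`."""
    return TIMINGS[baseline][gpu] / TIMINGS[method][gpu]


if __name__ == "__main__":
    baseline = "SD1.5 Default (50 steps)"
    for gpu in ("A100", "RTX 3090"):
        for method in ("LCM-LoRA (6 steps)", "LCM-LoRA (4 steps)"):
            print(f"{gpu}: {method} is {speedup(baseline, method, gpu):.1f}x faster")
```

Because the table entries are rounded, the resulting ratios are rough estimates rather than benchmarks.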

### Quality Progression

- **2 steps**: Fast, captures main composition
- **4 steps**: Good balance, suitable for most cases
- **6 steps**: Recommended; best trade-off between quality and speed
- **8 steps**: Slightly better than 6 steps, with diminishing returns
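
One way to see this progression on your own prompts is to run the same prompt at each step count. The sketch below is a minimal helper under stated assumptions: `sweep_steps` is a hypothetical name, and `pipe` is assumed to be a pipeline already set up as in Basic Usage (any callable that accepts these keyword arguments and returns an object with an `.images` list will work).

```python
from typing import Any, Callable, Dict, Iterable


def sweep_steps(
    pipe: Callable[..., Any],
    prompt: str,
    steps: Iterable[int] = (2, 4, 6, 8),
    guidance_scale: float = 1.0,
) -> Dict[int, Any]:
    """Generate one image per step count and return them keyed by step count."""
    results: Dict[int, Any] = {}
    for n in steps:
        # Same prompt and guidance for every run; only the step count changes.
        output = pipe(
            prompt=prompt,
            num_inference_steps=n,
            guidance_scale=guidance_scale,
        )
        results[n] = output.images[0]
    return results


# Usage, assuming `pipe` was created as in Basic Usage:
# images = sweep_steps(pipe, "a portrait of a cat wearing a detective hat")
# for n, image in images.items():
#     image.save(f"steps_{n}.png")
```

For a fairer comparison you would likely also want to fix the random seed for each call, for example through the pipeline's `generator` argument.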

---

## Series Information

### Training Progression

This checkpoint is part of a training series showing LCM-LoRA evolution:

```
Training Steps:  400   -->   800   -->   1200   -->   1600
                  |           |           |            |
Quality:       Baseline     Peak       Refined      Mature
Style:           Soft      Vibrant     Balanced     Stable
```

### Download All Checkpoints

```bash
# Download all checkpoints for comparison
huggingface-cli download Mercity/lcm-lora-sd1.5-400
huggingface-cli download Mercity/lcm-lora-sd1.5-800
huggingface-cli download Mercity/lcm-lora-sd1.5-1200
huggingface-cli download Mercity/lcm-lora-sd1.5-1600
```
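
The same downloads can be scripted from Python. This is a sketch that assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function; the helper names `checkpoint_repo_ids` and `download_all` are hypothetical.

```python
from typing import Dict, Iterable, List

ORG = "Mercity"
STEPS = (400, 800, 1200, 1600)


def checkpoint_repo_ids(steps: Iterable[int] = STEPS) -> List[str]:
    """Build the repository ID for each checkpoint in the series."""
    return [f"{ORG}/lcm-lora-sd1.5-{step}" for step in steps]


def download_all(steps: Iterable[int] = STEPS) -> Dict[str, str]:
    """Download every checkpoint and return a map of repo ID to local path."""
    # Imported lazily so checkpoint_repo_ids works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return {
        repo_id: snapshot_download(repo_id=repo_id)
        for repo_id in checkpoint_repo_ids(steps)
    }
```

Calling `download_all()` fetches all four repositories into the local Hugging Face cache, which is convenient when switching between checkpoints in one script.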

---

## Usage Tips

### For Best Results

1. **Always use `LCMScheduler`** - Required for few-step LCM sampling
2. **Set `guidance_scale=1.0`** - This disables classifier-free guidance (CFG), which doesn't work with this LCM-LoRA
3. **Use 4-8 steps** - 6 steps is the recommended default
4. **Same prompts as SD1.5** - No special prompting needed
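
These tips can be encoded as a small guard so that out-of-range settings fail early. The sketch below is one possible approach, not part of diffusers: `LCMSettings` is a hypothetical class, and the 4-8 step range and fixed guidance value simply mirror the list above.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class LCMSettings:
    """Generation settings restricted to the ranges recommended above."""

    num_inference_steps: int = 6
    guidance_scale: float = 1.0

    def __post_init__(self) -> None:
        # Tip 3: stay within 4-8 steps.
        if not 4 <= self.num_inference_steps <= 8:
            raise ValueError("num_inference_steps should be between 4 and 8")
        # Tip 2: guidance_scale is fixed at 1.0 for this checkpoint.
        if self.guidance_scale != 1.0:
            raise ValueError("guidance_scale should be 1.0")

    def as_kwargs(self) -> Dict[str, Union[int, float]]:
        """Return keyword arguments to pass to the pipeline call."""
        return {
            "num_inference_steps": self.num_inference_steps,
            "guidance_scale": self.guidance_scale,
        }


# Usage, assuming `pipe` was created as in Basic Usage:
# image = pipe(prompt="a cozy coffee shop", **LCMSettings().as_kwargs()).images[0]
```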

### Checkpoint Selection

- **Testing/comparison?** Try several checkpoints on your own prompts to find your preference
- **Different characteristics:** 400 is soft, 800 is vibrant and saturated, 1200 is balanced and natural, 1600 is mature and consistent
- **Training progression:** Comparing the grids shows how the model evolves with more training

---

## Limitations

- Trained at 512×512 resolution (best results at this size)
- Requires `LCMScheduler` - other schedulers won't work
- `guidance_scale` must be 1.0 (CFG is incompatible with this LCM-LoRA)
- Each checkpoint has slightly different characteristics

---

## Citation

If you use this model in your research, please cite:

```bibtex
@article{luo2023latent,
  title={Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference},
  author={Luo, Simian and Tan, Yiqin and Huang, Longbo and Li, Jian and Zhao, Hang},
  journal={arXiv preprint arXiv:2310.04378},
  year={2023}
}

@article{hu2021lora,
  title={LoRA: Low-Rank Adaptation of Large Language Models},
  author={Hu, Edward J and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Shean and Wang, Lu and Chen, Weizhu},
  journal={arXiv preprint arXiv:2106.09685},
  year={2021}
}
```

---

## License

This model is released under the same license as Stable Diffusion v1.5:
- **CreativeML Open RAIL-M License**
- Commercial use allowed with restrictions
- See: https://huggingface.co/spaces/CompVis/stable-diffusion-license

---

## Acknowledgments

- **Base Model**: [Stable Diffusion v1.5](https://huggingface.co/runwayml/stable-diffusion-v1-5)
- **LCM Method**: [Latent Consistency Models](https://arxiv.org/abs/2310.04378)
- **LoRA Method**: [Low-Rank Adaptation](https://arxiv.org/abs/2106.09685)
- **Training Framework**: [Diffusers](https://github.com/huggingface/diffusers)

---

## More Information

- **Other checkpoints in series**: [Checkpoint 400](https://huggingface.co/Mercity/lcm-lora-sd1.5-400) • [Checkpoint 800](https://huggingface.co/Mercity/lcm-lora-sd1.5-800) • [Checkpoint 1200](https://huggingface.co/Mercity/lcm-lora-sd1.5-1200) • **Checkpoint 1600** (current)
- **Discussions**: [Model discussions](https://huggingface.co/Mercity/lcm-lora-sd1.5-1600/discussions)
- **Report issues**: [Community tab](https://huggingface.co/Mercity/lcm-lora-sd1.5-1600/discussions)