---
license: other
base_model: stabilityai/stable-diffusion-3.5-medium
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
- dreambooth
- redhat
- corporate-branding
- fine-tuned
library_name: diffusers
pipeline_tag: text-to-image
---
# RedHat Dog SD3 - Fine-tuned Stable Diffusion 3.5 Model
## Model Description
This is a fine-tuned version of [Stable Diffusion 3.5 Medium](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium) trained using the Dreambooth technique to generate images of a specific Red Hat branded dog character ("rhteddy").
## Model Details
- **Base Model**: stabilityai/stable-diffusion-3.5-medium
- **Fine-tuning Method**: Dreambooth
- **Training Data**: 5-10 images of Red Hat dog character
- **Training Steps**: 800 steps
- **Resolution**: 512x512 pixels
- **Hardware**: NVIDIA L40S GPU (48GB memory)
## Intended Use
This model is designed for:
- Generating images of the Red Hat dog character in various contexts
- Educational demonstrations of Dreambooth fine-tuning
- Corporate branding and marketing content creation
- Research into personalized diffusion models
## Example
```python
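import torch
from diffusers import DiffusionPipeline

# Load the fine-tuned pipeline in bfloat16 to reduce memory use
pipeline = DiffusionPipeline.from_pretrained(
    "cfchase/redhat-dog-sd3",
    torch_dtype=torch.bfloat16,
)

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
pipeline.to(device)

# Generate an image; the prompt includes the trigger phrase "rhteddy dog"
image = pipeline("photo of a rhteddy dog in a park").images[0]
image.save("redhat_dog_park.png")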
```
### Recommended Prompts
The model works best with prompts that include the trigger phrase `rhteddy dog`:
- `"photo of a rhteddy dog"`
- `"rhteddy dog sitting in an office"`
- `"rhteddy dog wearing a Red Hat"`
- `"rhteddy dog in a technology conference"`
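To keep the trigger phrase in every prompt, a small helper can assemble prompts and render them in a loop. This is a minimal sketch rather than part of the model's tooling: `build_prompt`, `slugify`, `generate_all`, and the `outputs` directory are hypothetical names, and `pipeline` is assumed to be a loaded Diffusers pipeline whose result exposes `.images`.

```python
from pathlib import Path

TRIGGER = "rhteddy dog"  # trigger phrase taken from this model card


def build_prompt(context: str = "", prefix: str = "photo of a") -> str:
    """Return a prompt that always contains the trigger phrase."""
    parts = [prefix.strip(), TRIGGER, context.strip()]
    return " ".join(part for part in parts if part)


def slugify(prompt: str) -> str:
    """Turn a prompt into a filesystem-friendly file stem."""
    cleaned = "".join(c if c.isalnum() else " " for c in prompt.lower())
    return "_".join(cleaned.split())


def generate_all(pipeline, contexts, out_dir="outputs"):
    """Render one image per context and return the saved paths.

    `pipeline` is assumed to behave like a Diffusers pipeline:
    calling it with a prompt returns an object with an `.images` list.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for context in contexts:
        prompt = build_prompt(context)
        image = pipeline(prompt).images[0]
        path = out / f"{slugify(prompt)}.png"
        image.save(path)
        paths.append(path)
    return paths
```

With a pipeline loaded as in the example above, `generate_all(pipeline, ["sitting in an office", "wearing a Red Hat"])` would write one PNG per prompt.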
## Training Details
### Training Configuration
- **Instance Prompt**: "photo of a rhteddy dog"
- **Class Prompt**: "a photo of dog"
- **Learning Rate**: 5e-6
- **Batch Size**: 1
- **Gradient Accumulation Steps**: 2
- **Optimizer**: 8-bit Adam
- **Scheduler**: Constant
- **Prior Preservation**: Enabled with 200 class images
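The settings above imply an effective batch size and a rough number of times each training image is seen. The sketch below works through that arithmetic under stated assumptions: the 800 steps are optimizer updates, training ran on a single process, and each batch slot draws one instance image. `TrainingConfig` and its field names are illustrative, not the actual training script's interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Values from this model card; names are illustrative."""

    train_batch_size: int = 1
    gradient_accumulation_steps: int = 2
    max_train_steps: int = 800  # assumed to count optimizer updates
    learning_rate: float = 5e-6
    num_class_images: int = 200
    num_processes: int = 1  # assumption: single-GPU run

    @property
    def effective_batch_size(self) -> int:
        """Samples contributing to each optimizer update."""
        return (
            self.train_batch_size
            * self.gradient_accumulation_steps
            * self.num_processes
        )

    @property
    def samples_seen(self) -> int:
        """Instance samples processed over the whole run."""
        return self.effective_batch_size * self.max_train_steps

    def passes_per_image(self, num_instance_images: int) -> float:
        """Average number of times each instance image is revisited."""
        return self.samples_seen / num_instance_images


config = TrainingConfig()
print(config.effective_batch_size)   # 2
print(config.samples_seen)           # 1600
print(config.passes_per_image(5))    # 320.0
print(config.passes_per_image(10))   # 160.0
```

Under these assumptions each of the 5-10 instance images is revisited roughly 160 to 320 times, which is consistent with the overfitting risk noted for small DreamBooth datasets.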
### Training Environment
- **Platform**: Red Hat OpenShift AI (RHOAI)
- **Framework**: Hugging Face Diffusers
- **Acceleration**: xFormers, gradient checkpointing
## Model Architecture
This model inherits the architecture of Stable Diffusion 3.5 Medium:
- **Transformer**: SD3Transformer2DModel
- **VAE**: AutoencoderKL
- **Text Encoders**:
  - 2x CLIPTextModelWithProjection
  - 1x T5EncoderModel
- **Scheduler**: FlowMatchEulerDiscreteScheduler
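One way to confirm that a loaded pipeline matches this list is to compare the class names of its components. The sketch below assumes the component keys used by Diffusers' Stable Diffusion 3 pipeline (`transformer`, `vae`, `text_encoder`, `text_encoder_2`, `text_encoder_3`, `scheduler`); those key names and the helper functions are not stated in this card, so treat them as assumptions.

```python
# Expected class per component, taken from the architecture list above.
# The dictionary keys are assumed to match the Diffusers SD3 pipeline.
EXPECTED_CLASSES = {
    "transformer": "SD3Transformer2DModel",
    "vae": "AutoencoderKL",
    "text_encoder": "CLIPTextModelWithProjection",
    "text_encoder_2": "CLIPTextModelWithProjection",
    "text_encoder_3": "T5EncoderModel",
    "scheduler": "FlowMatchEulerDiscreteScheduler",
}


def describe_components(components: dict) -> dict:
    """Map each component name to the class name of the loaded object."""
    return {
        name: type(module).__name__
        for name, module in components.items()
        if module is not None
    }


def find_mismatches(described: dict, expected: dict = EXPECTED_CLASSES) -> dict:
    """Return {name: (expected, actual)} for components that differ or are missing."""
    return {
        name: (cls, described.get(name))
        for name, cls in expected.items()
        if described.get(name) != cls
    }
```

With a loaded pipeline, `find_mismatches(describe_components(pipeline.components))` would return an empty dictionary if every listed component has the expected class.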
## Limitations and Bias
- The model is specifically trained on Red Hat branded imagery and may not generalize well to other contexts
- Training data was limited to a small dataset, which may result in overfitting
- The model inherits any biases present in the base Stable Diffusion 3.5 model
- Performance is optimized for the specific "rhteddy dog" concept and may struggle with significant variations
## Training Data
The training data consists of approximately 5-10 high-quality images of the Red Hat dog character, featuring:
- Various poses and angles
- Consistent visual style and branding
- Professional photography quality
- Clear subject focus
## Technical Specifications
- **Model Size**: ~47GB (full precision weights)
- **Inference Requirements**:
  - GPU with 8GB+ VRAM recommended
  - CUDA-compatible device
  - Python 3.8+
  - PyTorch 2.0+
  - Diffusers library
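The software requirements can be checked before loading the model. The following is a minimal sketch using only the standard library; `check_environment` is a hypothetical helper, and it checks package versions only, not GPU availability or VRAM.

```python
import sys
from importlib import metadata


def check_environment(min_python=(3, 8), min_torch_major=2):
    """Report whether the Python and package versions meet the listed minimums."""
    report = {"python_ok": sys.version_info[:2] >= min_python}

    # Record installed versions, or None when a package is missing.
    for package in ("torch", "diffusers"):
        try:
            report[package] = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = None

    # PyTorch 2.0+ means the major version must be at least 2.
    torch_version = report["torch"]
    report["torch_ok"] = (
        torch_version is not None
        and int(torch_version.split(".")[0]) >= min_torch_major
    )
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A `None` entry for `torch` or `diffusers` indicates the package is not installed in the current environment.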
## License
This model is based on Stable Diffusion 3.5 Medium and is subject to the same licensing terms. Please refer to the [original model license](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium) for details.
## Contact
For questions about this model or the training process, please refer to the [Red Hat OpenShift AI documentation](https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed) or the associated training notebooks.