Javad Taghia committed

Commit 702ffd2 · 1 Parent(s): 3d7f7af

model card update

Files changed (1): README.md (+61 -0)
# dee-unlearning-tiny-sd

**Model family:** Stable Diffusion | **Base:** SG161222/Realistic_Vision_V4.0 (Diffusers 0.19.0.dev0)

This repository packages the inference components (VAE, UNet, tokenizer, text encoder, scheduler config) that instantiate a `StableDiffusionPipeline` tuned for lightweight experimentation with deep unlearning ideas. All large binaries are stored under Git LFS (`*.bin` and other model artifact extensions as configured in `.gitattributes`).

---

## Model summary

- **Architecture:** `StableDiffusionPipeline` with `UNet2DConditionModel`, `CLIPTextModel`, `AutoencoderKL`, and `DPMSolverMultistepScheduler`.
- **Scheduler:** DPMSolver++ (multistep) configured with `num_train_timesteps=1000`, `steps_offset=1`, and the default `epsilon` prediction type, which matches the diffusion formulation used in Realistic Vision (see the sketch after this list).
- **Intended behavior:** Generate photorealistic samples guided by text prompts. The “tiny” name reflects a focus on a compact deployment bundle rather than a new generative architecture.
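
For reference, a minimal sketch of how those scheduler settings map onto the Diffusers API; the `scheduler` config checked into this repository is authoritative, and the beta-schedule values below are assumptions carried over from the standard Stable Diffusion v1 setup:

```python
from diffusers import DPMSolverMultistepScheduler

# Sketch of the scheduler described above; defer to the repository's
# scheduler config for the exact values used at export time.
scheduler = DPMSolverMultistepScheduler(
    num_train_timesteps=1000,
    steps_offset=1,
    prediction_type="epsilon",
    algorithm_type="dpmsolver++",
    beta_schedule="scaled_linear",  # assumption: SD v1-style noise schedule
    beta_start=0.00085,             # assumption
    beta_end=0.012,                 # assumption
)
```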

## Usage

1. Install dependencies (tested with `diffusers==0.19.0.dev0`, `transformers`, `torch`, `accelerate`, `safetensors`).
2. Load the pipeline with the provided components.

```python
import torch
from diffusers import (
    AutoencoderKL,
    DPMSolverMultistepScheduler,
    StableDiffusionPipeline,
    UNet2DConditionModel,
)
from transformers import CLIPTextModel, CLIPTokenizer

# Assemble the pipeline from the component folders shipped in this repository.
pipeline = StableDiffusionPipeline(
    text_encoder=CLIPTextModel.from_pretrained("path/to/text_encoder"),
    tokenizer=CLIPTokenizer.from_pretrained("path/to/tokenizer"),
    unet=UNet2DConditionModel.from_pretrained("path/to/unet"),
    vae=AutoencoderKL.from_pretrained("path/to/vae"),
    scheduler=DPMSolverMultistepScheduler.from_pretrained("path/to/scheduler"),
    safety_checker=None,           # no safety checker is bundled (see Safety considerations)
    feature_extractor=None,
    requires_safety_checker=False,
)
pipeline.to("cuda")

prompt = "A cinematic portrait of a futuristic astronaut exploring a coral reef"
with torch.autocast("cuda"):
    image = pipeline(prompt, num_inference_steps=25, guidance_scale=7.5).images[0]
```

Replace each `from_pretrained` call with the relative path inside this repository (e.g., `"text_encoder"`). Exported weights follow the standard Diffusers layout, so you can also load the entire pipeline from the repository root in a single call with `StableDiffusionPipeline.from_pretrained(...)` if you prefer one entry point.
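
A minimal sketch of that single-call variant (the local path and half-precision choice are assumptions; use full precision if you are sampling on CPU):

```python
import torch
from diffusers import StableDiffusionPipeline

# Load every component (unet, vae, text_encoder, tokenizer, scheduler) from the
# repository root in one call, relying on the standard Diffusers folder layout.
pipeline = StableDiffusionPipeline.from_pretrained(
    "path/to/repo",             # assumption: local clone of this repository
    torch_dtype=torch.float16,  # optional half precision for GPU inference
    safety_checker=None,        # no safety checker is shipped with this bundle
)
pipeline.to("cuda")
```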

## Known limitations

- Not evaluated on a public benchmark: quality, bias, and safety metrics are unknown beyond the original Realistic Vision baseline.
- Outputs inherit the biases of the base dataset, which can include underrepresentation of marginalized groups and a tendency to hallucinate architecture or people.
- Prompts that contradict physics, are highly abstract, or request disallowed content may fail or produce unpredictable imagery.
- Fine-tuning beyond the provided weights may require additional safety mitigations depending on your dataset.

## Opportunities

1. **Research experimentation:** Use this compact bundle to investigate targeted unlearning strategies or dataset pruning without re-downloading massive checkpoints.
2. **Edge deployment:** Swap in a faster sampler or reduce `num_inference_steps` to explore speed/quality trade-offs for on-device sampling (see the sketch after this list).
3. **Controlled generation:** Attach additional conditioning (CLIP embeddings, ControlNet) to the pipeline for downstream applications such as assistive art tools, conditional rendering, or creative assistants.
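
A minimal sketch of the speed/quality experiment in item 2, assuming `pipeline` was built as in the Usage section; the sampler choice, prompt, and step count are illustrative, not tuned:

```python
from diffusers import EulerAncestralDiscreteScheduler

# Option 1: swap the sampler; from_config keeps the noise schedule consistent
# with the weights that ship in this repository.
pipeline.scheduler = EulerAncestralDiscreteScheduler.from_config(pipeline.scheduler.config)

# Option 2: trade detail for latency by lowering the step count.
fast_image = pipeline(
    "A cinematic portrait of a futuristic astronaut exploring a coral reef",
    num_inference_steps=12,   # fewer denoising steps: faster, usually softer detail
    guidance_scale=7.5,
).images[0]
```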

## Safety considerations

- Follow established safety best practices when generating faces, political imagery, or NSFW prompts; the pipeline does not include a safety checker (one way to attach the standard checker is sketched after this list).
- Monitor outputs for deceptive or fabricated content before deployment in public-facing products.
- Don’t use the model to impersonate real people, create harmful memes, or automate disinformation campaigns.
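
If you need output filtering, one option (not part of this repository) is to load the pipeline with the upstream Stable Diffusion safety checker attached; a minimal sketch, assuming a local clone path and the public `CompVis`/`openai` checkpoints:

```python
from diffusers import StableDiffusionPipeline
from diffusers.pipelines.stable_diffusion.safety_checker import StableDiffusionSafetyChecker
from transformers import CLIPImageProcessor

# Rebuild the pipeline with the safety checker attached so generated images
# are screened for NSFW content before being returned.
checked_pipeline = StableDiffusionPipeline.from_pretrained(
    "path/to/repo",  # assumption: local clone of this repository
    safety_checker=StableDiffusionSafetyChecker.from_pretrained(
        "CompVis/stable-diffusion-safety-checker"
    ),
    feature_extractor=CLIPImageProcessor.from_pretrained(
        "openai/clip-vit-large-patch14"  # assumption: CLIP preprocessor matching the checker
    ),
)
checked_pipeline.to("cuda")
```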

## Attribution & licensing

This work builds on the `SG161222/Realistic_Vision_V4.0` checkpoints and the Diffusers ecosystem. Verify and comply with the upstream license before redistributing or fine-tuning the weights.