Author: Samuele Bonzio (Samael1976)
Date: 05/06/2025
Model Version: OmnIA© v1.0
Base Architecture: UNet
License: CreativeML Open RAIL++-M
Abstract: OmnIA© is an advanced text-to-image generative model engineered to give creators versatility and fine-grained control over image synthesis across a diverse range of artistic styles. Developed as a distinct foundational model, OmnIA© incorporates a novel approach to style conditioning, robust prompt adherence, and high-fidelity image output. OmnIA© has also been specifically designed to serve as a base model for further training by other trainers, offering a more versatile and robust foundation because multiple styles are deeply incorporated into it. This document provides an overview of OmnIA©'s architecture, its multi-style generation capabilities, its training methodology, usage guidelines with illustrative examples, and a discussion of its current limitations and future development roadmap. The model is trained on a meticulously curated, high-resolution (2K) dataset of 10,500 images, enabling generation of realistic, semi-realistic, anime, western comics, 2.5D, and specific animation-inspired styles without reliance on conventional tags.
1. Introduction
• Background and Motivation
The rapid evolution of text-to-image generative models has unlocked significant creative potential. However, many existing solutions are constrained to specific stylistic domains or require intricate prompting techniques (e.g., "score tags") to achieve the desired outputs. This can limit artistic expression and introduce a steep learning curve for users. OmnIA© was conceived to address these limitations, aiming to create a self-sufficient, highly versatile foundational model that simplifies multi-style generation while ensuring high fidelity and adherence to user prompts. The primary motivation was to develop a model that offers both broad stylistic coverage and intuitive control, serving as a base for both direct use and further specialized training.
• OmnIA©: A Foundational Model
While OmnIA© leverages [...], it has been intentionally developed to establish itself as a distinct, standalone foundational model. This distinction is crucial, as OmnIA©'s training regimen and dataset are designed to produce more diversified stylistic results and offer a more adaptable starting point than its predecessors or other contemporaneous models.
• Document Scope
This document details the technical specifications, features, and operational guidelines for the OmnIA© model. It is intended for users who want to understand its capabilities and optimize their generation workflows, and for researchers interested in its underlying methodology.
2. Core Features and Stylistic Capabilities
OmnIA© is distinguished by a suite of features designed for flexible and high-quality image generation:
2.1.0. Advanced Multi-Style Conditioning OmnIA© introduces a sophisticated system for style control, achievable through specific keywords or inferred directly from descriptive prompts.
2.1.1. Explicit Style Keywords: These keywords allow for direct and precise conditioning of the output style:
• The keyword 4n1m3rg3 targets Realistic and Semi-Realistic Styles. Use of this keyword guides OmnIA© to produce outputs with photographic qualities, plausible lighting, detailed textures, and accurate anatomy. Semi-realistic outputs retain these qualities but may incorporate a slightly stylized finish, akin to high-quality digital paintings. For instance, a prompt including 4n1m3rg3 might generate a realistic portrait of an elderly scholar, with every wrinkle and fabric texture rendered convincingly.
• The keyword 4n1v3rs3 induces a 2.5D Illustration Style. This typically results in images with a strong sense of depth and dimensionality, often resembling concept art or stylized game graphics, but retaining an illustrative, painterly or digitally rendered aesthetic. An example output might be a dynamic concept art piece of a futuristic city, where buildings have tangible form yet the overall feel is artistic and illustrative.
• The keyword 4n1t00n is dedicated to Anime Style Generation. Using this keyword produces visuals characteristic of Japanese animation, including cel-shading, distinct and expressive facial features, dynamic compositions, and vibrant color palettes. A typical generation could be a vibrant anime scene depicting a magical girl in mid-transformation, complete with sparkling effects and speed lines.
• The keywords 4n1m1cs and/or Western Comics generate images in American Comic Book Style. This can range from classic superhero aesthetics with bold inks, dynamic shading, and Ben Day dots, to more modern graphic novel looks with sophisticated coloring and composition. An example might be a gritty noir comic panel of a detective illuminated by a single streetlamp, rendered with sharp contrasts.
• Animation-Inspired Styles (e.g., using terms like Pixar, Disney): Keywords referencing specific animation studios or highly influential franchises (e.g., Dragonball, One Piece) can be used to evoke their characteristic visual language. These broader tags work only to a limited extent, and the results may not always be perfect. OmnIA© has been trained to recognize and replicate key elements of these styles, such as character proportions, color palettes, and typical scene compositions. A prompt with Pixar might result in a heartwarming, round-faced character with expressive eyes and soft texturing.
2.1.2. Inherent Style Inference: Crucially, the use of style keywords is not mandatory. OmnIA© is designed to infer the desired artistic style from the textual description alone if no explicit keywords are provided, offering a more natural prompting experience. For example, a prompt like "a cyberpunk cityscape at night, raining, neon signs" will likely produce a suitable image without needing an explicit style keyword.
2.1.3. Style Weighting and Blending: Keywords can be assigned numerical weights using the keyword:weight syntax (for example, 4n1t00n:1.2 combined with 4n1m3rg3:0.5) to modulate their influence, allowing nuanced blending of different artistic styles within a single generation. This combinatorial approach greatly expands the creative possibilities. For example, an image might feature a character with clear anime linework and facial structure due to a higher weight on 4n1t00n, but with semi-realistic texturing on their clothing and armor influenced by a lower weight on 4n1m3rg3.
2.1.4. Style Negation: Style keywords can be effectively utilized within negative prompts to steer the generation away from particular styles, further refining the output. If a user desires a realistic portrait but finds hints of anime style appearing, adding 4n1t00n to the negative prompt can help suppress those elements.
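The weighting, blending, and negation rules in Sections 2.1.3 and 2.1.4 can be sketched as a small prompt-building helper. This is only an illustration: the function names (weighted, blend, negate) and the example description are hypothetical, and the sketch simply reproduces the keyword:weight notation used in this document. How a given user interface parses that notation is an assumption outside the scope of the sketch.

```python
from typing import Mapping, Optional

# Style keywords listed in Section 2.1.1.
STYLE_KEYWORDS = ("4n1m3rg3", "4n1v3rs3", "4n1t00n", "4n1m1cs")


def weighted(keyword: str, weight: Optional[float] = None) -> str:
    """Format a style keyword as 'keyword' or 'keyword:weight'."""
    if keyword not in STYLE_KEYWORDS:
        raise ValueError(f"unknown style keyword: {keyword!r}")
    if weight is None:
        return keyword
    # Ordinary weights print as written (1.2 -> "1.2", 0.5 -> "0.5").
    return f"{keyword}:{weight}"


def blend(styles: Mapping[str, Optional[float]], description: str) -> str:
    """Place the weighted style keywords first, then the descriptive prompt."""
    tags = [weighted(k, w) for k, w in styles.items()]
    return ", ".join(tags + [description])


def negate(styles: Mapping[str, Optional[float]], base_negative: str = "") -> str:
    """Prepend style keywords to a negative prompt to steer away from them."""
    tags = [weighted(k, w) for k, w in styles.items()]
    return ", ".join(tags + ([base_negative] if base_negative else []))


# Hypothetical example: anime linework with some realistic texturing.
positive = blend({"4n1t00n": 1.2, "4n1m3rg3": 0.5},
                 "knight in ornate armor, castle background")
negative = negate({"4n1v3rs3": None}, "low res, blurry")
print(positive)
print(negative)
```

Keeping the style keywords in a mapping makes it easy to adjust one weight at a time while the descriptive part of the prompt stays fixed.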
2.2.0. High Prompt Fidelity and Coherence OmnIA© is engineered for precise adherence to textual prompts. The training process (detailed in Section 5) prioritizes the accurate interpretation of complex descriptions, resulting in images that closely match user intent in terms of subject matter, attributes, and composition. Image coherence is also a key focus, minimizing artifacts and ensuring internal consistency within the generated image.
2.3.0. Elimination of Mandatory "Score Tags" Unlike many models that rely on explicit "score" or “many tags” (e.g., score_9, score_8_up) to achieve high-quality output, OmnIA© is designed to produce high-quality images based on descriptive prompting alone. This simplifies the user experience and encourages more natural language interaction.
2.4.0. Robust Foundational Capabilities OmnIA© serves as a highly adaptable foundational model, making it an excellent starting point for further specialized fine-tuning on more specific datasets or artistic styles, should users wish to develop derivative models.
3. Generation Showcase and Extended Prompting Examples
This section provides a broader range of examples demonstrating OmnIA©'s capabilities, including prompt structures and key generation parameters. The style examples below share the same full metadata and differ only in the style tag.
3.1.0. Demonstrating Realistic Style (using 4n1m3rg3) Example 3.1.3: Stylized Character Art • Image Description: A captivating portrait of a young woman with striking red hair, smiling gently while posing outdoors. Her features are illuminated by natural light, highlighting a serene and engaging expression. • Positive Prompt: 4n1m3rg3, portrait, girl, ginger, cute, seductive, innocent, light smile:0.3, plump lips, slender body, red sweater falling off her shoulder, cleavage, extremely sexy, long ginger hair, floating hair, small breasts, looking at the viewer, beach background, depth of field, dynamic angle, photo realistic:1.4, realistic skin:1.4, fashion photography, sharp, analog film grain, hyperdetailed:1.15 • Negative Prompt: ugly, old, fat, wide hips, curvy, topless, nude, naked, nsfw, • Seed:433365126, Sampler:DPM++ 2M, Schedule:Karras Steps:50, CFG Scale: 5.5, Resolution: 768x1344, aDetailer: Active
3.1.1. Demonstrating 2.5D Style (using 4n1v3rs3)
• Image Description, Negative Prompt, and generation parameters: identical to Section 3.1.0.
• Positive Prompt: identical to Section 3.1.0, with the leading style keyword 4n1m3rg3 replaced by 4n1v3rs3.
3.1.2. Demonstrating Anime Style (using 4n1t00n)
• Image Description, Negative Prompt, and generation parameters: identical to Section 3.1.0.
• Positive Prompt: identical to Section 3.1.0, with the leading style keyword 4n1m3rg3 replaced by 4n1t00n.
3.1.3. Demonstrating Western/American Comic Book Style (using 4n1m1cs)
• Image Description, Negative Prompt, and generation parameters: identical to Section 3.1.0.
• Positive Prompt: identical to Section 3.1.0, with the leading style keyword 4n1m3rg3 replaced by 4n1m1cs.
3.1.4. Demonstrating Normal Style (without any Tags)
• Image Description, Negative Prompt, and generation parameters: identical to Section 3.1.0.
• Positive Prompt: identical to Section 3.1.0, with the leading style keyword 4n1m3rg3 removed.
[Figure: style comparison grid. (3.1.0) 4n1m3rg3: Realistic; (3.1.1) 4n1v3rs3: 2.5D; (3.1.2) 4n1t00n: Anime Style; (3.1.3) 4n1m1cs: Western Comics; (3.1.4) no tags: 2.5D]
3.2 Sampler Variations: A Generation Showcase with Prompt Example
3.1.0. Demonstrating Realistic Style (using 4n1m3rg3) Example 3.1.3: Stylized Character Art • Image Description: A portrait features a young woman with long dark hair and bangs, wearing a gothic black dress with lacing details and a choker necklace. She has light-colored eyes and a subtle smile, with a blurred background showing a fireplace.. • Positive Prompt: 4n1m3rg3, Raw Photo, Portrait, girl, gothic, cute, seductive, innocent, light smile:0.3, plump lips, slender body, pale skin, long straight black hair, bangs, black goth dress, medieval theme, fire in the fireplace background, depth of field, dynamic angle, photo realistic:1.4, realistic skin:1.4, fashion photography, sharp, analog film grain, hyperdetailed:1.15 • Negative Prompt: ugly, old, fat, wide hips, curvy, topless, nude, naked, nsfw, • Seed: 445418580, Schedule:Karras Steps:50, CFG Scale: 5.5, Resolution: 768x1344, aDetailer: Active
[Figure: Sampler grid 01. DPM++ 2M, DPM++ 2M SDE, DPM++ 3M SDE, Euler a, Heun]
[Figure: Sampler grid 02. DDIM, UniPC, LCM, Euler Max, Kohaku_LoNyu_Y.]
4. Comprehensive Usage Guidelines and Parameter Optimization
To achieve optimal results with OmnIA©, users should consider the following guidelines for prompting and parameter tuning.
4.1.0. Crafting Effective Positive Prompts
• Specificity is Key: Provide clear and detailed descriptions of the desired subject, action, environment, artistic medium/style (even without keywords), lighting, color palette, and composition.
• Structure and Order: While OmnIA© is robust, subject-first prompting often yields good results. Consider a structure such as: Style Descriptors/Keywords, Subject, Action/Pose, Detailed Attributes/Clothing, Environment/Background, Artistic Medium Descriptors, Color and Lighting Descriptors, Compositional Elements.
• Leverage Style Keywords: Utilize the keywords detailed in Section 2.1 (e.g., 4n1m3rg3, 4n1t00n) for explicit style control and blending.
• Descriptive Adjectives: Employ a rich vocabulary of adjectives to define mood, texture, and quality (e.g., ethereal, gritty, polished, vibrant, muted, intricate, grand, serene).
• General Quality Enhancers (Optional): While OmnIA© does not require score tags, terms like masterpiece, best quality, highly detailed, intricate details, cinematic lighting, professional artwork, 8k, uhd, sharp focus, and depth of field can sometimes further refine outputs by reinforcing the model's understanding of the desired quality.
4.2.0. Utilizing Negative Prompts for Refinement
• Targeted Exclusion: Use negative prompts to eliminate undesired elements, styles, or artifacts.
• Standard Negative Sets: A common baseline for general use could include:
  • Low Quality / Artifacts: worst quality, low quality, normal quality, lowres, blurry, jpeg artifacts, compression artifacts, noise, grain, pixelated, watermark, text, signature, username, logo, banner.
  • Deformities / Anatomical Issues: deformed, disfigured, bad anatomy, malformed limbs, extra limbs, missing limbs, fused fingers, too many fingers, poorly drawn hands, poorly drawn face, mutated hands, distorted face, ugly, grotesque, extra digits, cloned face.
  • Undesired Content (Contextual): animal ears, hair ornaments, morbid, mutilated, NSFW (if aiming for SFW results).
  • Stylistic Exclusions: Use style keywords here, such as anime, cartoon, 3d render, photograph, painting, if trying to avoid those specific styles. For example, if aiming for a photograph and getting illustrative results, add illustration, painting, drawing to the negative prompt.
  • Compositional Issues: cropped, out of frame, duplicate, tiling, poor composition, cluttered, boring, monotonous.
• Iterative Refinement: Adjust negative prompts based on initial generation results. If hands are consistently poor, add more specific negative terms related to hands and fingers.
4.3.0. Optimizing Generation Parameters
• CFG Scale (Classifier-Free Guidance): Recommended Range: 2.0 to 9.0.
  • Lower values (2.0-5.0): More realistic results and more creative freedom, with potentially less prompt adherence; useful for abstract or experimental results.
  • Mid-range (5.0-7.0): Generally the best balance of prompt adherence and image quality for OmnIA©. Most users will find that this range provides excellent results.
  • Higher values (7.0-9.0): Stricter prompt adherence. Can be useful for precise outputs but may lead to over-saturation, overly sharp details, or minor artifacts if pushed too high without careful prompting.
• Sampling Steps: Recommended Range: 25 to 60 steps.
  • Baseline: 30-40 steps is a good starting point for most generations with common samplers like DPM++ 2M Karras.
  • Increased Detail: For highly complex prompts or finer detail, increasing to 40-50 steps may be beneficial. Some ancestral samplers like Euler a might achieve good results with fewer steps (20-30). Beyond 60 steps, returns often diminish significantly with most samplers.
• Resolution and Optimal Aspect Ratios:
  • Landscape/Portrait: 1344x768 pixels (or 768x1344).
  • Square: 1280x1280 pixels.
  Adherence to these resolutions ensures the model performs as trained, minimizing distortion, unintended cropping, or compositional issues.
• Sampler Selection: DPM++ 2M with the Karras or SGM Uniform schedule is highly recommended as a versatile default, offering an excellent balance of speed, detail, and stable convergence. It generally produces sharp, high-quality images. Euler Max can be more creative and produce more varied results even with the same seed. It is a good choice for exploration and artistic outputs, sometimes achieving good results with fewer steps (e.g., 30). DDIM and UniPC are faster samplers but may occasionally lack the fine detail or textural richness of Karras-suffixed DPM samplers, especially at lower step counts. Experimentation is advised, as different samplers can interact differently with specific prompt types or desired aesthetics.
• aDetailer Setting: The base aDetailer setting works excellently (40 steps or more). However, aDetailer has one drawback: it consistently tends to homogenize faces. If you want greater facial variety, consider disabling it.
  • Positive Prompt (P1): 4n1m3rg3, Realistic, cowboy shot, asian, man, suit, simply background, 8K
  • Negative Prompt: ugly, low res, blurry, fat, wide hips, curvy, bad anatomy, worse hands, worse fingers, animal ears
  • Seed: 851812959, Sampler: DPM++ 2M, Schedule: Karras, Steps: 50, CFG Scale: 7, Resolution: 768x1344
  • Positive Prompt (P2): Realistic, portrait, asian teen, cute, amazing kimono, simply background, 8K
  • Negative Prompt: ugly, low res, blurry, fat, wide hips, curvy, bad anatomy, worse hands, worse fingers, animal ears
  • Seed: 3463806084, Sampler: DPM++ 2M, Schedule: Karras, Steps: 50, CFG Scale: 7, Resolution: 768x1344
[Figure: aDetailer comparison. (P1) aDetailer on, (P1) aDetailer off, (P2) aDetailer on, (P2) aDetailer off]
• Hires.fix (High-Resolution Fix):
My preferred method for detail and upscaling involves aDetailer and a subsequent pass with 4x-UltraSharp, rather than Hires.fix. Nevertheless, for those who wish to use Hires.fix, the following settings are a general guideline.
  • When to use: Hires.fix helps achieve final high-quality outputs, especially for portraits, detailed scenes, or when upscaling is desired directly within the generation pipeline.
  • Upscaler: Latent (bicubic antialiased) is a good neutral starting point that preserves detail well. For more stylized or sharper results, R-ESRGAN 4x+ Anime6B (for anime/illustration) or 4x-UltraSharp can be effective.
  • Hires Steps: Often set to 10-20 steps, or approximately 0.5 times the initial sampling steps.
  • Denoising Strength: This is a crucial parameter, typically in the range of 0.3 to 0.6. Lower values (0.3-0.45) preserve the composition and details from the initial low-resolution generation more faithfully. Higher values (0.5-0.6) allow the upscaler to add more new detail and potentially alter the image more significantly. A starting point of 0.4-0.45 is often advisable.
• VAE (Variational Autoencoder): OmnIA© includes an integrated VAE that is equivalent to the standard VAE. Users do not typically need to select or load an external VAE for optimal performance.
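The recommended ranges in Section 4.3.0 can be collected into a small settings check. The sketch below is illustrative only: the GenerationSettings class, the check function, and the hires_steps helper are hypothetical names, the defaults are simply values taken from the examples in this document, and combining the two Hires Steps rules (half the sampling steps, kept within 10-20) is one possible reading of the guideline rather than a stated rule.

```python
from dataclasses import dataclass
from typing import List

# Resolutions recommended in Section 4.3.0.
RECOMMENDED_RESOLUTIONS = {(1344, 768), (768, 1344), (1280, 1280)}


@dataclass(frozen=True)
class GenerationSettings:
    cfg_scale: float = 5.5
    steps: int = 35
    width: int = 768
    height: int = 1344
    hires_denoise: float = 0.4


def check(settings: GenerationSettings) -> List[str]:
    """Return one warning per setting outside the recommended ranges."""
    warnings = []
    if not 2.0 <= settings.cfg_scale <= 9.0:
        warnings.append("CFG scale outside 2.0-9.0")
    if not 25 <= settings.steps <= 60:
        warnings.append("sampling steps outside 25-60")
    if (settings.width, settings.height) not in RECOMMENDED_RESOLUTIONS:
        warnings.append("resolution is not 1344x768, 768x1344, or 1280x1280")
    if not 0.3 <= settings.hires_denoise <= 0.6:
        warnings.append("Hires.fix denoising strength outside 0.3-0.6")
    return warnings


def hires_steps(sampling_steps: int) -> int:
    """About half the sampling steps, kept within the 10-20 guideline."""
    return max(10, min(20, sampling_steps // 2))


print(check(GenerationSettings()))                       # no warnings
print(check(GenerationSettings(cfg_scale=12, steps=10)))
print(hires_steps(30))
```

The check only reports warnings and never blocks a run, because the document encourages experimentation; the super fast configuration in Section 4.7, for instance, deliberately uses fewer steps than the general range.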
4.4.0. LoRA and Textual Inversion Compatibility
General Compatibility: Compatibility with LoRAs and Textual Inversions is currently undergoing testing.
Initial Test Observations: Preliminary tests suggest that LoRAs specifically created for illustriousXL exhibit better (but probably not full) compatibility with OmnIA© compared to LoRAs designed for Pony-based models.
Recommended Usage (Based on Early Findings): While further testing is ongoing, users are encouraged to experiment. LoRAs observed to be more compatible (e.g., those from illustriousXL) typically integrate seamlessly; adjust weights (commonly 0.6-0.8) as needed. With other style LoRAs, it is advisable to start with lower weights (e.g., 0.4-0.7) and adjust, particularly if they conflict strongly with OmnIA©'s inherent style or explicit style keywords.
Observed Synergies/Conflicts: Detail-enhancing LoRAs can be particularly effective with OmnIA©. As noted, compatibility seems stronger with LoRAs developed for illustriousXL. Some older, heavily stylized LoRAs trained on vastly different model bases, or those not specifically optimized for OmnIA© or its observed compatible styles, may not blend as seamlessly or might require significant weight adjustment and careful prompting.
Testing: Users are strongly encouraged to experiment with LoRA weights and combinations to find optimal blends and effects for their desired output.
4.4.0. How to Prompt
• EXAMPLE OF GENERAL PROMPT: It is recommended, but not mandatory, to follow this type of prompt construction. It is also suggested to start with a simple prompt and then gradually add details.
POSITIVE PROMPT: [TAG], [STYLE], [TYPE OF SHOT], [SUBJECT], [DESCRIPTION], [BACKGROUND], [MORE DETAILS]
• EXAMPLE: 4n1m3rg3, realistic, portrait, asian teen, cute, amazing kimono, simply background, silver ornaments, 8K
BASE NEGATIVE PROMPT: low res, blurry, bad anatomy, worse hands, worse fingers, animal ears
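The general prompt construction above can be sketched as a helper that joins the slots in the recommended order and skips any slot left empty, which also supports the advice to start simple and add detail gradually. The function name build_prompt, the slot names, and the example values are hypothetical and only illustrate the template.

```python
# Slot order follows the general prompt template in this section.
TEMPLATE_ORDER = (
    "tag", "style", "shot", "subject", "description", "background", "details",
)


def build_prompt(**parts: str) -> str:
    """Join the provided slots in template order, skipping empty ones."""
    unknown = set(parts) - set(TEMPLATE_ORDER)
    if unknown:
        raise ValueError(f"unknown prompt slots: {sorted(unknown)}")
    ordered = (parts.get(name, "").strip() for name in TEMPLATE_ORDER)
    return ", ".join(text for text in ordered if text)


# Start with a simple prompt...
simple = build_prompt(tag="4n1m3rg3", shot="portrait", subject="man")
# ...then add details step by step while keeping the same order.
detailed = build_prompt(
    tag="4n1m3rg3",
    style="realistic",
    shot="portrait",
    subject="man",
    description="suit",
    background="simply background",
    details="8K",
)
print(simple)
print(detailed)
```

Because the slot order is fixed in one place, keyword arguments can be supplied in any order and the resulting prompt still follows the template.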
• EXAMPLE OF CASUAL POSITIVE PROMPT:
• EXAMPLE: 4n1m3rg3:1.0, bald man, african man, black skin, eyeglasses, tourist, in white linen shirt, beige shorts, selfie, sitting in boat, city background,
• NEGATIVE PROMPT: ugly, low res, blurry, fat, wide hips, curvy, bad anatomy, worse hands, worse fingers, animal ears, nsfw, ornaments, hair, white skin, cars,
• Seed: 1957828419, Sampler: DPM++ 2M, Schedule: Karras, Steps: 25, CFG Scale: 7, Resolution: 768x1344
[Figure: casual prompt example with aDetailer off and aDetailer on]
4.5 Overtraining with 4n1m3rg3
Despite significant efforts during OmnIA©'s training phase to prevent excessive specialization and maintain intrinsic stylistic versatility, a phenomenon akin to overtraining has been observed with the 4n1m3rg3 keyword.
4.5.1. How to Mitigate the Overtraining
A simple and effective way to mitigate the overtraining associated with 4n1m3rg3 is to assign a weight (strength) to the 4n1m3rg3 tag in the prompt and progressively reduce it.
Practical Example of Mitigation: To reduce the effect of overtraining, gradually decrease the tag's weight:
• Start from a standard usage like 4n1m3rg3 or 4n1m3rg3:1.0 (implicit or explicit weight).
• Move to 4n1m3rg3:0.9, 4n1m3rg3:0.8, and so on, down to very low values such as 4n1m3rg3:0.1 or even 4n1m3rg3:0.0000000001 (adding one decimal place at a time for finer control).
This granular approach allows users to find the "sweet spot" for their specific creative needs, keeping the benefits of the realistic style while mitigating the overtraining.
• Core Prompt: 4n1m3rg3:[X.X], fashion photography, portrait, egyptian teen, egyptian queen, black hair, bob cut, dark tanned skin, perfect make-up, jewelry, golden headdress, captured in a walls adorned with hieroglyphics, 8K
• Negative Prompt: ugly, low res, blurry, fat, wide hips, curvy, bad anatomy, worse hands, worse fingers, animal ears,
• Seed: 2158669849, Sampler: DPM++ 2M SDE, Schedule: Karras, Steps: 35, CFG Scale: 3.5, Resolution: 768x1344, aDetailer: Active
[Figure: 4n1m3rg3 weight grid. Row 1: 0.1, 0.01, 0.001, 0.0001, 0.00001. Row 2: 0.000001, 0.0000001, 0.00000001, 0.000000001, 0.0000000001]
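The weight reduction described in Section 4.5.1 amounts to two sweeps: a coarse one from 1.0 down to 0.1, and a fine one that adds one decimal place at a time. A minimal sketch follows, with hypothetical function names (coarse_sweep, fine_sweep). The fine weights are built as text rather than as floating-point numbers so that they appear in the prompt exactly as written in this document instead of in scientific notation.

```python
from typing import List


def coarse_sweep(tag: str = "4n1m3rg3") -> List[str]:
    """Weights 1.0, 0.9, ..., 0.1 in keyword:weight notation."""
    return [f"{tag}:{tenths / 10:.1f}" for tenths in range(10, 0, -1)]


def fine_sweep(tag: str = "4n1m3rg3", max_places: int = 10) -> List[str]:
    """Weights 0.1, 0.01, ... adding one decimal place at a time."""
    return [
        f"{tag}:0." + "0" * (places - 1) + "1"
        for places in range(1, max_places + 1)
    ]


for weighted_tag in coarse_sweep() + fine_sweep()[1:]:
    # Each entry replaces the 4n1m3rg3:[X.X] placeholder of the core prompt.
    print(weighted_tag)
```

Generating one image per entry with a fixed seed makes it easier to compare the results side by side and pick the weight that suits a given prompt.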
4.5.2. Further Solution (Method 2: Dual Prompt Weighting for Realism Overtraining)
An additional solution specifically addresses overtraining regarding realism when using the 4n1m3rg3 tag. Unfortunately, this overtraining was noticed too late in the development cycle. Fortunately, it can be mitigated and limited by weighting the 4n1m3rg3 tag in both the positive and the negative prompt:
• Use a fixed value in the positive prompt for the tag 4n1m3rg3, such as 4n1m3rg3:0.10. This ensures the style is still present but not overly dominant.
• Simultaneously use a fixed value in the negative prompt, also for the tag 4n1m3rg3, such as 4n1m3rg3:0.01 (even a larger value like 4n1m3rg3:0.35 works to mitigate the problem). This actively pushes against the strong anime influence, helping to pull the image towards realism.
(It is strongly recommended to use at least 35 steps for the DPM++ 2M SDE and DPM++ 3M SDE samplers for optimal results with this method.)
Example Prompt demonstrating Method 2:
• Core Prompt: 4n1m3rg3:1.0, realistic, photo, selfie, asian female, eyeglasses, tourist, in casual clothing, sitting in boat, city background,
• Negative Prompt: 4n1m3rg3:0.35, ugly, low res, blurry, fat, wide hips, curvy, bad anatomy, worse hands, worse fingers, animal ears, nsfw, ornaments, hair, caucasian, cars,
• Seed: 580508403, Sampler: DPM++ 2M SDE, Schedule: Karras, Steps: 35, CFG Scale: 7, Resolution: 768x1344, aDetailer: Active
[Figure: Method 2 comparison, with 4n1m3rg3:0.35 in the negative prompt and without it]
An additional effective solution for mitigating overtraining, particularly concerning specific stylistic elements, is detailed in Section 4.6, "Oh no! Another Model with Identical Faces?".
4.6 Oh no! Another Model with Identical Faces?
We know what you're thinking: "Another model that always generates the same type of face? Not again!" Don't worry, we have a very easy solution to avoid this problem with OmnIA©. The trick is not to call any specific tags that influence the face shape. Instead of focusing on detailed face descriptors, let OmnIA© interpret the general context of your prompt.
Examples and Prompts Used: realistic photo, cowboy shot, woman, simply background,
Generation Recommendations:
• CFG Scale: We strongly recommend setting the CFG Scale to 7. This value offers a good balance between prompt adherence and the model's creative freedom.
• Negative Prompt: anime, cartoon, illustration, cgi, painting, veil, mole, painted face, animal ears, tiara, hair ornaments, forest,
We suggest keeping a base negative prompt similar to this one, as it helps avoid unwanted artifacts. (You might adapt this negative prompt based on the specific style you wish to avoid.)
With these simple adjustments, you can leverage OmnIA©'s versatility to generate a wide range of unique and interesting faces, without falling into the monotony of a single stylistic approach.
4.6.1. Demonstrating "NAOFM" (Not Another One Face Model)
Example 4.6.2: Woman
• Image Description: realistic photo, cowboy shot, woman, simply background,
• Negative Prompt: anime, cartoon, illustration, cgi, painting, veil, mole, painted face, animal ears, tiara, hair ornaments, forest,
• Seed: 336164113, Sampler: DPM++ 2M SDE, Schedule: Karras, Steps: 50, CFG Scale: 7, Resolution: 768x1344, aDetailer: On/Off
[Figure: Example 4.6.2 image grids, with aDetailer on and with aDetailer off]
Example 4.6.3: Woman with negation style - (explanation: 6.3.7 paragraph) • Image Description: hyper realistic photo, cowboy shot, captivate woman, realistic skin, simply background, 8k • Negative Prompt: anime, cartoon, illustration, cg, painting, ugly, joker smile, 4n1t00n:0.9, 4nm1cs:0.9, veil, mole, painted face, animal ears, tiara, hair ornaments, forest, • Seed:1881410303 - Sampler:DPM++ 2M SDE - Schedule:Karras - Steps:50, CFG Scale:5.5, Resolution: 768x1344, aDetailer: On/Off
[Figure: Example 4.6.3 image grids, with aDetailer on and with aDetailer off]
Example 4.6.4: Italian man - B&W - negation style - (explanation: 6.3.7 paragraph) • Image Description: realistic photo, (greyscale), gorgeous italian man, gentle face, hands in pockets, short hair, suit, necktie, city background, 8k • Negative Prompt: anime, cartoon, illustration, cg, painting, 4n1m1cs:0.01, 4n1t00n:0.01, ugly, joker smile, bow tie, girl, • Seed:2454184337 - Sampler:DPM++ 2M SDE - Schedule:Karras - Steps:50, CFG Scale:5, Resolution: 768x1344, aDetailer: On/Off
[Figure: Example 4.6.4 image grids, with aDetailer on and with aDetailer off]
4.7 Super Fast Generation
One of the key performance highlights of OmnIA© is its efficiency in image generation. Through optimized model architecture and careful parameter tuning during training, OmnIA© can produce high-quality images with significantly fewer sampling steps than many contemporary generative models.
With the right configuration, OmnIA© can generate compelling images with as few as 10 sampling steps. This dramatically reduces generation time, making OmnIA© particularly efficient for rapid prototyping, interactive creative workflows, and applications where speed is a critical factor.
This rapid generation keeps excellent control over the output thanks to a finely tuned Classifier-Free Guidance (CFG) scale. Optimal performance for these expedited generations is typically observed when the CFG scale is set between 2 and 3. This capability has been thoroughly tested and validated specifically with the DPM Adaptive or DPM++ 2M sampler with the Karras scheduler.
This narrow CFG range, coupled with the specified samplers, indicates OmnIA©'s strong inherent prompt adherence: guidance remains effective even at low guidance strength, which is crucial for faster inference.
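The super fast configuration described above can be written down as a small preset. This is a sketch under stated assumptions: the function name fast_settings and the dictionary keys are hypothetical, the 768x1344 resolution is taken from the examples in this document, and applying the Karras schedule to both samplers is one reading of the text.

```python
from typing import Dict, Union

# Samplers the document reports as validated for 10-step generation.
FAST_SAMPLERS = ("DPM++ 2M", "DPM adaptive")


def fast_settings(cfg_scale: float = 2.0,
                  sampler: str = "DPM++ 2M") -> Dict[str, Union[str, int, float]]:
    """Return a 10-step preset, rejecting values outside the tested range."""
    if not 2.0 <= cfg_scale <= 3.0:
        raise ValueError("super fast generation was tested with CFG 2-3")
    if sampler not in FAST_SAMPLERS:
        raise ValueError(f"sampler not in the validated list: {sampler!r}")
    return {
        "steps": 10,
        "cfg_scale": cfg_scale,
        "sampler": sampler,
        "schedule": "Karras",
        "width": 768,
        "height": 1344,
    }


def step_reduction(fast_steps: int = 10, baseline_steps: int = 30) -> float:
    """Fraction of sampler iterations saved relative to a baseline."""
    return 1 - fast_steps / baseline_steps


print(fast_settings())
print(f"{step_reduction():.0%} fewer sampling steps than a 30-step baseline")
```

The reduction figure counts sampler iterations only, using the 30-40 step baseline from Section 4.3.0; actual wall-clock savings depend on hardware and on any post-processing such as aDetailer.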
4.7.1. Demonstrating Superfast Generation (using all tags) Example 4.7.2: Stylized Character Art • Image Description: [tag], upper body, cute succubus girl, (red skin), (glossy red skin), seductive, innocent, gothic, large demon red wings, long green hair, abstract art, half demon, yellow iris, cat eyes, demon horns, moonlight passing through hair, full moon background, 8K • Negative Prompt: low res, blurry, bad anatomy, worse hands, worse fingers, animal ears, • Seed:463726944 - Sampler:DPM++ 2M - Schedule:Karras - Steps:10, CFG Scale:2, Resolution: 768x1344, aDetailer: Active
[Image grid: SUPER FAST GENERATION, DPM++ 2M. Panels: 4n1m3rg3, 4n1v3rs3, 4n1t00n, 4n1m1cs, No Tags.]
Other samplers have also been used successfully for super fast generation, although some may require a higher number of steps to render hands and feet well. For more details, please refer to the sampler comparison that follows.
4.7.3. Demonstrating Superfast Generation (using some other samplers)
Example 4.7.4: Nagatoro Hayase® – Semi Realistic • Image Description: 4n1m3rg3:0.001, Full Body Shot, Nagatoro Hayase, dark skin, in Japanese school uniform, opaque stocking, white shirt, long skirt above the knees, blue skirt, classroom background, 8K • Negative Prompt: ugly, low res, blurry, fat, wide hips, curvy, bad anatomy, worse hands, worse fingers, animal ears • Seed: 3914081923 - Sampler:DPM++ 2M - Schedule:Various - Steps:10, CFG Scale:2, Resolution: 768x1344, aDetailer: On/Off
[Image grid: the same prompt rendered with each sampler, once with aDetailer on and once with aDetailer off.
Row 1: DPM++ 2M, DPM++ SDE, DPM++ 2M SDE, DPM++ 2M SDE Heun, DPM++ 2S a, DPM++ 3M SDE, Euler a
Row 2: Euler, LMS, Heun, DPM2, DPM2 a, DPM adaptive, Restart
Row 3: DDIM, UniPC, LCM, Euler Smea Dy, Euler Smea, Kohaku LoNyu Yog
Original grids: aDetailer On, aDetailer Off.]
5. Model Architecture and In-Depth Training Methodology
This section details the technical underpinnings of the OmnIA© model.
5.1.0. Base Architecture OmnIA© is built upon a customized version of the UNet. Specific modifications were made to optimize for multi-style learning and high-resolution output, potentially involving adjustments to attention mechanisms, intermediate block structures, or embedding projections to better handle the diverse stylistic information in the training data.
5.2.0. Dataset Curation and Composition A meticulously curated dataset of 10,500 high-resolution images (average 2K resolution, approx. 2048x2048 pixels) formed the training basis.
5.2.1. Data Sources:
Synthetic Data: A significant portion comprised synthetically generated images. This involved a combination of procedural generation techniques, AI-assisted artistic compositions, and potentially pre-rendered 3D assets with specific stylistic targets, crucial for targeted style learning and diverse concept coverage where high-quality open-source data was scarce.
Open-Source Images: Carefully selected high-quality, public domain, or appropriately licensed images from platforms like Pexels were included to enhance realism, naturalness, and general diversity. Selection criteria emphasized high resolution, clear subject matter, good composition, varied and natural lighting conditions, and distinct stylistic attributes.
Stylistic Distribution: The dataset was strategically balanced to ensure adequate representation across the target styles (realistic, semi-realistic, anime, western comics, 2.5D, etc.), allowing the model to learn distinct features for each.
5.3.0. Preprocessing and Data Augmentation
Filtering: Rigorous filtering, both automated and manual, was applied to remove noisy, low-quality, blurry, watermarked, or stylistically inconsistent images.
Tagging and Captioning: A detailed and consistent tagging/captioning strategy was employed. Images were captioned with comprehensive descriptive tags, including subject, action, environment, and explicit style indicators corresponding to OmnIA©'s keywords (e.g., adding "4n1t00n" tag to anime images). A hybrid approach combining automated tagging tools (e.g., a modified WD1.4 Tagger or similar for initial object/character recognition) followed by extensive manual review, correction, and refinement was employed to ensure accuracy and consistency of labels.
Tag Shuffling: Implemented during training to improve model robustness and prevent overfitting to specific tag orders or co-occurrences.
Controlled-Scale Data Augmentation: Standard augmentation techniques such as minor rotations, horizontal flips, slight color jitter, and controlled random cropping (while maintaining subject integrity) were applied to increase dataset variance and improve generalization without corrupting core stylistic features.
5.4.0. Training Configuration and Strategy
Optimizer: AdaFactor was chosen for its memory efficiency with large models and its effectiveness in stabilizing training.
Batch Size: A batch size of 2 was used, primarily constrained by VRAM limitations when training with high-resolution 2K images. Gradient accumulation was used to simulate larger effective batch sizes.
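To illustrate why gradient accumulation can stand in for a larger batch, here is a minimal sketch in plain Python. It is not the OmnIA© training code: the toy one-parameter model, the data, and the value of `ACCUMULATION_STEPS` are assumptions chosen for illustration (the document states only the micro-batch size of 2, not the number of accumulation steps). It shows that averaging the gradients of several equal-sized micro-batches gives the same gradient as one large batch.

```python
import math

MICRO_BATCH_SIZE = 2      # batch size stated in section 5.4.0
ACCUMULATION_STEPS = 4    # hypothetical; the document does not give this value


def mse_grad(w, batch):
    """Gradient of the mean squared error of y ~ w * x over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, micro_batch_size):
    """Average the gradients of consecutive equal-sized micro-batches."""
    micro_batches = [
        data[i:i + micro_batch_size]
        for i in range(0, len(data), micro_batch_size)
    ]
    total = 0.0
    for micro_batch in micro_batches:
        # Scale each contribution so the sum is the mean over micro-batches.
        total += mse_grad(w, micro_batch) / len(micro_batches)
    return total


# Toy data: 8 samples, i.e. one effective batch of 2 x 4.
data = [(float(i), 3.0 * i + 1.0) for i in range(1, MICRO_BATCH_SIZE * ACCUMULATION_STEPS + 1)]
w = 0.5

effective_batch_size = MICRO_BATCH_SIZE * ACCUMULATION_STEPS
full_batch_gradient = mse_grad(w, data)
accumulated_gradient = accumulated_grad(w, data, MICRO_BATCH_SIZE)
gradients_match = math.isclose(full_batch_gradient, accumulated_gradient, rel_tol=1e-9)
```

Only one micro-batch needs to be held in memory at a time, which is the point when VRAM limits the batch size at 2K resolution.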
Learning Rate Schedule: A Progressive Learning Rate Schedule was employed across 6 distinct phases, totaling 600 epochs:
• Phase 1: Learning Rate 5e-6 for 50 epochs
• Phase 2: Learning Rate 2.5e-6 for 100 epochs
• Phase 3: Learning Rate 1e-6 for 150 epochs
• Phase 4: Learning Rate 5e-7 for 100 epochs
• Phase 5: Learning Rate 1e-7 for 50 epochs
• Phase 6: Learning Rate 1e-8 for 150 epochs
Rationale: This staged reduction allows for initial rapid learning of broad features, followed by progressively finer adjustments and tuning of details, promoting stable convergence and preventing premature overfitting or catastrophic forgetting.
Total Steps: 12,600,000
Key Training Techniques:
AdaFactor Adaptive Optimizer: As mentioned.
Tag Conditioning: Direct and strong use of style tags and detailed descriptive captions to guide the learning process towards the desired stylistic and semantic outputs.
Accumulation Steps for Bias Mitigation: Gradient accumulation was used not only to effectively increase batch size but also strategically employed with potential corrective weighting or focused sampling for underrepresented classes or styles to ensure more balanced learning across the diverse dataset.
Clip Grad Norm: Utilized to prevent exploding gradients, ensuring training stability.
MSE Strength (Mean Squared Error): While cross-entropy loss on token prediction is standard, attention was paid to ensuring sufficient Mean Squared Error (or similar pixel-level loss) contribution from the VAE's reconstruction component (if fine-tuning VAE simultaneously) or through other means to maintain pixel-level fidelity and texture accuracy.
6. Comparative Analysis: OmnIA© vs. Pony XL and Illustrious XL(2)
This section provides a schematic, descriptive comparative analysis of OmnIA© v1.0 against Pony XL and IllustriousXL, a contemporary high-quality illustrative/anime model. The aim is to highlight OmnIA©'s distinct characteristics and improvements across its primary supported styles.
For each comparison, the same conceptual seed, CFG scale, sampler, and core prompt elements are assumed to ensure a fair assessment, focusing on stylistic rendering and adherence.
6.1.0. Methodology for Comparison OmnIA© v1.0 -> Pony XL (tested version)-> IllustriousXL (tested version) Core Parameters (Assumed Consistent): • Seed: A consistent conceptual seed for each prompt set. • CFG Scale: [Variable] • Sampler: [Variable] • Steps: 50 • Resolution: 768x1344 (unless aspect ratio is key to style, then adjusted appropriately, e.g., 1280x1280 for a portrait) • Assessment Focus: Stylistic accuracy to the target style keyword/description, prompt adherence, overall image quality, and unique characteristics.
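The "hold everything constant except the model" idea in this methodology can be made explicit in a small data structure. The following is a minimal sketch in plain Python; `GenerationJob`, `comparison_jobs`, and the field names are hypothetical and do not correspond to any particular UI or API. The numeric values are those listed for the comparison in 6.3.1, and the prompt strings are placeholders rather than the real prompts.

```python
from dataclasses import dataclass, replace
from typing import List

# Model labels as used in this document.
MODELS = ("OmnIA© v1.0", "Pony XL", "IllustriousXL")


@dataclass(frozen=True)
class GenerationJob:
    """One image request; all names here are illustrative."""
    model: str
    prompt: str
    negative_prompt: str
    seed: int
    sampler: str
    schedule: str
    cfg_scale: float
    steps: int = 50
    width: int = 768
    height: int = 1344


def comparison_jobs(base: GenerationJob) -> List[GenerationJob]:
    """Create one job per model, changing nothing but the model name."""
    return [replace(base, model=name) for name in MODELS]


base_job = GenerationJob(
    model=MODELS[0],
    prompt="<core prompt>",            # placeholder, not a real prompt
    negative_prompt="<negative prompt>",
    seed=404785214,
    sampler="DPM++ 2M",
    schedule="Karras",
    cfg_scale=5.5,
)
jobs = comparison_jobs(base_job)
```

Because the dataclass is frozen, a job cannot be altered after creation, so any difference between the three outputs can be attributed to the model rather than to a drifting setting.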
6.2.0. Qualitative Assessment Criteria • Stylistic Fidelity: How accurately the model renders the intended artistic style (e.g., realism, anime, 2.5D).
• Prompt Element Adherence: How well the model incorporates specific subjects, actions, and details from the prompt.
• Image Coherence and Aesthetics: Overall visual appeal, absence of artifacts, anatomical/structural correctness.
• Keyword Responsiveness (for OmnIA©): How effectively OmnIA©'s specific style keywords influence the output compared to general prompting on other models.
6.3.0. Style-by-Style Comparative Descriptions
6.3.1. Style: Realistic/Semi-Realistic (OmnIA© Keyword: “4n1m3rg3”)
Core Prompt: 4n1m3rg3., Realistic, Cowboy Shot, mecha (black robot), black glass full helmet, not visible face, red cape, floating cape, yellow glowing eyes, science fiction, torn clothes, glowing, mechanical joints, black|yellow metal color, intense sunlight, outdoors, landscape, cinematic lighting, amazing quality, wallpaper, depth of field, dynamic angle, sharp, 8K
Negative Prompt: ugly, low res, blurry, fat, wide hips, curvy, bad anatomy, worse hands, worse fingers, animal ears,, gundam, face, nose, mouth, human, man, guy, boy, female, woman, girl
Seed:404785214 - Sampler: DPM++ 2M - Schedule:Karras - Steps:50 - CFG Scale: 5.5 - Resolution: 768x1344, aDetailer: Inactive
[Comparison grid, REALISM: OmnIA©, Pony XL, IllustriousXL. Original grid available for download.]
• OmnIA© with “4n1m3rg3” (Described Output): Omnia v1.0, leveraging the "4n1m3rg3" and "Realistic" keywords, produces a highly detailed and substantial mecha. The rendering emphasizes intricate mechanical joints, metallic textures, and realistic lighting, creating a powerful and grounded presence. The red cape appears to have a more tangible fabric quality, and the overall image achieves a sharp, cinematic quality with excellent depth of field in the outdoor landscape.
• Pony XL (Described Output): ponyDiffusionV6XL generates a stylized yet imposing mecha, focusing on strong silhouettes and vibrant glowing effects. While maintaining the core elements of the black robot with a red cape and yellow glowing eyes, its interpretation leans towards a more illustrative or artistic rendering rather than pure photorealism.
• IllustriousXL (Described Output): IllustriousXL offers a highly stylized and dynamic interpretation of the mecha. Its strengths in expressive and bold visuals are evident, presenting a more abstract or conceptual take on the robot. The design is distinct, possibly with exaggerated features or a unique silhouette that highlights its artistic style.
6.3.2. Style: 2.5D Illustration (OmnIA© Keyword: “4n1v3rs3”)
Core Prompt: 4n1v3rs3, spacepunk knightgirl with a stern look, holding a weapon, jumping, character design inspired by Yoji Shinkawa and Tsutomu Nihei, black|yellow, battle-worn armored and elegant, rubble background, depth of field, dynamic angle, fashion photography, sharp, hyperdetailed:1.15
Negative Prompt: ugly, old, fat, wide hips, curvy, topless, nude, naked, nsfw,
Seed: 2166983790 - Sampler:DPM++ 2M - Schedule:Karras - Steps:50 - CFG Scale: 5.5 - Resolution: 768x1344 - aDetailer: Active
[Comparison grid, 2.5D: OmnIA©, Pony XL, IllustriousXL. Original grid available for download.]
• OmnIA© with “4n1v3rs3” (Described Output): The "4n1v3rs3" keyword in OmnIA© guides it to create an image with a strong sense of depth and form, characteristic of 2.5D illustration. Expect well-defined character models, intricate armor details, and a dynamic pose, all rendered with a polished, digitally painted aesthetic. The vibrant colors are true to the "fantasy colors" or anime aesthetic, resulting in a professional-looking concept art piece for an armored female warrior.
• Pony XL (Described Output): This model produces an anime-style illustration of the armored figure. While capable of generating beautiful fantasy characters, the specific "2.5D" feel – the balance between illustrative flatness and perceived depth – might be less pronounced or require more explicit prompting to achieve compared to OmnIA©'s targeted keyword. The character's design and coloring are strong, but the sense of dimensionality might be softer or more traditionally illustrative.
• IllustriousXL (Described Output): This is a strong area for IllustriousXL. It would likely produce a vibrant fantasy illustration. The comparison would focus on the specific nuances of the "2.5D" aesthetic – OmnIA©'s “4n1v3rs3” might offer a particular take on depth and rendering that differs subtly from IllustriousXL's default illustrative style.
6.3.3. Style: Anime (OmnIA© Keyword: “4n1t00n”)
Core Prompt: 4nit00n, Cowboy Shot, Nagatoro Hayase, dark skin, light smile:0.3, in Japanese school uniform, white shirt, long skirt above the knees, blue skirt, classroom background, depth of field, dynamic angle, fashion photography, sharp, hyperdetailed:1.15
Negative Prompt: CG, 3D, realistic, ugly, low res, blurry, fat, wide hips, curvy, topless, bad anatomy, worse hands, worse fingers, old, nude, naked, nsfw,
Seed: 775240691 - Sampler:Euler Max - Schedule:Karras - Steps:50 - CFG Scale: 5.5 - Resolution: 768x1344 - aDetailer: Active
[Comparison grid, ANIME STYLE: OmnIA©, Pony XL, IllustriousXL. Original grid available for download.]
• OmnIA© with “4n1t00n” (Described Output): With the "4n1t00n" keyword, OmnIA© v1.0 produces a highly polished and detailed rendition of the character in the specified uniform. The model excels at rendering nuanced expressions and textures, giving a sophisticated feel to the anime style. The classroom background shows good depth of field, contributing to a professional "fashion photography" aesthetic within the anime context, making the character appear grounded and detailed.
• Pony XL (Described Output): This model delivers a clean and recognizable interpretation of the character in the school uniform. While the style is clearly anime, it leans towards a more classic or simplified look. The background might be less intricate or have less pronounced depth of field, and the overall image might feel more like a direct, stylized anime cel than a "fashion photography" inspired render.
• IllustriousXL (Described Output): IllustriousXL generates a vibrant and distinct anime interpretation. Its strength lies in its strong stylistic tendencies. The depiction of the character, while adhering to the prompt, might showcase more exaggerated or stylized elements inherent to IllustriousXL's default rendering. The "fashion photography" and "depth of field" aspects might be present but filtered through its unique, often more dynamic and less purely realistic, anime lens.
6.3.4. Style: Western Comics (OmnIA© Keyword: “4n1m1cs” and “Western Comics”)
Core Prompt: 4n1m1cs, Western Comics, cowboy Shot, Felicia Hardy, cute, seductive, innocent, light smile:0.3, plump lips, slender body, long luminous neon white hair, wavy hair, in high detailed Black Cat suit, tight black suit with white accents, gloves of her costume form retractable claws at the fingertips, Sitting on the armchair, sipping a cup of coffee, living room of her loft, depth of field, dynamic angle, fashion photography, sharp, hyperdetailed:1.15
Negative Prompt: CG, 3D, realistic, ugly, low res, blurry, fat, wide hips, curvy, topless, bad anatomy, worse hands, worse fingers, old, nude, naked, nsfw,
Seed: 4108034777 - Sampler:Euler Max - Schedule:Karras - Steps:50 - CFG Scale: 5.5 - Resolution: 768x1344 - aDetailer: Active
[Comparison grid, WESTERN COMICS: OmnIA©, Pony XL, IllustriousXL. Original grid available for download.]
• OmnIA© with “4n1m1cs” and “Western Comics” (Described Output): Leveraging the "4n1m1cs" keyword and "Western Comics" style, Omnia v1.0 produces a highly detailed and polished interpretation of Black Cat. The rendering exhibits intricate details on the suit, expressive facial features, and a clear sense of depth and lighting, aligning with a "fashion photography" aesthetic for a comic book character. The overall image presents a refined and modern comic art style, emphasizing sharpness and hyperdetail.
• Pony XL (Described Output): ponyDiffusionV6XL delivers a stylized yet recognizable version of Black Cat, leaning towards a more classic or exaggerated Western comic book art style. While it captures the character's essence and costume, the level of detail might be lower and the overall aesthetic might appear more illustrative or flat.
• IllustriousXL (Described Output): IllustriousXL offers a highly stylized and distinctive interpretation of Black Cat. Its output often features bold lines, dramatic posing, and a strong sense of artistic flair, which can result in a unique take on anime comics. While adhering to the core prompt, the rendering might abstract certain details or lean into a more graphic novel-like appearance. 6.3.5. Style: Animation-Inspired (e.g., using “Pixar” in prompt with OmnIA©) Core Prompt: Pixar, Full Body Shot, girl, big eyes, cute, light smile:0.3, plump lips, slender body, coloured hair, small tits, flirting, high detailed dress, barefoot, sitting on the grass, Picnic atmosfere, public park, sunset, depth of field, Negative Prompt: realistic, ugly, low res, blurry, fat, wide hips, curvy, topless, bad anatomy, worse hands, worse fingers, old, nude, naked, nsfw, Seed: 525329959 - Sampler:Euler a - Schedule:Automatic - Steps:50 - CFG Scale: 5.5 - Resolution: 768x1344 - aDetailer: Active
[Comparison grid, PIXAR STYLE: OmnIA©, Pony XL, IllustriousXL. Original grid available for download.]
• OmnIA© with “pixar” (Described Output): Omnia delivers a character with a highly polished and detailed aesthetic, closely resembling the modern Pixar style. The rendering focuses on intricate details of the dress and the character's features, with a strong sense of depth in the public park background during sunset. The image appears professional and refined, aiming for a high-fidelity animated look with precise control over details and lighting.
• Pony XL (Described Output): ponyDiffusionV6XL produces a charming rendition that captures the essence of a cute, big-eyed character in a simplified yet effective cartoon style. While it aims for the "Pixar" feel, it leans towards a more classic or slightly more abstracted animation aesthetic. The background is simpler, and the overall image might lack the very fine details and realistic lighting, offering a more generalized animated appearance.
• IllustriousXL (Described Output): IllustriousXL creates a distinct and stylized animated character. It excels at vibrant colors and dynamic forms, often resulting in a unique take on the animation style. While it incorporates the "Pixar" prompt, its output might present a more exaggerated or artistic interpretation.
6.3.6. Style: Animation-Inspired (e.g., using “Disney” in prompt with OmnIA©) Core Prompt: Disney, realistic, Portrait, Rapunzel, cute, seductive, innocent, light smile:0.3, plump lips, slender body, long blonde hair, green eyes, cinematic light, purple dress, happy, medieval castle background, depth of field, dynamic angle, fashion photography, sharp, hyperdetailed:1.15 Negative Prompt: ugly, old, fat, wide hips, curvy, topless, nude, naked, nsfw,
Seed:3923521271 - Sampler:DPM++ 2M SDE - Schedule:Karras - Steps:50 - CFG Scale: 5.5 - Resolution: 768x1344 - aDetailer: Active
[Comparison grid, DISNEY STYLE: OmnIA©, Pony XL, IllustriousXL. Original grid available for download.]
• OmnIA© with “Disney style animation” (Described Output): Omnia v1.0 produces a highly detailed and expressive rendition of Rapunzel, strongly aligned with the "realistic" and "fashion photography" aspects of the prompt while maintaining a recognizable Disney aesthetic. The intricate details of her purple dress, the luminous quality of her long blonde hair, and the sharp focus on her features against a depth-of-field castle background create a cinematic and hyperdetailed portrait.
• Pony XL (Described Output): ponyDiffusionV6XL generates a beautiful and charming version of Rapunzel that captures the essence of her Disney character. It leans more towards a refined animated style rather than full photorealism, though it still incorporates elements of realism. The hair is voluminous and iconic, and the overall image is visually appealing, offering a balanced blend of Disney animation and high-quality rendering.
• IllustriousXL (Described Output): IllustriousXL provides a unique and stylized interpretation of Rapunzel, characterized by a more artistic and possibly ethereal quality. While adhering to the core elements like long blonde hair and a purple dress, its rendering might prioritize a painterly or illustrative effect over strict photorealism.
6.3.7. Style Blending (Example: (4n1t00n:X.X), (4n1m3rg3:X.X))
• Core Prompt: Anime, flat colors, Dragon ball, android 18, skinny, short blonde hair, black bikini, cloudy blue sky, beach , sand, palm trees, water, aqua, sweaty, black thong, windy, (4n1t00n:X.X, (4n1m3rg3:X.X),
Negative Prompt: realistic, CGI, 3D, illustration, 2.5D, childish, ugly, old, fat, wide hips, curvy, topless, nude, naked, nsfw,
Seed:41112174 - Sampler: UniPC - Schedule:Karras - Steps:50 - CFG Scale: 5.5 - Resolution: 768x1344 - aDetailer: Active
[Image grid, STYLE BLENDING. Panels by keyword weight:
4n1t00n:1.8 - 4n1m3rg3:0.0 | 4n1t00n:1.0 - 4n1m3rg3:1.0 | 4n1t00n:0.3 - 4n1m3rg3:1.8
4n1t00n:1.2 - 4n1m3rg3:0.8 | 4n1t00n:0.8 - 4n1m3rg3:1.0 | 4n1t00n:0.5 - 4n1m3rg3:1.4]
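The weighted keyword pairs used in this style-blending example can be generated programmatically when sweeping over several weight combinations. The following is a minimal sketch in plain Python: `weighted_tag` and `blend_styles` are hypothetical helper names, and the core prompt in the usage lines is shortened for illustration. The `(keyword:weight)` form and the six weight pairs are taken from this section.

```python
def weighted_tag(tag: str, weight: float) -> str:
    """Format one style keyword in the (keyword:weight) form shown in 6.3.7."""
    return f"({tag}:{weight:.1f})"


def blend_styles(core_prompt: str, weights: dict) -> str:
    """Append weighted style keywords to a core prompt, in the order given."""
    parts = [core_prompt.strip().rstrip(",")]
    parts.extend(weighted_tag(tag, weight) for tag, weight in weights.items())
    return ", ".join(parts)


# The six (4n1t00n, 4n1m3rg3) weight pairs shown in the grid for this section.
WEIGHT_PAIRS = [(1.8, 0.0), (1.0, 1.0), (0.3, 1.8), (1.2, 0.8), (0.8, 1.0), (0.5, 1.4)]

# Shortened core prompt, used only to keep the example readable.
prompts = [
    blend_styles("Anime, flat colors", {"4n1t00n": a, "4n1m3rg3": b})
    for a, b in WEIGHT_PAIRS
]
```

Keeping the seed and all other settings fixed while only these weights change is what makes the resulting grid a fair view of the blend.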
6.3.7. Style Negation (Example: 4n1m1cs or 4n1m3rg3)
• Core Prompt: realistic girl, iranian, oversize jumper, dark redhead, rouge, darkred, neck lace choker, perfect makeup, warm lighting, cosy atmosphere, black and red:0.85, desert dune background, depth of field, dynamic angle, sharp,
Negative Prompt: tag*, ugly, old, fat, wide hips, curvy, topless, nude, naked, nsfw, animal ears,
• Seed:2808249445 - Sampler: DPM++ 2M - Schedule:Karras - Steps:50 - CFG Scale: 5.5 - Resolution: 768x1344 - aDetailer: Active
[Image grid, NEGATION STYLE. Values used for tag* in the negative prompt, one per panel: no negative tag; 4n1m1cs; 4n1m1cs:0.5; 4n1m3rg3; Western Comics; Anime Style.]
6.4.0. Summary of Comparative Advantages for OmnIA©
Based on the descriptive analysis above, OmnIA© is anticipated to demonstrate the following advantages:
• Precise Stylistic Control: OmnIA©'s dedicated style keywords (e.g., 4n1m3rg3, 4n1t00n, 4n1v3rs3, 4n1m1cs) offer a more direct and reliable method for achieving specific artistic styles compared to relying solely on descriptive prompting with Pony XL or IllustriousXL (especially for styles outside IllustriousXL's core strengths, like Western Comics or strong photorealism).
• Consistent Style Blending: The ability to weight and combine style keywords allows OmnIA© to create nuanced and coherent stylistic hybrids that would be difficult to achieve predictably with the other models.
• High Baseline Quality Without Score Tags: OmnIA© is designed to produce high-quality outputs within its target styles without requiring "score tags," simplifying the prompting process.
• Versatility as a Foundational Model: OmnIA©'s broad stylistic coverage, from realism to various forms of illustration and anime, makes it a more versatile model than Pony XL (which is more general) or IllustriousXL (which is more specialized).
(2) This comparative section aims to illustrate the intended benefits of OmnIA©'s specialized training and keyword system. The comparison comments in chapter 6 between OmnIA© v1.0, ponyDiffusionV6XL, and IllustriousXL were generated (freely) by Gemini 2.5 Pro.
6.5. ComfyUI and OmnIA© 6.5.1. Comprehensive Usage Guidelines and Parameter Optimization We want to extend a massive thank you to nuaion for their incredible work and long-standing collaboration on the OmnIA© project. Their creative vision and dedication have been absolutely vital in bringing OmnIA© to life and pushing its capabilities further than we ever imagined. nuaion has generously shared their expertise by creating an incredibly comprehensive tutorial. This guide, packed with various workflows, shows you exactly how to get the most out of OmnIA© using the ComfyUI interface. It's a true testament to their deep understanding and innovative approach. The time and effort nuaion has invested in helping us build and develop OmnIA© have been invaluable, and we are profoundly grateful for their contributions.
For anyone looking to unlock OmnIA©'s full potential, we highly recommend visiting nuaion's website at https://omnia.nuaion.com. There, you'll find the complete tutorial available for download, including all the essential workflows to kickstart your journey.
7. Known Limitations and Future Development Roadmap While OmnIA© represents a significant advancement, certain limitations are acknowledged, and a roadmap for future development is in place:
7.1.0. Current Problems and Limitations
• Fine-grained Control within Styles: While broad styles are well-defined by keywords like 4n1t00n or 4n1m1cs, achieving very specific sub-styles (e.g., "Shinkai Makoto anime style" or "Frank Miller comic style") or the nuanced artistic signature of a particular artist within those broader categories may still require highly detailed prompting, careful use of artist names, or dedicated LoRAs.
• Complex Multi-Character Compositions: Generating scenes with numerous interacting characters while maintaining anatomical consistency, complex spatial relationships, and natural-looking interactions across all figures can still be challenging. This is a common issue in current generative models, though OmnIA© strives for good coherence.
• Text Generation: Like most current image generation models, reliable and contextually accurate generation of legible text (e.g., words on signs, book titles) within images is not a primary capability and often results in garbled or nonsensical letterforms.
7.2.0. Future Development Plans • Expanded Style Repertoire and Granularity: Research into incorporating additional, distinct artistic movements (e.g., Impressionism, Cyberpunk Art, Art Nouveau, specific historical art periods) as controllable keywords, and potentially adding modifiers for sub-styles.
• Enhanced Photorealistic Rendering: Further refinement of the 4n1m3rg3 style through targeted dataset augmentation with more complex real-world photographic examples focusing on diverse lighting, materials, and human subjects. Exploration of new training techniques to improve nuanced light interaction and material properties.
• Improved Scene Comprehension and Compositional Control: Focus on enhancing the model's ability to understand and render complex scenes with multiple subjects, better spatial reasoning, and more consistent character interactions. This may involve architectural tweaks, training on more compositionally complex datasets, or incorporating forms of compositional guidance.
• Negative Prompt Learning and Guidance: Investigating methods to make negative prompt conditioning even more effective, intuitive, and powerful, potentially through specific training objectives or model architectures more attuned to negative constraints.
• Community Feedback Integration: Actively soliciting and incorporating user feedback, common use-cases, and shared generation examples to identify areas for improvement and guide dataset expansion and feature development for subsequent versions of OmnIA©.
8. Licensing, Availability, and Usage Terms
8.1.0. License
OmnIA© is released under the CreativeML Open RAIL++-M License (modified). This license promotes open-source research, non-commercial use (with exceptions noted below), and community-driven improvements. Users are encouraged to consult the full license text for detailed terms.
8.2.0. Availability The OmnIA© model and associated resources (if any) will be publicly available for download on: Civitai: https://civitai.com/models/XXXXX/OmnIA©
8.3.0. Usage Terms and Conditions: 8.3.1 It is an explicit condition of OmnIA©'s release that access to and use of the model, or its derivatives, for image generation must generally be provided completely free of charge, without requiring users to expend credits, virtual currencies, crypto currency, or any similar access-gating mechanisms for the model's core generative functionality.
8.3.2 Specific exceptions to this condition are granted exclusively to Civitai - https://civitai.com and Mage.space - https://www.mage.space. These platforms are permitted to utilize their respective platform-specific virtual currencies, crypto currency, or credit systems (such as "Buzz" on Civitai) in connection with image generation using the OmnIA© model.
8.4.1 All other platforms, entities, apps, or services hosting or providing access to OmnIA©, or its derivatives, are expressly and unequivocally prohibited from implementing any form of payment, credit system, virtual currency, crypto currency (whether platform-specific or otherwise), or real-money transaction for the direct generation of images using OmnIA©, or its derivatives. The use of OmnIA©, or its derivatives, on any platform other than Civitai and Mage.space must remain entirely free of such charges or virtual currency requirements.
8.5.1 You can host the OmnIA© model or its derivatives on all other platforms, entities, apps, or services that incorporate any kind of payment or the utilization of credits to create images, but only if the hosting fully complies with paragraph 8.3.0, subparagraphs 8.3.1 and 8.4.1.
8.5.2 If you intend to host the OmnIA© model or its derivatives on any platforms, entities, applications, or services that incorporate any form of payment or the utilization of credits for image creation, and which do not comply with paragraph 8.3.0, subparagraph 8.3.1, and 8.4.1 of this license, please follow these guidelines for contact:
• For the direct hosting of the OmnIA© model: Please contact us at: legal[dot]department[at]omnia-diffusion[dot]com.
• For the hosting of a derived model based on OmnIA©: Please contact the respective author(s) of that specific derived model.
8.5.3 Please state the full model name OmnIA© and include a link to the model card: https://civitai.com/models/XXXXX/OmnIA©
8.6.0 You are free to use the OmnIA© model for commercial purposes in teams of 3 or fewer.
8.1 Fostering Collaboration: The Future of OmnIA® The development of OmnIA® v1.0 marks a significant milestone in generative image AI, offering a robust, multi-style foundational model designed with adaptability and control at its core. However, we firmly believe that the true potential and continued evolution of OmnIA® lie in the hands of the broader AI community, particularly the dedicated model trainers. OmnIA® has been meticulously engineered as a base model, specifically catering to the needs of trainers who seek a versatile and resilient foundation for their custom creations. It is our sincere hope that OmnIA® will become the preferred starting point for the development of highly specialized LoRAs, Embeddings, and entirely new derived models. These crucial additions, crafted by talented trainers, will further expand OmnIA®'s capabilities, allowing for unprecedented artistic expression and thematic focus. To underscore this commitment to collaboration and empowerment, the licensing for OmnIA® (CreativeML Open RAIL++-M - Modified) has been thoughtfully chosen. This license is specifically designed to grant trainers complete control and freedom over their derivative creations, ensuring that their valuable contributions are truly their own. We envision a vibrant ecosystem where trainers can confidently build upon OmnIA®, sharing their innovations and pushing the boundaries of what's possible. The future development of OmnIA® 2.0, and subsequent iterations, is intrinsically linked to the collective efforts and enthusiastic reception from the entire community. We humbly invite all trainers to explore OmnIA®, create their unique LoRAs, Embeddings, and derived models, and share their findings. Your feedback, insights, and shared creations will directly shape the trajectory of OmnIA®, transforming it into an even more powerful and versatile tool for artists and creators worldwide.
8.2 Community Collaboration: Promote OmnIA® with Our Logos
We are committed to fostering a strong and collaborative community around OmnIA®. If you appreciate the capabilities of our model and wish to help us expand its reach and recognition, we kindly invite you to consider featuring the OmnIA® logo on your derivative works. By including one of the official OmnIA® logos (PSD or PNG) in the cover image or promotional materials of your custom models, LoRAs, or Embeddings derived from OmnIA®, you not only acknowledge its foundational role but also contribute significantly to increasing awareness within the broader generative AI ecosystem. This gesture of support is invaluable to us and to the continued development of future OmnIA® iterations.
Download all OmnIA® logos here
We believe that true innovation flourishes through shared effort, and your contribution in promoting OmnIA® helps strengthen the entire community that builds upon it. Thank you for your support!
8.3 Important Notice: Responsible Use and Community Integrity
We are fully aware that there will be platforms, entities, applications, and services that may host OmnIA® and its derivatives without fully respecting the terms of use and the licensing agreement. The internet, unfortunately, does not forgive, and unauthorized distribution is a persistent challenge.
If you are considering sharing OmnIA® or any of its derived models on such non-compliant sites or applications, and you are not the original author of that specific model, we urge you to reconsider before proceeding. The reason is simple and fundamental: OmnIA® was designed to be completely free to use, incurring absolutely no cost to its users. By uploading OmnIA® or its derivatives to platforms that monetize generative AI without proper licensing or revenue sharing, you are, first and foremost, causing harm to yourselves as users, to the entire community of creators, and to the respective authors of these models.
Many of these sites and applications exploit the hard work of others, offering nothing in return for the immense effort invested by model authors. OmnIA® represents the culmination of over six months of dedicated development and countless hours of work. These platforms often do nothing more than upload models and then charge users for their usage, without sharing any of their profits with the respective authors and without allowing authors to block the use of their own models.
Therefore, we extend our infinite gratitude to all users who actively help to highlight or enforce the terms of use and license of OmnIA® and its derived models on these platforms and applications, which it would be an understatement to call "unprofessional".
Your vigilance and support are crucial in maintaining the integrity and collaborative spirit that OmnIA® was built upon.
- Author Information and Contact
• Lead Developer: Samuele Bonzio (Samael1976)
• Website: https://www.omnia-diffusion.com
• Civitai Profile: https://civitai.com/user/Samael1976
• Ko-fi (Support/Donations): https://ko-fi.com/samael1976
• Email for Inquiries: team[at]omnia-diffusion[dot]com
Conclusion and Call for Contribution
OmnIA© represents a dedicated effort to provide the creative AI community with a powerful, versatile, and intuitive multi-style image generation model. Its design philosophy prioritizes user control, high-quality output, and freedom from overly restrictive prompting paradigms. The inclusion of distinct style keywords, coupled with the model's inherent ability to infer style, offers a flexible approach for both novice and advanced users.
Users are encouraged to thoroughly explore OmnIA©'s capabilities, experiment with its stylistic range, and share their findings and creations within the community. Feedback regarding performance, desired features, or potential areas for improvement is highly valued and will contribute significantly to the ongoing development and refinement of future iterations of OmnIA©. The aim is to foster a collaborative environment where OmnIA© can continue to evolve and serve the diverse needs of artists, designers, and AI enthusiasts.
Acknowledgements:
• To Nerogar, for his fantastic OneTrainer, which I used to create OmnIA and other models: https://github.com/Nerogar/OneTrainer
• To AUTOMATIC1111, because I just can't get used to any other interface besides A1111: https://github.com/AUTOMATIC1111/stable-diffusion-webui
• To Bing-su, thanks for aDetailer, which I always use for everything: https://github.com/Bing-su/adetailer
• To Kohya S., without whom I would have never started training: https://github.com/kohya-ss/sd-scripts
To the entire Civitai Italia Telegram group, who supported me, put up with my numerous ramblings, and helped me when, after a OneTrainer update, nothing seemed to work. In particular:
• GattaPlayer, nuaion, MarkWar, WV, Chuck
To Monica Donazzan, our talented Graphic Supporter, for her invaluable contributions to the visual aspects of this project. Follow her work on Instagram: https://www.instagram.com/monicadonazzan
- The Genesis of a Name: Why OmnIA®?
The choice of "OmnIA®" as the name for our multi-style generative image model is deeply rooted in its core capabilities and embodies a clever linguistic play. Derived from Latin, "Omnia" directly translates to "all" or "everything." This meaning perfectly encapsulates the model's fundamental design philosophy: to provide a comprehensive, all-encompassing solution for image generation across a vast and diverse spectrum of artistic styles. From photorealism to anime, and from Western comics to 2.5D, OmnIA® aims to be a single, versatile tool capable of handling "all" creative demands.
Beyond its Latin roots, the name also incorporates an intentional linguistic nuance, particularly relevant for Italian speakers. The final two letters, "IA," in Italian, stand for "Intelligenza Artificiale" (Artificial Intelligence). This is a direct nod to the model's nature as an AI-driven system, and it leverages the common Italian linguistic structure where the adjective often follows the noun, making the "IA" suffix a natural and recognizable indicator of its technological essence. This dual meaning, combining the model's "all-encompassing" stylistic range with its identity as an "Artificial Intelligence," makes "OmnIA®" a fitting and memorable designation for our project.
Important Notes: 1. Generation Variations: Generated images may vary based on your PC/Software configuration and the type of sampler used.