- Minimalist T2I Workflow for WAN Video 2.2 (this one)
- Minimalist First-Last Frame to Video Workflow for WAN Video 2.2
- Minimalist FMLF (First-Middle-Last Frame) + Multi Frame Ref To Video Workflow for WAN Video 2.2 (Coming Soon)
- Join Videos (Snippets): Track Operations, Python Scripting: Be a Storyteller - Seamless Narrative Chain (Coming Soon)
Minimalist T2I Workflow for WAN Video 2.2
A streamlined text-to-image workflow using WAN Video 2.2's High and Low Noise models (14B fp8) for static image generation. This setup deliberately omits the Lightning LoRAs to focus on the base models' capabilities.
WAN Video 2.2 is "abused" here as a text-to-image generator: generation begins from an empty latent noise frame and is refined through a two-stage sampling process (High Noise → Low Noise) using the ModelSamplingSD3 node (shift = 5–10) to optimize the noise schedule for high-fidelity diffusion. This approach yields remarkably strong results because the VAE and text encoder are sourced from the WAN 2.1 family, the very components responsible for high-fidelity, text-driven visual synthesis, while the underlying video diffusion models are used solely as a processing backbone.
@Jay
Note, with regard to "abused": Wan2.2-T2V-14B does not have a built-in text-to-image generation capability. It is not an image generator but a video generator. The model page makes no explicit mention of text-to-image support, unlike the previous model, Wan2.1, where this functionality was directly indicated. (See also the use of the VAE and text encoder from the Wan2.1 family mentioned above.)
Workflow Structure
The workflow includes both sampling approaches:
- Active path: Standard KSampler → connects to VAE Decode via LATENT_STANDARD set/get nodes
- Alternative path: KSampler (Advanced) → available via LATENT_ADVANCED set/get nodes
- Simply reconnect the VAE Decode input to switch between sampling methods
Key Features
Dual-stage sampling: Sequential processing with High Noise → Low Noise models
Precise control: ModelSamplingSD3 nodes (shift parameter: 5-10) for refined sampling behavior
ModelSamplingSD3 node - shift parameter:
- Want more creative/varied results? → Increase the shift value (7-10)
- Need more precise/controlled predictions? → Decrease the shift value (3-5)
Note: The shift parameter (default: 5) controls the noise schedule. Higher values give the model more creative freedom, lower values enforce stricter prompt adherence.
Flexible sampling options:
- KSampler (Advanced) is also integrated and can be connected to VAE Decode via the set/get node system.
- Primary path uses standard KSampler (Default 20 steps, CFG 2.5, res_multistep, sgm_uniform)
- Standard KSampler is used by subjective preference - the workflow feels more stable and produces subjectively better results, though there's no technical reason for this
- Fixed seed 0 for high noise stage, randomized seed for low noise stage
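To make the shift parameter's effect on the noise schedule concrete, here is a minimal sketch. It assumes the SD3-style flow-matching time shift that ComfyUI applies (sigma' = shift * sigma / (1 + (shift - 1) * sigma)); the `shift_sigma` helper name is our own, not a ComfyUI API.

```python
def shift_sigma(sigma, shift):
    # Assumed SD3-style timestep shifting, sigma in [0, 1]:
    # higher shift keeps sigma large for longer, so more of the
    # step budget is spent at high noise ("creative freedom").
    return shift * sigma / (1 + (shift - 1) * sigma)

# A linear schedule from 1.0 (pure noise) down to 0.0 over 20 steps.
steps = 20
schedule = [1 - i / steps for i in range(steps + 1)]

for s in (3, 5, 10):
    shifted = [shift_sigma(t, s) for t in schedule]
    print(f"shift={s}: mid-schedule sigma = {shifted[steps // 2]:.3f}")
```

With shift = 1 the schedule is unchanged; raising shift pushes the midpoint sigma upward, which matches the "higher shift = more creative freedom" guidance above.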
Optional: Fine-tuning for even better results
- Steps: 40–60 → ideal for res_multistep
- CFG Scale: 4–6 → enough control without excessive "overfitting"
- Fixed seed → for consistent variations
- Denoise: 0.56–0.76 → depending on desired texture roughness (default: 1)
Art style testing: Optimized for evaluating the ability to represent artistic styles, techniques, and compositions, without the complexity of additional conditioning through LoRAs and ControlNet.
- Prompting: Use natural language descriptions and supplement them with keywords.
- Structure: Use natural language over keywords, avoid overspecification of details.
- Artistic trade-offs are explicitly allowed, screw the critics
"Dramatic impasto artwork with a touch of abstract expressionism" β This describes a painterly, expressive, textureβintensive style, strong color contrasts, irregular surfaces, a βlivelyβ way of painting β often not perfectly symmetrical, not hyperrealistic, not cleanly rendered. AND "Monochromatic scheme, primarily black/white/gray" + "detailed textures and dramatic lighting" + "Concept art, digital painting, matte painting, megascans" β This points toward a digital, controlled, cinematic, almost photorealistic look. β Matte paintings with realistic lighting and textures (often from Megascans). Monochrome = reduced color palette β more βstylized,β but not necessarily abstract. β Here itβs about clarity, fidelity of detail, composition, and atmosphere of light β often with clean lines and structured textures.- Prompting: Use natural language descriptions and supplement them with keywords.
Showcase of different art styles tested with this workflow. Dare to click - opens a fixed-size copy.
Keywords Image 1, 2, 3: #RomanticArt #DramaticPortraiture #FantasyFigurative #ExpressiveBrushwork #DiagonalComposition #MythicAesthetic #ExpressiveArt
Keywords Image 1, 2: #DarkFantasy #DramaticPortraiture #GothicElegance #Melancholy #Cinematic #CGI #HyperRealistic #NoPhotoRealistic (#BlackAndWhite) #DarkRomanticismus
Keywords Image 3: #CinematicArt #DramaticPortraiture #FantasyFigurative #ExpressiveBrushwork #AsymmetricalComposition #SymbolicArt #MythicAesthetic
Keywords Image 1, 2: #RomanticFantasyArt #DramaticRealism #DigitalOilPainting #NatureAndFigure #RomanticDrama #DigitalRomanticArt #DramaticLighting, #MalerischeDigitalKunst #ArtisticRealism #NoPhotorealism #NoPhotoRealistic
Keywords Image 3: #SymbolicArt #Surrealism #DramaticConceptArt #AtmosphericBrushwork #AsymmetricalComposition #MonochromaticWithAccent #MinimalistAesthetic
Requirements
β οΈ Note: All model links below are direct download links. Clicking them will immediately start downloading the files.
- wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors
- wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors
- umt5_xxl_fp8_e4m3fn_scaled.safetensors
- wan_2.1_vae.safetensors
Installation
- Download all required model files (see Requirements section)
- Place files in their respective ComfyUI directories:
ComfyUI/
βββ models/
β βββ diffusion_models/
β β βββ wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors
β β βββ wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors
β βββ text_encoders/
β β βββ umt5_xxl_fp8_e4m3fn_scaled.safetensors
β βββ vae/
β βββ wan_2.1_vae.safetensors
- Load the workflow JSON file in ComfyUI
- Adjust resolution in the EmptySD3LatentImage node based on your VRAM
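Before loading the workflow, it can help to verify that every file landed in the right folder. The following is a small sanity-check sketch assuming the stock ComfyUI directory layout shown above; `missing_models` is our own helper, not part of ComfyUI.

```python
from pathlib import Path

# Expected layout (assumption: default ComfyUI directory names).
REQUIRED = {
    "models/diffusion_models": [
        "wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors",
        "wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors",
    ],
    "models/text_encoders": ["umt5_xxl_fp8_e4m3fn_scaled.safetensors"],
    "models/vae": ["wan_2.1_vae.safetensors"],
}

def missing_models(comfyui_root):
    """List required model files not found under the given ComfyUI root."""
    root = Path(comfyui_root)
    return [str(Path(sub) / name)
            for sub, names in REQUIRED.items()
            for name in names
            if not (root / sub / name).exists()]

# Run from the directory that contains your ComfyUI folder:
for path in missing_models("ComfyUI"):
    print("missing:", path)
```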
Performance
Start with lower resolutions (832x1216px) if you have <= 16GB VRAM
<= 16GB VRAM (tested on cards other than the RTX 4090)
- Standard resolutions: 720x1280px, 832x1216px, 832x1248px (portrait), 1280x720px, 1216x832px, 1248x832 (landscape)
- Achieves decent generation speed at these resolutions
24GB VRAM
- Higher resolutions: 1024x1536px (portrait), 1536x1024px (landscape)
- Ultra-wide: 1536x672px (21:9 aspect ratio)
- Recommended for larger outputs and wider aspect ratios
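All of the recommended resolutions above are multiples of 16, which plays nicely with the latent grid. A small helper (our own sketch, not a ComfyUI utility) can validate a candidate size and report its exact reduced aspect ratio before you enter it in the EmptySD3LatentImage node:

```python
from math import gcd

def check_resolution(width, height):
    """Return (is_multiple_of_16, reduced aspect ratio 'w:h')."""
    ok = width % 16 == 0 and height % 16 == 0
    g = gcd(width, height)
    return ok, f"{width // g}:{height // g}"

print(check_resolution(832, 1216))   # portrait  -> (True, '13:19')
print(check_resolution(1536, 672))   # ultra-wide -> (True, '16:7'), approx. 21:9
```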
Credits
- Based on WAN Video 2.2 by Alibaba Group
Model tree for GegenDenTag/comfyui-wan-video-2.2-t2i-art-workflow
- Base model: Wan-AI/Wan2.2-T2V-A14B