Models

A list of popular text-to-image generative models, with parameter counts and an architecture overview for each

| Publisher | Model | Version | Size | Diffusion Architecture | Model Params | Text Encoder(s) | TE Params | Auto Encoder | License |
|---|---|---|---|---|---|---|---|---|---|
| StabilityAI | Stable Diffusion | 1.5 | 2.28GB | UNet | 0.86B | CLiP ViT-L | 0.12B | VAE | OpenRAIL |
| StabilityAI | Stable Diffusion | 2.1 | 2.58GB | UNet | 0.86B | CLiP ViT-H | 0.34B | VAE | OpenRAIL |
| StabilityAI | Stable Diffusion | XL | 6.94GB | UNet | 2.56B | CLiP ViT-L + ViT-G | 0.12B + 0.69B | VAE | OpenRAIL |
| StabilityAI | Stable Diffusion | 3.0 Medium | 15.14GB | MMDiT | 2.0B | CLiP ViT-L + ViT-G + T5-XXL | 0.12B + 0.69B + 4.76B | 16ch VAE | Proprietary |
| StabilityAI | Stable Diffusion | 3.5 Medium | 15.89GB | MMDiT | 2.25B | CLiP ViT-L + ViT-G + T5-XXL | 0.12B + 0.69B + 4.76B | 16ch VAE | Proprietary |
| StabilityAI | Stable Diffusion | 3.5 Large | 26.98GB | MMDiT | 8.05B | CLiP ViT-L + ViT-G + T5-XXL | 0.12B + 0.69B + 4.76B | 16ch VAE | Proprietary |
| StabilityAI | Stable Cascade | Medium | 11.82GB | Multi-stage UNet | 1.56B + 3.6B | CLiP ViT-G | 0.69B | 42x VQE | Proprietary |
| StabilityAI | Stable Cascade | Lite | 4.97GB | Multi-stage UNet | 0.7B + 1.0B | CLiP ViT-G | 0.69B | 42x VQE | Proprietary |
| Black Forest Labs | Flux | 1 Schnell | 32.93GB | MMDiT | 11.9B | CLiP ViT-L + T5-XXL | 0.12B + 4.76B | 16ch VAE | Apache 2.0 |
| Black Forest Labs | Flux | 1 Dev | 32.93GB | MMDiT | 11.9B | CLiP ViT-L + T5-XXL | 0.12B + 4.76B | 16ch VAE | Proprietary |
| Black Forest Labs | Flux | 1 Kontext-Dev | 32.93GB | MMDiT | 11.9B | CLiP ViT-L + T5-XXL | 0.12B + 4.76B | 16ch VAE | Proprietary |
| Black Forest Labs | Flux | 1 Krea-Dev | 32.93GB | MMDiT | 11.9B | CLiP ViT-L + T5-XXL | 0.12B + 4.76B | 16ch VAE | Proprietary |
| lodestones | Chroma | 48 | 26.84GB | MMDiT | 8.9B | CLiP ViT-L + T5-XXL | 0.12B + 4.76B | 16ch VAE | Apache 2.0 |
| Ostris | Flex | 1 Alpha | 25.65GB | MMDiT | 8.16B | CLiP ViT-L + T5-XXL | 0.12B + 2.95B | 16ch VAE | Apache 2.0 |
| Ostris | Flex | 2 Preview | 25.65GB | MMDiT | 8.16B | CLiP ViT-L + T5-XXL | 0.12B + 2.95B | 16ch VAE | Apache 2.0 |
| FreePik | F-Lite | | 19.81GB | MMDiT | 9.8B | T5-XXL | 2.95B | 16ch VAE | OpenRAIL |
| FreePik | F-Lite | Texture | 19.81GB | MMDiT | 9.8B | T5-XXL | 2.95B | 16ch VAE | OpenRAIL |
| FreePik | F-Lite | 7B | 13.89GB | MMDiT | 7B | T5-XXL | 2.95B | 16ch VAE | OpenRAIL |
| NVLabs | Sana | 1.5 1.6B | 9.49GB | MMDiT | 1.60B | Gemma2 | 2.61B | DC-AE | Proprietary |
| NVLabs | Sana | 1.5 4.8B | 15.58GB | MMDiT | 4.72B | Gemma2 | 2.61B | DC-AE | Proprietary |
| NVLabs | Sana | 1.0 1600M | 12.63GB | MMDiT | 1.60B | Gemma2 | 2.61B | DC-AE | Proprietary |
| NVLabs | Sana | 1.0 600M | 7.51GB | MMDiT | 0.59B | Gemma2 | 2.61B | DC-AE | Proprietary |
| nVidia | Cosmos-Predict2 T2I | 2B | 13.32GB | MMDiT | 1.96B | T5-XXL | 4.86B | WAN-VAE | Proprietary |
| nVidia | Cosmos-Predict2 T2I | 14B | 37.36GB | MMDiT | 14.26B | T5-XXL | 4.86B | WAN-VAE | Proprietary |
| FAL | AuraFlow | 0.2 | 31.90GB | MMDiT | 6.8B | UMT5 | 12.1B | VAE | Apache 2.0 |
| FAL | AuraFlow | 0.3 | 31.90GB | MMDiT | 6.8B | UMT5 | 12.1B | VAE | Apache 2.0 |
| AlphaVLLM | Lumina | Next SFT | 8.67GB | DiT | 1.7B | Gemma | 2.5B | VAE | Apache 2.0 |
| AlphaVLLM | Lumina | 2 | 20.75GB | DiT | 2.61B | Gemma-2 | 2.61B | 16ch VAE | Apache 2.0 |
| PixArt | Alpha | XL 2 | 21.3GB | DiT | 0.61B | T5-XXL | 4.76B | VAE | OpenRAIL |
| PixArt | Sigma | XL 2 | 21.3GB | DiT | 0.61B | T5-XXL | 4.76B | VAE | OpenRAIL |
| Segmind | SSD-1B | | 8.72GB | UNet | 1.33B | CLiP ViT-L + ViT-G | 0.12B + 0.69B | VAE | Apache 2.0 |
| Segmind | Vega | | 6.43GB | UNet | 0.75B | CLiP ViT-L + ViT-G | 0.12B + 0.69B | VAE | Apache 2.0 |
| Segmind | Tiny | | 1.03GB | UNet | 0.32B | CLiP ViT-L | 0.12B | VAE | OpenRAIL |
| Thu-ML | UniDiffuser | v1 | 5.37GB | U-ViT | 0.95B | CLiP ViT-L + CLiP ViT-B | 0.12B + 0.16B | VAE | AGPL 3 |
| Kwai | Kolors | | 17.40GB | UNet | 2.58B | ChatGLM | 6.24B | VAE | Apache 2.0 |
| PlaygroundAI | Playground | 1.0 | 4.95GB | UNet | 0.86B | CLiP ViT-L | 0.12B | VAE | ? |
| PlaygroundAI | Playground | 2.x | 13.35GB | UNet | 2.56B | CLiP ViT-L + ViT-G | 0.12B + 0.69B | VAE | Proprietary |
| Tencent | HunyuanDiT | 1.2 | 14.09GB | DiT | 1.5B | BERT + T5-XL | 3.52B + 1.67B | VAE | Proprietary |
| Warp AI | Wuerstchen | | 12.16GB | Multi-stage UNet | 1.0B + 1.05B | CLiP ViT-L + ViT-G | 0.12B + 0.69B | 42x VQE | MIT |
| Kandinsky | Kandinsky | 2.1 | 5.15GB | UNet | 1.25B | CLiP ViT-G | 0.69B | VQ | Apache 2.0 |
| Kandinsky | Kandinsky | 2.2 | 5.15GB | UNet | 1.25B | CLiP ViT-G | 0.69B | VQ | Apache 2.0 |
| Kandinsky | Kandinsky | 3.0 | 27.72GB | UNet | 3.05B | T5-XXXL | 8.72B | VQ | Apache 2.0 |
| Thudm | CogView | 3 Plus | 24.96GB | DiT | 2.85B | T5-XXL | 4.76B | VAE | Apache 2.0 |
| Thudm | CogView | 4 | 30.39GB | DiT | 6.37B | GLM-4 | 9.40B | VAE | Apache 2.0 |
| IDKiro | SDXS | | 2.05GB | UNet | 0.32B | CLiP ViT-L | 0.12B | VAE | OpenRAIL |
| Open-MUSE | aMUSEd | 256 | 3.41GB | ViT | 0.60B | CLiP ViT-L | 0.12B | VQ | OpenRAIL |
| Koala | Koala | 700M | 6.58GB | UNet | 0.78B | CLiP ViT-L + ViT-G | 0.12B + 0.69B | VAE | Proprietary |
| Salesforce | BLIP-Diffusion | | 7.23GB | UNet | 0.86B | CLiP ViT-L + BLiP-2 | 0.12B + 0.49B | VAE | BSD 3 |
| DeepFloyd | IF | M | 12.79GB | Multi-stage UNet | 0.37B + 0.46B | T5-XXL | 4.76B | Pixel | Proprietary |
| DeepFloyd | IF | L | 15.48GB | Multi-stage UNet | 0.61B + 0.93B | T5-XXL | 4.76B | Pixel | Proprietary |
| MeissonFlow | Meissonic | | 3.64GB | DiT | 1.18B | CLiP ViT-H | 0.35B | VQ | Apache 2.0 |
| VectorSpaceLab | OmniGen | v1 | 15.47GB | Transformer | 3.76B | Phi-3 | 0 | VAE | MIT |
| VectorSpaceLab | OmniGen | v2 | 30.50GB | Transformer | 3.97B | Qwen-VL-2.5 | 3.75B | VAE | Apache 2.0 |
| HiDream-AI | HiDream | I1 Fast/Dev/Full | 42.71GB + 15.69GB | MMDiT | 17.10B | CLiP ViT-L + ViT-G + T5-XXL + Llama-3.1-8B | 0.12B + 0.69B + 2.95B + 4.54B | 16ch VAE | MIT |
| Wan-AI | WAN | 2.1 1.3B | 27.72GB | MMDiT | 1.42xB | UMT5-XXL | 5.68B | 16ch VAE | Apache 2.0 |
| Wan-AI | WAN | 2.1 14B | 78.52GB | MMDiT | B | UMT5-XXL | 14.28B | 16ch VAE | Apache 2.0 |
| Bria | Bria | 3.2 | 18.66GB | MMDiT | 3.78B | T5-XXL | 4.76B | 16ch VAE | Proprietary |
| Qwen | Qwen-Image | | 56.10GB | MMDiT | 20.43B | Qwen-2.5 | 8.29B | | Apache 2.0 |

Notes

  • Created using SD.Next built-in model analyzer
  • Parameter count is a rough indicator of model complexity and capacity to learn
    Quality of generated images also depends on the training data and the duration of training
  • Size refers to the original model variant in 16-bit precision where available
    Quantized variants may be smaller
  • Distilled variants are not included, since typical distillation does not change the underlying architecture or parameter count
    e.g. Turbo/LCM/Hyper/Lightning/etc. or even Dev/Schnell
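
As a rough cross-check of the Size column against the parameter columns, the sketch below estimates weights-only size as parameter count times bytes per parameter. It is only an illustration, not how the SD.Next analyzer computes sizes: the helper names `parse_params` and `estimate_size_gb` are hypothetical, GB is assumed to mean 10^9 bytes, and the estimate ignores the autoencoder and any file overhead, so it will typically come out somewhat below the listed size.

```python
def parse_params(cell: str) -> float:
    """Sum a table cell such as '0.12B + 0.69B' into billions of parameters."""
    total = 0.0
    for part in cell.split("+"):
        total += float(part.strip().rstrip("B"))
    return total


def estimate_size_gb(params_billion: float, bits: int = 16) -> float:
    """Weights-only size in GB (assumed 10^9 bytes): params times bytes per param."""
    return params_billion * bits / 8


# Stable Diffusion XL row: Model Params 2.56B, TE Params 0.12B + 0.69B
total = parse_params("2.56B") + parse_params("0.12B + 0.69B")
fp16 = estimate_size_gb(total)           # 16-bit weights, as in the Size column
int8 = estimate_size_gb(total, bits=8)   # assumed 8-bit quantization, for comparison

print(f"{total:.2f}B params, ~{fp16:.2f}GB at 16-bit, ~{int8:.2f}GB at 8-bit")
```

For the Stable Diffusion XL row this gives about 6.74GB at 16-bit against the listed 6.94GB; the gap is plausibly the autoencoder and metadata, which neither parameter column counts.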