# Stable Diffusion 3.x
StabilityAI's Stable Diffusion 3 family consists of:
- Stable Diffusion 3.0 Medium
- Stable Diffusion 3.5 Medium
- Stable Diffusion 3.5 Large
- Stable Diffusion 3.5 Large Turbo
> [!IMPORTANT]
> Allow gated access
> This is a gated model; you must accept its terms and conditions to use it.
> For more information, see the Gated Access Wiki.
> [!IMPORTANT]
> Set offloading
> Set an appropriate offloading mode before loading the model to avoid out-of-memory errors.
> For more information, see the Offloading Wiki.
> [!IMPORTANT]
> Choose quantization
> Check the compatibility of different quantizations with your platform and GPU!
> For more information, see the Quantization Wiki.
> [!WARNING]
> Regardless of offloading settings, the full model must load into RAM before use. Check the total model size in the table and make sure your system has enough RAM.
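As a rough sanity check, you can ballpark the total load size by summing component sizes. The figures below are illustrative assumptions for an fp16 SD3.5 Large build, not authoritative values:

```python
# Rough RAM estimate for loading a full SD3.x model.
# Component sizes are illustrative fp16 assumptions, not exact figures.
SD35_LARGE_FP16_GB = {
    "transformer (MMDiT)": 16.0,
    "text encoder 1 (CLIP-ViT/L)": 0.25,
    "text encoder 2 (OpenCLIP-ViT/G)": 1.4,
    "text encoder 3 (T5-XXL)": 9.5,
    "vae": 0.2,
}

def total_load_gb(components: dict, overhead: float = 1.2) -> float:
    """Sum component sizes plus a fudge factor for framework overhead."""
    return round(sum(components.values()) * overhead, 1)

if __name__ == "__main__":
    print(f"estimated RAM needed: ~{total_load_gb(SD35_LARGE_FP16_GB)} GB")
```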
> [!TIP]
> Use reference models
> Use of reference models is recommended over manually downloaded models.
> Simply select one from Networks -> Models -> Reference and the model will be auto-downloaded on first use.
## Components

An SD3.x model consists of:
- Unet/Transformer: MMDiT
- Text encoder 1: CLIP-ViT/L
- Text encoder 2: OpenCLIP-ViT/G
- Text encoder 3: T5-XXL version 1.1
- VAE
When using reference models, all components are loaded as needed.
If using a manually downloaded model, ensure all components are correctly configured and available.
Note that most available downloads are not all-in-one models; they are individual components of the full model.
> [!IMPORTANT]
> Do not attempt to assemble a full model by loading all individual components.
> That may be how some other apps are designed to work, but it is not how SD.Next works.
> Always load a full model first, then replace individual components as needed.
> [!WARNING]
> If you get the error message `file=xxx is not a complete model` during model load,
> it means exactly that: you are trying to load a model component instead of a full model.
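What the loader is checking can be sketched as a key-prefix test on the checkpoint's tensor names. The prefixes below are hypothetical stand-ins for illustration, not SD.Next's actual key names:

```python
# Sketch: a "complete model" carries tensors for every component, while a
# component file only carries its own keys. Prefixes here are hypothetical
# examples, not the actual key names SD.Next checks.
REQUIRED_PREFIXES = ("model.diffusion_model.", "text_encoders.", "first_stage_model.")

def is_full_model(tensor_keys: list) -> bool:
    """True only if keys from every required component are present."""
    return all(any(k.startswith(p) for k in tensor_keys) for p in REQUIRED_PREFIXES)

unet_only = ["model.diffusion_model.joint_blocks.0.attn.qkv.weight"]
full_model = unet_only + [
    "text_encoders.t5xxl.shared.weight",
    "first_stage_model.decoder.conv_in.weight",
]
print(is_full_model(unet_only))   # a lone transformer component is rejected
print(is_full_model(full_model))  # all components present
```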
> [!TIP]
> For convenience, you can add quick access to model component settings to your quicksettings
> via Settings -> User Interface -> Quicksettings list -> sd_unet, sd_vae, sd_text_encoder.
## Fine-tunes

### Diffusers

N/A: currently there are no known Diffusers fine-tunes of the SD3.0 or SD3.5 models.
### LoRAs

SD.Next includes support for SD3 LoRAs.
Since LoRA keys vary significantly across training tools and LoRA types,
support for additional LoRAs will be added as needed. Please report any non-functional LoRAs.
Also note that LoRA compatibility depends on quantization type:
if you have issues loading a LoRA, try switching your SD3 base model to a different quantization type.
### All-in-one
Since text encoders and VAE are the same across SD3 models, using all-in-one safetensors is not recommended due to large data duplication.
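The duplication cost is easy to quantify: every all-in-one checkpoint repeats the shared text encoders and VAE that component files would store only once. Component sizes below are illustrative fp16 assumptions:

```python
# Illustrative fp16 sizes in GB (assumptions, not exact figures).
SHARED_GB = 0.25 + 1.4 + 9.5 + 0.2   # CLIP-L + OpenCLIP-G + T5-XXL + VAE
TRANSFORMER_GB = 11.0                # one fine-tuned transformer

def disk_usage_gb(num_finetunes: int, all_in_one: bool) -> float:
    """Disk cost of keeping several fine-tunes as all-in-one vs component files."""
    if all_in_one:
        # shared components duplicated inside every checkpoint
        return round(num_finetunes * (TRANSFORMER_GB + SHARED_GB), 2)
    # shared components stored once, only transformers vary
    return round(num_finetunes * TRANSFORMER_GB + SHARED_GB, 2)

print(disk_usage_gb(3, all_in_one=True))   # shared parts stored three times
print(disk_usage_gb(3, all_in_one=False))  # shared parts stored once
```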
### Unet/Transformer

The Unet/Transformer component is a typical model fine-tune and is around 11GB in size.
To load a Unet/Transformer file:
- Download a `safetensors` or `gguf` file from the desired source and place it in the `models/UNET` folder
- Load the model as usual
- Replace the transformer with the one from the desired file using:
  Settings -> Execution & Models -> UNet
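Before placing a downloaded file in `models/UNET`, you can confirm it really is a transformer component (and not a full checkpoint or a text encoder) by listing its tensor names without loading any weights. A safetensors file is just an 8-byte little-endian header length followed by a JSON header, so the standard library is enough; this helper is a sketch, not part of SD.Next:

```python
import json
import struct

def safetensors_keys(path: str) -> list:
    """List tensor names in a .safetensors file by reading only its JSON header."""
    with open(path, "rb") as f:
        header_len = struct.unpack("<Q", f.read(8))[0]  # u64, little-endian
        header = json.loads(f.read(header_len))
    return [k for k in header if k != "__metadata__"]
```

A transformer-only file should show only MMDiT-style keys; if text encoder or VAE keys also appear, you have a full checkpoint and should load it as a model instead.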
### Text Encoder
SD.Next allows changing optional text encoders on-the-fly.
Go to Settings -> Models -> Text encoder and select the desired text encoder.
T5 enhances text rendering and some details, but it is otherwise lightly used and optional.
Loading a lighter T5 can greatly reduce model resource usage, but may not be compatible with all offloading modes.
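To see why a lighter T5 matters, multiply parameter count by bytes per weight. The ~4.7B parameter figure for T5 v1.1 XXL is an approximation, and the precisions listed are examples, not a list of supported variants:

```python
# Approximate memory footprint of the T5-XXL text encoder at different precisions.
# 4.7B parameters is the commonly cited size for T5 v1.1 XXL (approximation).
T5_XXL_PARAMS = 4.7e9

def footprint_gb(params: float, bytes_per_param: float) -> float:
    """Parameter count times storage width, in GiB."""
    return round(params * bytes_per_param / 1024**3, 1)

for name, bpp in [("fp16", 2), ("fp8", 1), ("int4 (approx)", 0.5)]:
    print(f"T5-XXL @ {name}: ~{footprint_gb(T5_XXL_PARAMS, bpp)} GB")
```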
### VAE
SD.Next allows changing the VAE used by SD3 on-the-fly.
There are no alternative VAE models released, so this setting is mostly for future use.
> [!TIP]
> To enable image previews during generation, set Settings -> Live Preview -> Method to TAESD.
> To further speed up generation, you can disable "full quality", which uses TAESD instead of the full VAE to decode the final image.
## Scheduler

Due to the specifics of flow-matching methods, the number of steps also strongly influences image composition, not just how finely the image is resolved.
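As a sketch of why step count matters here, the shifted flow-matching sigma schedule described in the SD3 paper can be computed directly. The shift value of 3.0 and the linear base schedule are assumptions for illustration:

```python
# Sketch: shifted flow-matching sigma schedule (per the SD3 paper's timestep
# shift; shift=3.0 is an assumed illustrative value). With few steps, early
# sigmas are spaced far apart, so each step moves the latent a long way along
# the flow -- which is why step count changes composition, not just detail.
def sigma_schedule(num_steps: int, shift: float = 3.0) -> list:
    # linear sigmas from 1.0 down toward 0, then apply the timestep shift
    lin = [1.0 - i / num_steps for i in range(num_steps)]
    return [round(shift * s / (1 + (shift - 1) * s), 3) for s in lin]

print(sigma_schedule(5))   # large jumps between consecutive sigmas
print(sigma_schedule(20))  # much finer spacing at the same shift
```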