# Stable Diffusion 3.x
StabilityAI's Stable Diffusion 3 family consists of:
- Stable Diffusion 3.0 Medium
- Stable Diffusion 3.5 Medium
- Stable Diffusion 3.5 Large
- Stable Diffusion 3.5 Large Turbo
> [!IMPORTANT]
> Allow gated access
> This is a gated model; you must accept its terms and conditions to use it.
> For more information, see the Gated Access Wiki.
> [!IMPORTANT]
> Set offloading
> Set an appropriate offloading mode before loading the model to avoid out-of-memory errors.
> For more information, see the Offloading Wiki.
> [!IMPORTANT]
> Choose quantization
> Check the compatibility of different quantizations with your platform and GPU!
> For more information, see the Quantization Wiki.
> [!WARNING]
> Regardless of offloading settings, the full model must load into RAM before use. Check the total model size in the table and make sure your system has enough RAM.
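As a rough sanity check, you can ballpark the total load size by summing component sizes. The figures below are illustrative assumptions for an fp16 SD3.5 Large build, not authoritative values:

```python
# Rough RAM estimate for loading a full SD3.x model.
# Component sizes are illustrative fp16 assumptions, not exact figures.
SD35_LARGE_FP16_GB = {
    "transformer (MMDiT)": 16.0,
    "text encoder 1 (CLIP-ViT/L)": 0.25,
    "text encoder 2 (OpenCLIP-ViT/G)": 1.4,
    "text encoder 3 (T5-XXL)": 9.5,
    "vae": 0.2,
}

def total_load_gb(components: dict, overhead: float = 1.2) -> float:
    """Sum component sizes plus a fudge factor for framework overhead."""
    return round(sum(components.values()) * overhead, 1)

if __name__ == "__main__":
    print(f"estimated RAM needed: ~{total_load_gb(SD35_LARGE_FP16_GB)} GB")
```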
> [!TIP]
> Use reference models
> Use of reference models is recommended over manually downloaded models.
> Simply select one from Networks -> Models -> Reference and the model will be auto-downloaded on first use.
## Components

An SD3.x model consists of:
- Unet/Transformer: MMDiT
- Text encoder 1: CLIP-ViT/L
- Text encoder 2: OpenCLIP-ViT/G
- Text encoder 3: T5-XXL version 1.1
- VAE
When using reference models, all components are loaded as needed.
If using a manually downloaded model, ensure all components are correctly configured and available.
Note that most available downloads are not all-in-one models; they are individual components of the full model.
> [!IMPORTANT]
> Do not attempt to assemble a full model by loading all individual components.
> That may be how some other apps are designed to work, but it is not how SD.Next works.
> Always load a full model first, then replace individual components as needed.
> [!WARNING]
> If you get the error message `file=xxx is not a complete model` during model load,
> it means exactly that: you are trying to load a model component instead of a full model.
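What the loader is checking can be sketched as a key-prefix test on the checkpoint's tensor names. The prefixes below are hypothetical stand-ins for illustration, not SD.Next's actual key names:

```python
# Sketch: a "complete model" carries tensors for every component, while a
# component file only carries its own keys. Prefixes here are hypothetical
# examples, not the actual key names SD.Next checks.
REQUIRED_PREFIXES = ("model.diffusion_model.", "text_encoders.", "first_stage_model.")

def is_full_model(tensor_keys: list) -> bool:
    """True only if keys from every required component are present."""
    return all(any(k.startswith(p) for k in tensor_keys) for p in REQUIRED_PREFIXES)

unet_only = ["model.diffusion_model.joint_blocks.0.attn.qkv.weight"]
full_model = unet_only + [
    "text_encoders.t5xxl.shared.weight",
    "first_stage_model.decoder.conv_in.weight",
]
print(is_full_model(unet_only))   # a lone transformer component is rejected
print(is_full_model(full_model))  # all components present
```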
> [!TIP]
> For convenience, you can add quick access to model component settings to your quicksettings
> via Settings -> User Interface -> Quicksettings list -> sd_unet, sd_vae, sd_text_encoder.
## Fine-tunes

### Diffusers

N/A: currently there are no known Diffusers fine-tunes of the SD3.0 or SD3.5 models.
### LoRAs

SD.Next includes support for SD3 LoRAs.
Since LoRA keys vary significantly across training tools and LoRA types,
support for additional LoRAs will be added as needed. Please report any non-functional LoRAs.
Also note that LoRA compatibility depends on quantization type:
if you have issues loading a LoRA, try switching your SD3 base model to a different quantization type.
### All-in-one
Since text encoders and VAE are the same across SD3 models, using all-in-one safetensors is not recommended due to large data duplication.
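The duplication cost is easy to quantify: every all-in-one checkpoint repeats the shared text encoders and VAE that component files would store only once. Component sizes below are illustrative fp16 assumptions:

```python
# Illustrative fp16 sizes in GB (assumptions, not exact figures).
SHARED_GB = 0.25 + 1.4 + 9.5 + 0.2   # CLIP-L + OpenCLIP-G + T5-XXL + VAE
TRANSFORMER_GB = 11.0                # one fine-tuned transformer

def disk_usage_gb(num_finetunes: int, all_in_one: bool) -> float:
    """Disk cost of keeping several fine-tunes as all-in-one vs component files."""
    if all_in_one:
        # shared components duplicated inside every checkpoint
        return round(num_finetunes * (TRANSFORMER_GB + SHARED_GB), 2)
    # shared components stored once, only transformers vary
    return round(num_finetunes * TRANSFORMER_GB + SHARED_GB, 2)

print(disk_usage_gb(3, all_in_one=True))   # shared parts stored three times
print(disk_usage_gb(3, all_in_one=False))  # shared parts stored once
```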
### Unet/Transformer

The Unet/Transformer component is a typical model fine-tune and is around 11GB in size.
To load a Unet/Transformer file:
- Download a `safetensors` or `gguf` file from the desired source and place it in the `models/UNET` folder
- Load the model as usual
- Replace the transformer with the one from the desired file using:
  Settings -> Execution & Models -> UNet
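Before placing a downloaded file in `models/UNET`, you can confirm it really is a transformer component (and not a full checkpoint or a text encoder) by listing its tensor names without loading any weights. A safetensors file is just an 8-byte little-endian header length followed by a JSON header, so the standard library is enough; this helper is a sketch, not part of SD.Next:

```python
import json
import struct

def safetensors_keys(path: str) -> list:
    """List tensor names in a .safetensors file by reading only its JSON header."""
    with open(path, "rb") as f:
        header_len = struct.unpack("<Q", f.read(8))[0]  # u64, little-endian
        header = json.loads(f.read(header_len))
    return [k for k in header if k != "__metadata__"]
```

A transformer-only file should show only MMDiT-style keys; if text encoder or VAE keys also appear, you have a full checkpoint and should load it as a model instead.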
### Text Encoder
SD.Next allows changing optional text encoders on-the-fly.
Go to Settings -> Models -> Text encoder and select the desired text encoder.
T5 enhances text rendering and some details, but it is otherwise lightly used and optional.
Loading a lighter T5 can greatly reduce model resource usage, but may not be compatible with all offloading modes.
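To see why a lighter T5 matters, multiply parameter count by bytes per weight. The ~4.7B parameter figure for T5 v1.1 XXL is an approximation, and the precisions listed are examples, not a list of supported variants:

```python
# Approximate memory footprint of the T5-XXL text encoder at different precisions.
# 4.7B parameters is the commonly cited size for T5 v1.1 XXL (approximation).
T5_XXL_PARAMS = 4.7e9

def footprint_gb(params: float, bytes_per_param: float) -> float:
    """Parameter count times storage width, in GiB."""
    return round(params * bytes_per_param / 1024**3, 1)

for name, bpp in [("fp16", 2), ("fp8", 1), ("int4 (approx)", 0.5)]:
    print(f"T5-XXL @ {name}: ~{footprint_gb(T5_XXL_PARAMS, bpp)} GB")
```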
### VAE
SD.Next allows changing the VAE used by SD3 on-the-fly.
There are no alternative VAE models released, so this setting is mostly for future use.
> [!TIP]
> To enable image previews during generation, set Settings -> Live Preview -> Method to TAESD.
> To further speed up generation, you can disable "full quality", which uses TAESD instead of the full VAE to decode the final image.
## Scheduler

Due to the specifics of flow-matching methods, the number of steps also strongly influences image composition, not just how finely the image is resolved.
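As a sketch of why step count matters here, the shifted flow-matching sigma schedule described in the SD3 paper can be computed directly. The shift value of 3.0 and the linear base schedule are assumptions for illustration:

```python
# Sketch: shifted flow-matching sigma schedule (per the SD3 paper's timestep
# shift; shift=3.0 is an assumed illustrative value). With few steps, early
# sigmas are spaced far apart, so each step moves the latent a long way along
# the flow -- which is why step count changes composition, not just detail.
def sigma_schedule(num_steps: int, shift: float = 3.0) -> list:
    # linear sigmas from 1.0 down toward 0, then apply the timestep shift
    lin = [1.0 - i / num_steps for i in range(num_steps)]
    return [round(shift * s / (1 + (shift - 1) * s), 3) for s in lin]

print(sigma_schedule(5))   # large jumps between consecutive sigmas
print(sigma_schedule(20))  # much finer spacing at the same shift
```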