Black Forest Labs FLUX.1

The FLUX.1 family has three variants:
- Pro
Model weights are not released; the model is available only via Black Forest Labs
- Dev
Open-weight, guidance-distilled from the Pro variant, available for non-commercial applications
- Schnell
Open-weight, timestep-distilled from the Dev variant, available under the Apache 2.0 license

Important

Allow gated access
This is a gated model: you must accept its terms and conditions before you can use it
For more information see Gated Access Wiki

Important

Set offloading
Set an appropriate offloading mode before loading the model to avoid out-of-memory errors
For more information see Offloading Wiki

Important

Choose quantization
Check compatibility of different quantizations with your platform and GPU!
For more information see Quantization Wiki

Tip

Use reference models
Using reference models is recommended over manually downloaded models!
Simply select one from Networks -> Models -> Reference
and the model will be auto-downloaded on first use

Important

Do not attempt to assemble a full model by loading all individual components
That may be how some other apps are designed to work, but it is not how SD.Next works.
Always load a full model and then replace individual components as needed

Warning

If you get this error message during model load: file=xxx is not a complete model
It means exactly that: you're trying to load a model component instead of a full model
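
To check what a downloaded file contains before loading it, you can read its safetensors header, which lists every tensor name without loading any weights. The sketch below is a minimal standalone example and is not part of SD.Next: the function names are hypothetical, and which name prefixes correspond to the transformer, text encoders, or VAE depends on how the file was exported, so treat the grouping only as a hint.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file without loading tensors."""
    with open(path, "rb") as f:
        # The file starts with an 8-byte little-endian unsigned header length,
        # followed by that many bytes of UTF-8 JSON describing each tensor.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def summarize_prefixes(path):
    """Count tensors by the first dotted segment of their name."""
    header = read_safetensors_header(path)
    return Counter(name.split(".", 1)[0] for name in header)


# Example usage (hypothetical path):
# for prefix, count in summarize_prefixes("models/UNET/example.safetensors").most_common():
#     print(prefix, count)
```

A file whose tensor names all share one family of prefixes is most likely a single component, which is the case the error message above describes.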

Components

FLUX.1 models include:
- UNet/Transformer: MMDiT
- Text encoder 1: CLIP-ViT/L
- Text encoder 2: T5-XXL Version 1.1
- VAE

When using reference models, components load automatically as needed. If you use manually downloaded models, make sure all required components are configured and available. Most available downloads are not all-in-one models; they are usually individual components.

Tip

For convenience, you can add the settings for quickly replacing model components
to your quicksettings:
Settings -> User Interface -> Quicksettings list -> sd_model_checkpoint, sd_unet, sd_vae, sd_text_encoder

Fine-tunes

Diffusers

Many unofficial FLUX.1 variants are already available. Any Diffusers-based variant can be downloaded in SD.Next from Models -> Huggingface -> Download. Example: a Dev/Schnell merge by sayakpaul: sayakpaul/FLUX.1-merged

LoRAs

SD.Next includes support for FLUX.1 LoRAs

LoRA key formats vary across training tools and LoRA types. Additional compatibility support is added as needed, so please report non-working LoRAs.

LoRA compatibility also depends on quantization type. If loading fails, try a different FLUX.1 base model quantization.

All-in-one

Typical all-in-one safetensors files are over 20 GB and include transformer, both text encoders, and VAE. Since text encoders and VAE are shared across FLUX.1 variants, all-in-one safetensors are generally not recommended due to duplicated data.

Unet/Transformer

UNet/Transformer fine-tunes for FLUX.1 are typically around 11 GB.
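
The file sizes quoted on this page follow roughly from parameter count times bytes per weight. The sketch below is a back-of-the-envelope estimate only. It assumes about 12 billion transformer parameters (a publicly quoted approximate figure for FLUX.1, not stated on this page) and ignores file metadata, text encoders, and VAE; the function name is hypothetical.

```python
# Assumption: approximate FLUX.1 transformer parameter count (not from this page).
ASSUMED_PARAMS = 12e9

# Bits per weight for a few common storage formats.
BITS_PER_WEIGHT = {"16-bit (bf16/fp16)": 16, "8-bit (fp8/int8)": 8, "4-bit (nf4/int4)": 4}


def weight_size_gb(params, bits):
    """Approximate size of the weights alone, in decimal gigabytes."""
    return params * bits / 8 / 1e9


for label, bits in BITS_PER_WEIGHT.items():
    print(f"{label}: ~{weight_size_gb(ASSUMED_PARAMS, bits):.0f} GB")
```

Under that assumption, 8-bit weights come to about 12 GB, in the same range as the ~11 GB fine-tunes mentioned above, and 16-bit weights come to about 24 GB, close to the ~23.8GB listed for the Flux Tools models. Memory use at runtime is higher, since text encoders, VAE, and activations also need memory.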

To load a UNet/Transformer safetensors file:
1. Download the safetensors or gguf file from the desired source and place it in the models/UNET folder
example: FastFlux Unchained
2. Load a FLUX.1 model as usual
3. Replace the transformer with the one from the downloaded file using:
Settings -> Execution & Models -> UNet

Text Encoder

SD.Next allows changing the optional text encoder on-the-fly.

Go to Settings -> Models -> Text encoder and select the desired text encoder.
T5 can improve text rendering and some fine details, but it is optional. Using a lighter T5 can reduce resource usage, but may not be compatible with all offloading modes.

Tip

To use prompt attention syntax with FLUX.1, set Settings -> Execution -> Prompt attention to xhinker

Example image with different encoder quantization options

VAE

SD.Next allows changing the VAE used by FLUX.1 on-the-fly. There are currently no alternative VAE releases, so this setting is mostly future-facing.

Tip

To enable image previews during generation, set Settings -> Live Preview -> Method to TAESD
To further speed up generation, you can disable "full quality", which uses TAESD instead of the full VAE to decode the final image

Scheduler

With flow-matching methods, step count strongly affects image composition, not only final refinement quality.

Example image at different steps

You can also tune the sampler with the shift parameter, which roughly controls how much of the schedule the model spends on overall composition versus refining detail.

Example image with different sampler shift values
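
As a rough illustration of what shift does, the sketch below applies the static shift formula used by common flow-matching schedulers, sigma' = shift * sigma / (1 + (shift - 1) * sigma), to a linear noise schedule. This is an assumption about the general technique and not a copy of SD.Next's sampler code, which may use a different or dynamic shift; the function names are hypothetical.

```python
def linear_sigmas(steps):
    """Linear noise levels from 1.0 down to 1/steps (one value per step)."""
    return [1.0 - i / steps for i in range(steps)]


def shift_sigmas(sigmas, shift):
    """Apply a static flow-matching shift to each noise level.

    shift == 1.0 leaves the schedule unchanged; larger values keep the
    noise level high for more of the steps.
    """
    return [shift * s / (1.0 + (shift - 1.0) * s) for s in sigmas]


if __name__ == "__main__":
    base = linear_sigmas(4)
    for shift in (1.0, 3.0):
        shifted = shift_sigmas(base, shift)
        print(shift, [round(s, 3) for s in shifted])
```

With 4 steps, a shift of 3 moves the noise levels from 1.0, 0.75, 0.5, 0.25 to 1.0, 0.9, 0.75, 0.5, so more of the step budget is spent at high noise, where overall composition is decided.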

ControlNet

SD.Next supports all InstantX/Shakker-Labs ControlNet models, including Union-Pro

FLUX.1 ControlNets are large, at over 6GB, and that comes on top of the already very large FLUX.1 model.
As a result, you may need offloading:sequential, which is slower but uses much less memory.

When using the Union model, also select a control mode in the Control unit.

Flux Tools

Link to Flux Tools announcement
- Redux is an actual tool: it works together with the currently loaded model instead of replacing it
- Fill is inpaint/outpaint optimized version of Flux-dev
- Canny/Depth are optimized Flux-dev variants for their tasks: they are not ControlNets that run on top of another model

To use them, open the image or control interface and select Flux Tools in scripts.
All models are auto-downloaded on first use.
note: All models are gated and require acceptance of terms and conditions via web page
recommended: Enable on-the-fly quantization to reduce resource usage
- Redux: ~0.1GB
works together with the existing model: it analyzes the input image and uses the result instead of the prompt
recommended: low denoise strength levels result in more variety
- Fill: ~23.8GB, replaces currently loaded model
note: can be used in inpaint/outpaint mode only
- Canny: ~23.8GB, replaces currently loaded model
recommended: guidance scale 30
- Depth: ~23.8GB, replaces currently loaded model
recommended: guidance scale 10

Notes

Performance

Performance and memory usage of FLUX.1 with different dtype (quantization) and offload settings:

dtype	time (sec)	performance	memory	offload	note
bf16			>32 GB	none	*1
bf16	50.47	0.40 it/s		balanced	*2
bf16	94.28	0.21 it/s	1.89 GB	sequential
nf4	14.69	1.36 it/s	17.92 GB	none
nf4	21.02	0.95 it/s		balanced	*2
nf4				sequential	err
qint8	15.42	1.30 it/s	18.85 GB	none
qint8				balanced	err
qint8				sequential	err
qint4	18.37	1.09 it/s	11.38 GB	none
qint4				balanced	err
qint4				sequential	err

Notes:
- *1: Memory usage exceeds 32 GB and is not recommended.
- *2: Balanced offload VRAM usage is not included because it depends on the selected threshold.
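
The table does not state how many steps each run used. As a quick sanity check, the sketch below multiplies time by iteration rate for every row that reports both, and compares each run with the fastest one. The values are copied from the table; the step count it arrives at is an inference, not a documented benchmark setting, and the function names are hypothetical.

```python
# (dtype, offload, time in seconds, iterations per second), copied from the table.
ROWS = [
    ("bf16", "balanced", 50.47, 0.40),
    ("bf16", "sequential", 94.28, 0.21),
    ("nf4", "none", 14.69, 1.36),
    ("nf4", "balanced", 21.02, 0.95),
    ("qint8", "none", 15.42, 1.30),
    ("qint4", "none", 18.37, 1.09),
]


def implied_steps(time_sec, it_per_sec):
    """Number of iterations implied by total time and iteration rate."""
    return time_sec * it_per_sec


def slowdown_vs_fastest(rows):
    """Map (dtype, offload) to run time relative to the fastest row."""
    fastest = min(t for _, _, t, _ in rows)
    return {(d, o): t / fastest for d, o, t, _ in rows}


if __name__ == "__main__":
    ratios = slowdown_vs_fastest(ROWS)
    for dtype, offload, t, speed in ROWS:
        steps = implied_steps(t, speed)
        print(f"{dtype:6s} {offload:10s} ~{steps:.1f} steps, {ratios[(dtype, offload)]:.2f}x time")
```

Every row works out to roughly 20 iterations, which suggests the runs used the same step count and are directly comparable.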