Video

SD.Next supports video creation using the top-level Video tab.
Supported modes include T2V (text-to-video) and I2V (image-to-video); some models also support FLF2V (first-last-frame-to-video) and V2V (video-to-video).

Tip

The latest video models use LLMs to encode prompts and therefore require very long, descriptive prompts.

Supported models

SD.Next supports the following models out of the box:

Hunyuan: HunyuanVideo, FastHunyuan, SkyReels | T2V, I2V
WAN21: 1.3B, 14B | T2V, I2V, FLF2V
LTXVideo: 0.9.0, 0.9.1, 0.9.5, 0.9.6, 0.9.8, 2.0, 2.3 | T2V, I2V
CogVideoX: 2B, 5B | T2V, I2V
Allegro: T2V
Mochi1: T2V
Latte1: T2V
FramePack: I2V, FLF2V

Note

All models are auto-downloaded on first use.
Download location is defined in: Settings -> System paths -> Huggingface.

[!NOTE] Optimized support for FramePack, which is based on HunyuanVideo-I2V, is implemented in a separate tab.
FramePack supports high-quality generation with near-unlimited duration and limited VRAM.

[!NOTE] Support for LTXVideo is implemented in a separate tab.
LTXVideo supports flexible guidance with text, image, and video prompts, with optional upsampling and refining.

Reference list

Engine	Model	Type	Size	Optimal Resolution	Default Sampler	Reference Values	Special Notes	License
Hunyuan	HunyuanVideo	T2V	40.9GB	1280x720	Euler FlowMatch	Frames:129 CFG:6.0 Steps:50		Proprietary
Hunyuan	HunyuanVideo	I2V	59.2GB	1280x720	Euler FlowMatch	Frames:129 CFG:1.0 Steps:50		Proprietary
HunyuanVideo	FramePack	T2V/I2V/FLF2V	25.0GB+15GB	608x640	UniPC FlowMatch	Frames:73 Steps:25
Hunyuan	FastHunyuan	T2V	25.0GB+15GB	1280x720	Euler FlowMatch	Frames:125 CFG:6.0 True:1.0 Shift:17 Steps:6
Hunyuan	SkyReels v1	T2V	25.0GB+15GB	960x544	Euler FlowMatch	Frames:97 CFG:1.0 True:6.0 Steps:50
Hunyuan	SkyReels v1	I2V	25.0GB+15GB	960x544	Euler FlowMatch	Frames:97 CFG:1.0 True:6.0 Steps:50
WAN21	WAN 2.1 1.3B	T2V	28.2GB	832x480	UniPC	Frames:81 CFG:5.0 Steps:50		Apache 2.0
WAN21	WAN 2.1 14B	T2V	78.1GB	1280x720	UniPC	Frames:81 CFG:5.0 Steps:50		Apache 2.0
WAN21	WAN 2.1 14B 480p	I2V		832x480	UniPC	Frames:81 CFG:5.0 Steps:50		Apache 2.0
WAN21	WAN 2.1 14B 720p	I2V		1280x720	UniPC	Frames:81 CFG:5.0 Steps:50		Apache 2.0
WAN21	WAN 2.1 14B 720p	FLF2V		1280x720	UniPC	Frames:81 CFG:5.0 Steps:50		Apache 2.0
WAN21	WAN 2.2 5B	T2V/I2V		1280x720	UniPC	Frames:81 CFG:5.0 Steps:50		Apache 2.0
WAN21	WAN 2.2 A14B	T2V/I2V		1280x720	UniPC	Frames:81 CFG:5.0 Steps:50		Apache 2.0
WAN21	WAN 2.2 14B VACE	T2V/I2V		1280x720	UniPC	Frames:81 CFG:5.0 Steps:50		Apache 2.0
LTXVideo	LTXVideo 0.9.0	T2V		704x480	Euler FlowMatch	Frames:161 Steps:50		Proprietary
LTXVideo	LTXVideo 0.9.0	I2V		704x480	Euler FlowMatch	Frames:161 Steps:50		Proprietary
LTXVideo	LTXVideo 0.9.1	T2V	24.1GB	704x512	Euler FlowMatch	Frames:161 CFG:3 Steps:50		Proprietary
LTXVideo	LTXVideo 0.9.1	I2V	24.1GB	704x512	Euler FlowMatch	Frames:161 CFG:3 Steps:50		Proprietary
LTXVideo	LTXVideo 0.9.5	T2V	24.8GB	768x512	Euler FlowMatch	Frames:161 Steps:40		Proprietary
LTXVideo	LTXVideo 0.9.5	I2V		768x512	Euler FlowMatch	Frames:161 Steps:40		Proprietary
LTXVideo	LTXVideo 0.9.6 2B	T2V		768x512	Euler FlowMatch	Frames:161 Steps:50		Proprietary
LTXVideo	LTXVideo 0.9.6 2B Distilled	T2V		768x512	Euler FlowMatch	Frames:161 Steps:8		Proprietary
LTXVideo	LTXVideo 0.9.7 13B	T2V/I2V/V2V	46.3GB	768x512	Euler FlowMatch	Frames:161 Steps:50		Proprietary
LTXVideo	LTXVideo 0.9.8 13B	T2V/I2V/V2V	46.3GB	768x512	Euler FlowMatch	Frames:161 Steps:50		Proprietary
LTXVideo	LTXVideo 2.0 19B	T2V/I2V/V2V	63.0GB	768x512	Euler FlowMatch	Frames:161 Steps:50		Proprietary
LTXVideo	LTXVideo 2.3 22B	T2V/I2V/V2V	71.7GB	768x512	Euler FlowMatch	Frames:161 Steps:50		Proprietary
CogVideoX	CogVideoX 1.0 2B	T2V		720x480	Cog DDIM	Frames:49 CFG:6.0 Steps:50		Apache 2.0
CogVideoX	CogVideoX 1.0 5B	T2V		720x480	Cog DDIM	Frames:49 CFG:6.0 Steps:50		Proprietary
CogVideoX	CogVideoX 1.0 5B	I2V		720x480	Cog DDIM	Frames:49 CFG:6.0 Steps:50		Proprietary
CogVideoX	CogVideoX 1.5 5B	T2V	30.3GB	1360x768	Cog DDIM	Frames:81 CFG:6.0 Steps:50	Issue: blank output	Proprietary
CogVideoX	CogVideoX 1.5 5B	I2V		1360x768	Cog DDIM	Frames:81 CFG:6.0 Steps:50	Issue: blank output	Proprietary
Allegro	Allegro	T2V	24.7GB	1280x720	Euler a	Frames:88 CFG:7.5 Steps:100	Issue: blank output	Apache 2.0
Mochi	Mochi1	T2V	23.4GB	512x512	Euler FlowMatch	Frames:16 CFG:7.5 Steps:50		Apache 2.0
Latte	Latte1	T2V	23.4GB	512x512	DDIM	Frames:16 CFG:7.5 Steps:50		Apache 2.0
Kandinsky	Kandinsky 5 Lite	T2V	23.4GB	768x512	Euler FlowMatch	Frames:121 Steps:50 CFG:5.0		Apache 2.0
Kandinsky	Kandinsky 5 Lite CFG-Distilled	T2V	23.4GB	768x512	Euler FlowMatch	Frames:121 Steps:50 CFG:1.0		Apache 2.0
Kandinsky	Kandinsky 5 Lite Steps-Distilled	T2V	23.4GB	768x512	Euler FlowMatch	Frames:121 Steps:16 CFG:1.0		Apache 2.0

Tip

Each model may require specific resolutions and parameters for best results.
This includes advanced parameters such as Sampler shift, which are usually not critical for text-to-image.
See each model's original notes for recommended settings.
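
The Reference Values column in the table above packs several settings into one string, which is awkward to compare or reuse. The following is a minimal sketch of how such a cell could be split into named numbers. The function name `parse_reference_values` is hypothetical and not part of SD.Next; it only assumes the `Key:value` layout seen in the table and also tolerates `Key=value`.

```python
import re

# Matches "Key:value" or "Key=value" pairs such as "CFG:6.0" or "Steps=100".
_PAIR = re.compile(r"(\w+)\s*[:=]\s*([0-9]+(?:\.[0-9]+)?)")


def parse_reference_values(text):
    """Parse a Reference Values cell into a {name: number} mapping.

    Hypothetical helper for illustration only; not an SD.Next API.
    """
    values = {}
    for key, raw in _PAIR.findall(text):
        # Keep whole numbers as int (frames, steps) and decimals as float (CFG).
        values[key.lower()] = float(raw) if "." in raw else int(raw)
    return values


# Example using the HunyuanVideo T2V row from the table.
settings = parse_reference_values("Frames:129 CFG:6.0 Steps:50")
print(settings["frames"], settings["cfg"], settings["steps"])
```

Keys are lowercased so that lookups do not depend on the capitalization used in the table.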

[!NOTE] It is recommended to use Default sampler unless you need a model-specific setting.
For example, to change Sampler Shift, select the matching sampler for that model.

Legacy models

Additional video models are available as individually selectable scripts in text or image interfaces.

Stable Video Diffusion: Base, XT 1.0 and XT 1.1
VGen
AnimateDiff

LoRA

SD.Next includes LoRA support for Hunyuan, LTX, WAN, Mochi, and Cog.

See LoRA for more details.

Optimizations

Warning

Any use on GPUs with less than 16GB of VRAM or on systems with less than 48GB of RAM is experimental.

Memory

Offloading helps by moving data between system RAM and GPU VRAM as needed.
However, the full model still must load into RAM before use.
Check total model size in the table and confirm your system has enough RAM.
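
As a rough illustration of that check, the sketch below compares a Size cell from the reference table against available system RAM. It is a sketch under stated assumptions, not SD.Next code: the helper names are hypothetical, a cell such as `25.0GB+15GB` is assumed to mean the sum of both parts, and the `headroom_gb` default is an arbitrary allowance for the OS and other processes.

```python
def parse_size_gb(cell):
    """Sum a Size cell such as '40.9GB' or '25.0GB+15GB' into gigabytes.

    Assumption: parts joined by '+' are added together.
    """
    if not cell.strip():
        raise ValueError("size is not listed for this model")
    total = 0.0
    for part in cell.split("+"):
        total += float(part.strip().upper().removesuffix("GB"))
    return total


def fits_in_ram(size_cell, ram_gb, headroom_gb=8.0):
    """Return True if the model plus a safety margin fits in system RAM.

    headroom_gb is a hypothetical margin, not a value taken from SD.Next.
    """
    return parse_size_gb(size_cell) + headroom_gb <= ram_gb


# WAN 2.1 14B T2V is listed at 78.1GB, so a 64GB system falls short.
print(fits_in_ram("78.1GB", ram_gb=64))
# WAN 2.1 1.3B T2V is listed at 28.2GB and leaves room on a 48GB system.
print(fits_in_ram("28.2GB", ram_gb=48))
```

Rows without a listed size raise an error instead of being treated as zero, so an unknown size is never mistaken for a model that fits.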

Offloading

Enable offloading so model components can move in and out of VRAM as needed.
Most models support all offloading modes:

Balanced: recommended, but may require extra tuning
Model: simplest
Sequential: highest memory savings, but slowest

See Offload for more details.
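
To make the trade-off concrete, here is a toy estimate of peak VRAM used by model weights with and without offloading. It is a simplified sketch, not how SD.Next measures memory: it assumes that without offloading all components sit in VRAM together, and that with Model offloading only one component is resident at a time. Activations are ignored, Balanced and Sequential modes are not modeled, and the component names and sizes are hypothetical.

```python
def peak_weight_vram_gb(components, mode):
    """Estimate peak VRAM taken by weights for a given offload mode.

    components: mapping of component name to size in GB (hypothetical values).
    mode: "none" keeps everything resident, "model" keeps one component at a time.
    """
    sizes = list(components.values())
    if mode == "none":
        return sum(sizes)
    if mode == "model":
        return max(sizes)
    raise ValueError(f"mode not modeled in this sketch: {mode}")


# Hypothetical component sizes for illustration only.
pipeline = {"text_encoder": 9.0, "transformer": 24.0, "vae": 1.0}
print(peak_weight_vram_gb(pipeline, "none"))
print(peak_weight_vram_gb(pipeline, "model"))
```

In this toy model the largest single component sets the floor for Model offloading, which is why finer-grained modes can save more memory at the cost of speed.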

Quantization

Enable on-the-fly quantization during load in Settings -> Quantization for additional memory savings.

SDNQ
BnB
Optimum-Quanto
TorchAO

You can enable quantization for Transformers and Text Encoder together or separately.

Most T2V and I2V models support on-the-fly quantization of the transformer module.
Most T2V models support text-encoder quantization, while many I2V models do not because image vectors cannot be quantized in the same way.

See Quantization for more details.
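
For readers unfamiliar with the idea, the sketch below shows symmetric absmax int8 quantization on a plain list of numbers. It is a generic textbook illustration of why quantized weights take less memory, and is not the algorithm used by SDNQ, BnB, Optimum-Quanto, or TorchAO; the function names are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric absmax quantization: map floats to integers in [-127, 127]."""
    # The scale maps the largest magnitude to 127; fall back to 1.0 for all zeros.
    scale = max(abs(w) for w in weights) / 127 or 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integers and the scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Each restored value differs from the original by at most about scale / 2.
print(q, restored)
```

Each value is stored as one small integer plus a shared scale, so an 8-bit copy needs roughly half the memory of 16-bit weights, at the cost of a small rounding error.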

Decoding

Instead of using only the full VAE packaged with the model to decode final frames, SD.Next also supports Tiny VAE and Remote VAE for video decoding.

Tiny VAE: support for Hunyuan, WAN, Mochi
Remote VAE: support for Hunyuan

See VAE for more details.

Processing

SD.Next supports two optional acceleration methods:

FasterCache: support for Hunyuan, Mochi, Latte, Allegro, Cog, WAN, LTX
PyramidAttentionBroadcast: support for Hunyuan, Mochi, Latte, Allegro, Cog, WAN, LTX

Interpolation

All video modules support optional frame interpolation for smoother output.
When enabled, interpolation uses RIFE (Real-Time Intermediate Flow Estimation).
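
As a back-of-the-envelope illustration of what interpolation does to clip length, the sketch below computes the output frame count. It assumes a common scheme in which new frames are inserted only between existing neighbors, so a factor of 2 adds one frame per pair. The helper name and the `factor` parameter are hypothetical, and the options SD.Next actually exposes may differ.

```python
def interpolated_frame_count(frames, factor=2):
    """Frames after interpolation, assuming factor - 1 new frames per neighbor pair."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    if frames < 2:
        # Nothing to interpolate between with fewer than two frames.
        return frames
    return (frames - 1) * factor + 1


# An 81-frame clip interpolated at 2x under this assumption.
print(interpolated_frame_count(81, factor=2))
```

Under this assumption the first and last frames are preserved and only the gaps between them are filled.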

Issues/Limitations

See TODO for known issues and limitations.