Video
Video models are large and resource-intensive
The biggest resource issue is the final decode, since video models are typically designed to decode the entire generated video at once to achieve temporal consistency
To reduce resource requirements, reduce the number of generated frames and/or the resolution
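As a rough illustration of why frame count and resolution matter, the sketch below drives one of the underlying diffusers pipelines directly; SD.Next exposes the equivalent options through its UI, and the model id and values here are illustrative assumptions, not SD.Next defaults:

```python
# Illustrative sketch using the diffusers CogVideoX pipeline (not SD.Next code)
import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-2b", torch_dtype=torch.float16)
pipe.to("cuda")

# Tiled/sliced VAE decode avoids decoding the whole clip in a single pass,
# directly reducing the final-decode memory spike described above
pipe.vae.enable_tiling()
pipe.vae.enable_slicing()

video = pipe(
    prompt="a red panda walking through a bamboo forest, morning light",
    num_frames=25,            # reduced from the model default of 49
    height=480,
    width=720,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=8)
```

Tiled and sliced decode trade a small amount of speed for a much lower memory peak during the final decode step.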
SD.Next support for video models is relatively basic, with further optimizations pending community interest
Any future optimizations would likely have to go into partial loading and execution instead of offloading inactive parts of the model
Warning
Any use on GPUs with less than 16GB VRAM or on systems with less than 48GB RAM is considered experimental
Note
The latest video models use LLMs for prompt processing and as a result require very long and descriptive prompts
Tip
You may need to enable sequential offload for maximum GPU memory savings, or use balanced offload with the min/max watermarks reduced as far as possible
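In diffusers terms, the offload modes correspond roughly to the calls below; this is a sketch only, since SD.Next configures offload through its settings rather than user code, and its balanced offload with watermarks has no direct diffusers equivalent:

```python
import torch
from diffusers import CogVideoXPipeline

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)

# Choose one offload mode; they are mutually exclusive:
pipe.enable_sequential_cpu_offload()    # maximum savings: streams individual layers, much slower
# pipe.enable_model_cpu_offload()       # milder savings: swaps whole submodels between GPU and CPU
```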
Tip
Optionally enable pre-quantization using bitsandbytes (bnb) for additional memory savings
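A minimal sketch of what bnb pre-quantization looks like at the diffusers level (assumes a recent diffusers release with bitsandbytes installed; in SD.Next this is a settings toggle, not user code):

```python
# Hypothetical example: load a video transformer pre-quantized to 4-bit NF4
import torch
from diffusers import BitsAndBytesConfig, CogVideoXTransformer3DModel

quant = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
transformer = CogVideoXTransformer3DModel.from_pretrained(
    "THUDM/CogVideoX-5b",
    subfolder="transformer",
    quantization_config=quant,
    torch_dtype=torch.float16,
)
# the quantized transformer can then be passed to the pipeline via
# CogVideoXPipeline.from_pretrained(..., transformer=transformer)
```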
Supported models
All video models are available as individually selectable scripts in either the text or image interface
- Stable Video Diffusion
  support for base, XT 1.0 and XT 1.1
- CogVideoX
  support for 2B and 5B text-to-video and 5B image-to-video
- Lightricks LTX-Video
  model size: 27.75 GB
  support for text-to-video and image-to-video
  reference values: steps 50, width 704, height 512, frames 161, guidance scale 3.0 (see the sketch after this list)
- Hunyuan Video
  model size: 40.92 GB
  support for text-to-video
  reference values: steps 50, width 1280, height 720, frames 129, guidance scale 6.0
- Genmo Mochi.1 Preview
  model size: 68.87 GB
  support for text-to-video
  reference values: steps 64, width 848, height 480, frames 19, guidance scale 4.5
- VGen
- AnimateDiff
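As an example of applying the reference values, here is a minimal text-to-video sketch for LTX-Video using its diffusers pipeline; the repo id Lightricks/LTX-Video and the fps value are assumptions, while the generation parameters come from the list above:

```python
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # see the offload tip above

video = pipe(
    prompt="a long, highly detailed scene description, as these models expect",
    width=704,
    height=512,
    num_frames=161,
    num_inference_steps=50,
    guidance_scale=3.0,
).frames[0]
export_to_video(video, "ltx.mp4", fps=24)
```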
Interpolation
For all video models, SD.Next supports inserting interpolated frames into the generated video for smoother output
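The principle can be illustrated with a naive linear-blend interpolator; real interpolation models estimate motion between frames and produce far better in-betweens, so this sketch only shows where the extra frames are inserted:

```python
import numpy as np

def interpolate(frames: list[np.ndarray], factor: int = 2) -> list[np.ndarray]:
    """Insert (factor - 1) blended frames between each pair of originals."""
    out: list[np.ndarray] = []
    for a, b in zip(frames, frames[1:]):
        out.append(a)
        for i in range(1, factor):
            t = i / factor  # blend weight for the in-between frame
            blend = (1 - t) * a.astype(np.float32) + t * b.astype(np.float32)
            out.append(blend.astype(np.uint8))
    out.append(frames[-1])
    return out
```

Doubling the frame count this way (factor=2) lets the same clip play back at twice the frame rate for smoother apparent motion.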