Video

SD.Next supports video creation using the top-level Video tab.
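Supported modes include text-to-video (T2V) and image-to-video (I2V).
Some models also support first-last-frame-to-video (FLF2V) and video-to-video (V2V); see the Type column in the reference list.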

Tip

The latest video models use LLMs for prompting and therefore require very long, descriptive prompts.

Supported models

SD.Next supports the following models out of the box:

Note

All models are auto-downloaded on first use.
Download location is defined in: Settings -> System paths -> Huggingface.
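
Because models are downloaded on first use and the sizes in the reference list run to tens of gigabytes, it can help to confirm free disk space beforehand. The following is a minimal sketch using only the Python standard library. `has_free_space` is a hypothetical helper and not part of SD.Next, and the path `"."` is a placeholder for whatever folder is configured in Settings -> System paths -> Huggingface.

```python
import shutil


def has_free_space(path: str, required_gb: float) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gb` free.

    Assumption: 1 GB is treated as 1e9 bytes; the table does not say
    whether its sizes are decimal or binary gigabytes.
    """
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb


# "." is a placeholder for the configured download folder.
# 40.9 is the listed size of HunyuanVideo T2V.
print(has_free_space(".", 40.9))
```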

Note

Optimized support for FramePack, which is based on HunyuanVideo-I2V, is implemented in a separate tab.
FramePack supports high-quality generation with near-unlimited duration and limited VRAM.

Note

Support for LTXVideo is implemented in a separate tab.
LTXVideo supports flexible guidance with text, image, and video prompts, with optional upsampling and refining.

Reference list

| Engine | Model | Type | Size | Optimal Resolution | Default Sampler | Reference Values | Special Notes | License |
|---|---|---|---|---|---|---|---|---|
| Hunyuan | HunyuanVideo | T2V | 40.9GB | 1280x720 | Euler FlowMatch | Frames:129 CFG:6.0 Steps:50 | | Proprietary |
| Hunyuan | HunyuanVideo | I2V | 59.2GB | 1280x720 | Euler FlowMatch | Frames:129 CFG:1.0 Steps:50 | | Proprietary |
| HunyuanVideo | FramePack | T2V/I2V/FLF2V | 25.0GB+15GB | 608x640 | UniPC FlowMatch | Frames:73 Steps:25 | | |
| Hunyuan | FastHunyuan | T2V | 25.0GB+15GB | 1280x720 | Euler FlowMatch | Frames:125 CFG:6.0 True:1.0 Shift:17 Steps:6 | | |
| Hunyuan | SkyReels v1 | T2V | 25.0GB+15GB | 960x544 | Euler FlowMatch | Frames:97 CFG:1.0 True:6.0 Steps:50 | | |
| Hunyuan | SkyReels v1 | I2V | 25.0GB+15GB | 960x544 | Euler FlowMatch | Frames:97 CFG:1.0 True:6.0 Steps:50 | | |
| WAN21 | WAN 2.1 1.3B | T2V | 28.2GB | 832x480 | UniPC | Frames:81 CFG:5.0 Steps:50 | | Apache 2.0 |
| WAN21 | WAN 2.1 14B | T2V | 78.1GB | 1280x720 | UniPC | Frames:81 CFG:5.0 Steps:50 | | Apache 2.0 |
| WAN21 | WAN 2.1 14B 480p | I2V | | 832x480 | UniPC | Frames:81 CFG:5.0 Steps:50 | | Apache 2.0 |
| WAN21 | WAN 2.1 14B 720p | I2V | | 1280x720 | UniPC | Frames:81 CFG:5.0 Steps:50 | | Apache 2.0 |
| WAN21 | WAN 2.1 14B 720p | FLF2V | | 1280x720 | UniPC | Frames:81 CFG:5.0 Steps:50 | | Apache 2.0 |
| WAN21 | WAN 2.2 5B | T2V/I2V | | 1280x720 | UniPC | Frames:81 CFG:5.0 Steps:50 | | Apache 2.0 |
| WAN21 | WAN 2.2 A14B | T2V/I2V | | 1280x720 | UniPC | Frames:81 CFG:5.0 Steps:50 | | Apache 2.0 |
| WAN21 | WAN 2.2 14B VACE | T2V/I2V | | 1280x720 | UniPC | Frames:81 CFG:5.0 Steps:50 | | Apache 2.0 |
| LTXVideo | LTXVideo 0.9.0 | T2V | | 704x480 | Euler FlowMatch | Frames:161 Steps:50 | | Proprietary |
| LTXVideo | LTXVideo 0.9.0 | I2V | | 704x480 | Euler FlowMatch | Frames:161 Steps:50 | | Proprietary |
| LTXVideo | LTXVideo 0.9.1 | T2V | 24.1GB | 704x512 | Euler FlowMatch | Frames:161 CFG:3 Steps:50 | | Proprietary |
| LTXVideo | LTXVideo 0.9.1 | I2V | 24.1GB | 704x512 | Euler FlowMatch | Frames:161 CFG:3 Steps:50 | | Proprietary |
| LTXVideo | LTXVideo 0.9.5 | T2V | 24.8GB | 768x512 | Euler FlowMatch | Frames:161 Steps:40 | | Proprietary |
| LTXVideo | LTXVideo 0.9.5 | I2V | | 768x512 | Euler FlowMatch | Frames:161 Steps:40 | | Proprietary |
| LTXVideo | LTXVideo 0.9.6 2B | T2V | | 768x512 | Euler FlowMatch | Frames:161 Steps:50 | | Proprietary |
| LTXVideo | LTXVideo 0.9.6 2B Distilled | T2V | | 768x512 | Euler FlowMatch | Frames:161 Steps:8 | | Proprietary |
| LTXVideo | LTXVideo 0.9.7 13B | T2V/I2V/V2V | 46.3GB | 768x512 | Euler FlowMatch | Frames:161 Steps:50 | | Proprietary |
| LTXVideo | LTXVideo 0.9.8 13B | T2V/I2V/V2V | 46.3GB | 768x512 | Euler FlowMatch | Frames:161 Steps:50 | | Proprietary |
| LTXVideo | LTXVideo 2.0 19B | T2V/I2V/V2V | 63.0GB | 768x512 | Euler FlowMatch | Frames:161 Steps:50 | | Proprietary |
| LTXVideo | LTXVideo 2.3 22B | T2V/I2V/V2V | 71.7GB | 768x512 | Euler FlowMatch | Frames:161 Steps:50 | | Proprietary |
| CogVideoX | CogVideoX 1.0 2B | T2V | | 720x480 | Cog DDIM | Frames:49 CFG:6.0 Steps:50 | | Apache 2.0 |
| CogVideoX | CogVideoX 1.0 5B | T2V | | 720x480 | Cog DDIM | Frames:49 CFG:6.0 Steps:50 | | Proprietary |
| CogVideoX | CogVideoX 1.0 5B | I2V | | 720x480 | Cog DDIM | Frames:49 CFG:6.0 Steps:50 | | Proprietary |
| CogVideoX | CogVideoX 1.5 5B | T2V | 30.3GB | 1360x768 | Cog DDIM | Frames:81 CFG:6.0 Steps:50 | Issue: blank output | Proprietary |
| CogVideoX | CogVideoX 1.5 5B | I2V | | 1360x768 | Cog DDIM | Frames:81 CFG:6.0 Steps:50 | Issue: blank output | Proprietary |
| Allegro | Allegro | T2V | 24.7GB | 1280x720 | Euler a | Frames:88 CFG:7.5 Steps:100 | Issue: blank output | Apache 2.0 |
| Mochi | Mochi1 | T2V | 23.4GB | 512x512 | Euler FlowMatch | Frames:16 CFG:7.5 Steps:50 | | Apache 2.0 |
| Latte | Latte1 | T2V | 23.4GB | 512x512 | DDIM | Frames:16 CFG:7.5 Steps:50 | | Apache 2.0 |
| Kandinsky | Kandinsky 5 Lite | T2V | 23.4GB | 768x512 | Euler FlowMatch | Frames:121 Steps:50 CFG:5.0 | | Apache 2.0 |
| Kandinsky | Kandinsky 5 Lite CFG-Distilled | T2V | 23.4GB | 768x512 | Euler FlowMatch | Frames:121 Steps:50 CFG:1.0 | | Apache 2.0 |
| Kandinsky | Kandinsky 5 Lite Steps-Distilled | T2V | 23.4GB | 768x512 | Euler FlowMatch | Frames:121 Steps:16 CFG:1.0 | | Apache 2.0 |

Tip

Each model may require specific resolutions and parameters for best results.
This includes advanced parameters such as Sampler shift, which are usually not critical for text-to-image.
See each model's original notes for recommended settings.

Note

Use the Default sampler unless you need a model-specific setting.
For example, to change Sampler Shift, select the matching sampler for that model.
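
The Reference Values column in the reference list uses compact `Key:value` strings. If you keep those values in your own notes or scripts, a small parser makes them easier to reuse. This is only a sketch: `parse_reference_values` is a hypothetical helper, not an SD.Next function, and it assumes every token has the form `Key:number`.

```python
def parse_reference_values(text: str) -> dict[str, float]:
    """Parse a string such as 'Frames:129 CFG:6.0 Steps:50' into a dict of floats."""
    values: dict[str, float] = {}
    for token in text.split():
        key, sep, raw = token.partition(":")
        if sep:  # skip tokens that do not contain a colon
            values[key] = float(raw)
    return values


# Reference values listed for FastHunyuan in the table.
ref = parse_reference_values("Frames:125 CFG:6.0 True:1.0 Shift:17 Steps:6")
print(ref["Shift"], int(ref["Steps"]))
```

All values are returned as floats; integer settings such as Frames and Steps can be converted with `int()` where needed.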

Legacy models

Additional video models are available as individually selectable scripts in text or image interfaces.

LoRA

SD.Next includes LoRA support for Hunyuan, LTXVideo, WAN, Mochi, and CogVideoX.

See LoRA for more details.

Optimizations

Warning

Use on GPUs with less than 16GB of VRAM or on systems with less than 48GB of RAM is experimental.

Memory

Offloading helps by moving data between system RAM and GPU VRAM as needed.
However, the full model still must load into RAM before use.
Check total model size in the table and confirm your system has enough RAM.
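
The check described above can be sketched in a few lines. Both helpers are hypothetical and not part of SD.Next. The sketch assumes that a size written as `25.0GB+15GB` lists components that are summed, and the 8GB headroom for the operating system and other processes is an arbitrary illustrative value.

```python
def total_size_gb(size: str) -> float:
    """Sum a table entry such as '25.0GB+15GB' into a single number of GB."""
    return sum(float(part.removesuffix("GB")) for part in size.split("+"))


def fits_in_ram(size: str, ram_gb: float, headroom_gb: float = 8.0) -> bool:
    """Return True if the model plus headroom fits in `ram_gb` of system RAM.

    Assumption: `headroom_gb` is an illustrative default, not a measured value.
    """
    return total_size_gb(size) + headroom_gb <= ram_gb


print(total_size_gb("25.0GB+15GB"))  # size string as listed for FramePack
print(fits_in_ram("78.1GB", 64.0))   # size as listed for WAN 2.1 14B T2V
```

`str.removesuffix` requires Python 3.9 or later.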

Offloading

Enable offloading so model components can move in and out of VRAM as needed.
Most models support all offloading modes:

  • Balanced: recommended, but may require extra tuning
  • Model: simplest
  • Sequential: highest memory savings, but slowest

See Offload for more details.

Quantization

Enable on-the-fly quantization during load in Settings -> Quantization for additional memory savings.

  • SDNQ
  • BnB
  • Optimum-Quanto
  • TorchAO

You can enable quantization for Transformers and Text Encoder together or separately.

  • Most T2V and I2V models support on-the-fly quantization of the transformer module.
  • Most T2V models support text-encoder quantization, while many I2V models do not because image vectors cannot be quantized in the same way.

See Quantization for more details.
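
To get a rough sense of the savings, weight storage scales with bits per weight: size in bytes is approximately parameter count times bits divided by 8. The sketch below applies that arithmetic. It is an estimate only, since it ignores quantization metadata such as scales, any layers left unquantized, and activation memory. `weights_size_gb` is a hypothetical helper, and the bit widths are illustrative rather than a list of what each quantization engine offers.

```python
def weights_size_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1e9 bytes): parameters x bits / 8."""
    return params_billion * bits_per_weight / 8


# Example: a 14B-parameter transformer at three illustrative precisions.
for bits in (16, 8, 4):
    print(f"14B transformer at {bits}-bit: ~{weights_size_gb(14, bits):.1f} GB")
```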

Decoding

Instead of using only the full VAE packaged with the model to decode final frames, SD.Next also supports Tiny VAE and Remote VAE for video decoding.

  • Tiny VAE: supported for Hunyuan, WAN, and Mochi
  • Remote VAE: supported for Hunyuan

See VAE for more details.

Processing

SD.Next supports two optional acceleration methods:

Interpolation

All video modules support optional frame interpolation for smoother output.
When enabled, interpolation uses RIFE (Real-Time Intermediate Flow Estimation).
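
As a rough illustration of how interpolation changes clip length, the sketch below computes the output frame count. It assumes the interpolator inserts `factor - 1` new frames between each pair of adjacent source frames; this is a common convention, but it is an assumption here and has not been checked against SD.Next's RIFE integration. `interpolated_frame_count` is a hypothetical helper.

```python
def interpolated_frame_count(frames: int, factor: int) -> int:
    """Frame count after inserting (factor - 1) frames between adjacent frames.

    Assumption: the first and last source frames are kept and nothing is
    added before or after them.
    """
    if frames < 2:
        return frames  # nothing to interpolate between
    return (frames - 1) * factor + 1


# Example: an 81-frame clip with 2x interpolation.
print(interpolated_frame_count(81, 2))
```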

Issues/Limitations

See TODO for known issues and limitations.