Video
SD.Next supports video creation using the top-level Video tab.
Supported modes include T2V (text-to-video) and I2V (image-to-video); some models also support FLF2V (first-and-last-frame-to-video) and V2V (video-to-video).
[!TIP]
Latest video models use LLMs for prompting and therefore require very long, descriptive prompts.
Supported models
SD.Next supports the following models out of the box:
- Hunyuan: HunyuanVideo, FastHunyuan, SkyReels | T2V, I2V
- WAN21: 1.3B, 14B | T2V, I2V, FLF2V
- LTXVideo: 0.9.0, 0.9.1, 0.9.5, 0.9.6, 0.9.8, 2.0, 2.3 | T2V, I2V
- CogVideoX: 2B, 5B | T2V, I2V
- Allegro: T2V
- Mochi1: T2V
- Latte1: T2V
- FramePack: I2V, FLF2V
[!NOTE]
All models are auto-downloaded on first use.
Download location is defined in: Settings -> System paths -> Huggingface.
[!NOTE]
Optimized support for FramePack, which is based on HunyuanVideo-I2V, is implemented in a separate tab.
FramePack supports high-quality generation with near-unlimited duration and limited VRAM.
[!NOTE]
Support for LTXVideo is implemented in a separate tab.
LTXVideo supports flexible guidance with text, image, and video prompts, with optional upsampling and refining.
Reference list
| Engine | Model | Type | Size | Optimal Resolution | Default Sampler | Reference Values | Special Notes | License |
|---|---|---|---|---|---|---|---|---|
| Hunyuan | HunyuanVideo | T2V | 40.9GB | 1280x720 | Euler FlowMatch | Frames:129 CFG:6.0 Steps:50 | | Proprietary |
| Hunyuan | HunyuanVideo | I2V | 59.2GB | 1280x720 | Euler FlowMatch | Frames:129 CFG:1.0 Steps:50 | | Proprietary |
| HunyuanVideo | FramePack | T2V/I2V/FLF2V | 25.0GB+15GB | 608x640 | UniPC FlowMatch | Frames:73 Steps:25 | | |
| Hunyuan | FastHunyuan | T2V | 25.0GB+15GB | 1280x720 | Euler FlowMatch | Frames:125 CFG:6.0 True:1.0 Shift:17 Steps:6 | | |
| Hunyuan | SkyReels v1 | T2V | 25.0GB+15GB | 960x544 | Euler FlowMatch | Frames:97 CFG:1.0 True:6.0 Steps:50 | | |
| Hunyuan | SkyReels v1 | I2V | 25.0GB+15GB | 960x544 | Euler FlowMatch | Frames:97 CFG:1.0 True:6.0 Steps:50 | | |
| WAN21 | WAN 2.1 1.3B | T2V | 28.2GB | 832x480 | UniPC | Frames:81 CFG:5.0 Steps:50 | | Apache 2.0 |
| WAN21 | WAN 2.1 14B | T2V | 78.1GB | 1280x720 | UniPC | Frames:81 CFG:5.0 Steps:50 | | Apache 2.0 |
| WAN21 | WAN 2.1 14B 480p | I2V | | 832x480 | UniPC | Frames:81 CFG:5.0 Steps:50 | | Apache 2.0 |
| WAN21 | WAN 2.1 14B 720p | I2V | | 1280x720 | UniPC | Frames:81 CFG:5.0 Steps:50 | | Apache 2.0 |
| WAN21 | WAN 2.1 14B 720p | FLF2V | | 1280x720 | UniPC | Frames:81 CFG:5.0 Steps:50 | | Apache 2.0 |
| WAN21 | WAN 2.2 5B | T2V/I2V | | 1280x720 | UniPC | Frames:81 CFG:5.0 Steps:50 | | Apache 2.0 |
| WAN21 | WAN 2.2 A14B | T2V/I2V | | 1280x720 | UniPC | Frames:81 CFG:5.0 Steps:50 | | Apache 2.0 |
| WAN21 | WAN 2.2 14B VACE | T2V/I2V | | 1280x720 | UniPC | Frames:81 CFG:5.0 Steps:50 | | Apache 2.0 |
| LTXVideo | LTXVideo 0.9.0 | T2V | | 704x480 | Euler FlowMatch | Frames:161 Steps:50 | | Proprietary |
| LTXVideo | LTXVideo 0.9.0 | I2V | | 704x480 | Euler FlowMatch | Frames:161 Steps:50 | | Proprietary |
| LTXVideo | LTXVideo 0.9.1 | T2V | 24.1GB | 704x512 | Euler FlowMatch | Frames:161 CFG:3 Steps:50 | | Proprietary |
| LTXVideo | LTXVideo 0.9.1 | I2V | 24.1GB | 704x512 | Euler FlowMatch | Frames:161 CFG:3 Steps:50 | | Proprietary |
| LTXVideo | LTXVideo 0.9.5 | T2V | 24.8GB | 768x512 | Euler FlowMatch | Frames:161 Steps:40 | | Proprietary |
| LTXVideo | LTXVideo 0.9.5 | I2V | | 768x512 | Euler FlowMatch | Frames:161 Steps:40 | | Proprietary |
| LTXVideo | LTXVideo 0.9.6 2B | T2V | | 768x512 | Euler FlowMatch | Frames:161 Steps:50 | | Proprietary |
| LTXVideo | LTXVideo 0.9.6 2B Distilled | T2V | | 768x512 | Euler FlowMatch | Frames:161 Steps:8 | | Proprietary |
| LTXVideo | LTXVideo 0.9.7 13B | T2V/I2V/V2V | 46.3GB | 768x512 | Euler FlowMatch | Frames:161 Steps:50 | | Proprietary |
| LTXVideo | LTXVideo 0.9.8 13B | T2V/I2V/V2V | 46.3GB | 768x512 | Euler FlowMatch | Frames:161 Steps:50 | | Proprietary |
| LTXVideo | LTXVideo 2.0 19B | T2V/I2V/V2V | 63.0GB | 768x512 | Euler FlowMatch | Frames:161 Steps:50 | | Proprietary |
| LTXVideo | LTXVideo 2.3 22B | T2V/I2V/V2V | 71.7GB | 768x512 | Euler FlowMatch | Frames:161 Steps:50 | | Proprietary |
| CogVideoX | CogVideoX 1.0 2B | T2V | | 720x480 | Cog DDIM | Frames:49 CFG:6.0 Steps:50 | | Apache 2.0 |
| CogVideoX | CogVideoX 1.0 5B | T2V | | 720x480 | Cog DDIM | Frames:49 CFG:6.0 Steps:50 | | Proprietary |
| CogVideoX | CogVideoX 1.0 5B | I2V | | 720x480 | Cog DDIM | Frames:49 CFG:6.0 Steps:50 | | Proprietary |
| CogVideoX | CogVideoX 1.5 5B | T2V | 30.3GB | 1360x768 | Cog DDIM | Frames:81 CFG:6.0 Steps:50 | Issue: blank output | Proprietary |
| CogVideoX | CogVideoX 1.5 5B | I2V | | 1360x768 | Cog DDIM | Frames:81 CFG:6.0 Steps:50 | Issue: blank output | Proprietary |
| Allegro | Allegro | T2V | 24.7GB | 1280x720 | Euler a | Frames:88 CFG:7.5 Steps:100 | Issue: blank output | Apache 2.0 |
| Mochi | Mochi1 | T2V | 23.4GB | 512x512 | Euler FlowMatch | Frames:16 CFG:7.5 Steps:50 | | Apache 2.0 |
| Latte | Latte1 | T2V | 23.4GB | 512x512 | DDIM | Frames:16 CFG:7.5 Steps:50 | | Apache 2.0 |
| Kandinsky | Kandinsky 5 Lite | T2V | 23.4GB | 768x512 | Euler FlowMatch | Frames:121 Steps:50 CFG:5.0 | | Apache 2.0 |
| Kandinsky | Kandinsky 5 Lite CFG-Distilled | T2V | 23.4GB | 768x512 | Euler FlowMatch | Frames:121 Steps:50 CFG:1.0 | | Apache 2.0 |
| Kandinsky | Kandinsky 5 Lite Steps-Distilled | T2V | 23.4GB | 768x512 | Euler FlowMatch | Frames:121 Steps:16 CFG:1.0 | | Apache 2.0 |
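The Reference Values column packs several settings into one cell as space-separated `Key:value` pairs. If you want to reuse those values in your own scripts, a small parser along these lines may help. This is only an illustrative sketch: the function name `parse_reference_values` is hypothetical and is not part of SD.Next.

```python
def parse_reference_values(cell: str) -> dict:
    """Parse a cell such as 'Frames:129 CFG:6.0 Steps:50' into a dict.

    Hypothetical helper for illustration; not an SD.Next API.
    """
    values = {}
    for token in cell.split():
        # Tolerate 'Key=value' as well as 'Key:value'.
        key, sep, raw = token.replace("=", ":").partition(":")
        if not sep or not raw:
            continue  # skip tokens that are not key/value pairs
        # Keep whole numbers as int (frames, steps) and the rest as float (CFG).
        values[key.lower()] = int(raw) if raw.isdigit() else float(raw)
    return values


settings = parse_reference_values("Frames:125 CFG:6.0 True:1.0 Shift:17 Steps:6")
print(settings)
```

Keys are lowercased so that lookups such as `settings["steps"]` do not depend on how the table capitalizes them.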
[!TIP]
Each model may require specific resolutions and parameters for best results.
This includes advanced parameters such as Sampler shift, which are usually not critical for text-to-image.
See each model's original notes for recommended settings.
[!NOTE]
Use the Default sampler unless you need a model-specific setting.
For example, to change Sampler Shift, select the matching sampler for that model.
Legacy models
Additional video models are available as individually selectable scripts in text or image interfaces.
- Stable Video Diffusion: Base, XT 1.0 and XT 1.1
- VGen
- AnimateDiff
LoRA
SD.Next includes LoRA support for Hunyuan, LTX, WAN, Mochi, Cog.
See LoRA for more details.
Optimizations
[!WARNING]
Any use on GPUs with less than 16GB of VRAM or on systems with less than 48GB of RAM is experimental.
Memory
Offloading helps by moving data between system RAM and GPU VRAM as needed.
However, the full model still must load into RAM before use.
Check total model size in the table and confirm your system has enough RAM.
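As a rough way to run that check, the sketch below compares a Size value from the table against the machine's physical RAM. It is a minimal illustration under stated assumptions: the helper names and the 8GB `headroom_gb` margin are hypothetical, `total_ram_gb` relies on `os.sysconf` and therefore only works on POSIX systems, and decimal gigabytes are used throughout, so treat the result as an estimate.

```python
import os


def parse_size_gb(size_cell: str) -> float:
    """Sum a Size cell such as '25.0GB+15GB' into one number of gigabytes."""
    parts = size_cell.upper().replace("GB", "").split("+")
    return sum(float(part) for part in parts)


def total_ram_gb() -> float:
    """Total physical RAM in GB on POSIX systems (not available on Windows)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9


def fits_in_ram(size_cell: str, ram_gb: float, headroom_gb: float = 8.0) -> bool:
    """True if the model plus an assumed safety margin fits into ram_gb.

    headroom_gb is an arbitrary illustrative margin for the OS and other
    processes, not a value recommended by SD.Next.
    """
    return parse_size_gb(size_cell) + headroom_gb <= ram_gb
```

For example, `fits_in_ram("40.9GB", total_ram_gb())` checks the HunyuanVideo T2V entry against the current machine.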
Offloading
Enable offloading so model components can move in and out of VRAM as needed.
Most models support all offloading modes:
- Balanced: recommended, but may require extra tuning
- Model: simplest
- Sequential: highest memory savings, but slowest
See Offload for more details.
Quantization
Enable on-the-fly quantization during load in Settings -> Quantization for additional memory savings.
- SDNQ
- BnB
- Optimum-Quanto
- TorchAO
You can enable quantization for Transformers and Text Encoder together or separately.
- Most T2V and I2V models support on-the-fly quantization of the transformer module.
- Most T2V models support text-encoder quantization, while many I2V models do not because image vectors cannot be quantized in the same way.
See Quantization for more details.
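To get a feel for the potential savings, the weights of a module take roughly `parameters x bits per weight / 8` bytes. The sketch below applies that arithmetic to a 13B-parameter transformer (the parameter count is taken from the model name in the table). It assumes a 16-bit baseline and ignores quantization metadata such as scales, layers left unquantized, and activation memory, so real numbers will differ.

```python
def weights_size_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in decimal gigabytes.

    One billion parameters at 8 bits per weight is about 1 GB.
    """
    return params_billion * bits_per_weight / 8


# Rough estimate for a 13B-parameter transformer at common bit widths.
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{weights_size_gb(13, bits):.1f} GB")
```

The same function can be applied separately to the text encoder if you quantize it independently of the transformer.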
Decoding
Instead of using only the full VAE packaged with the model to decode final frames, SD.Next also supports Tiny VAE and Remote VAE for video decoding.
- Tiny VAE: support for Hunyuan, WAN, Mochi
- Remote VAE: support for Hunyuan
See VAE for more details.
Processing
SD.Next supports two optional acceleration methods:
- FasterCache: support for Hunyuan, Mochi, Latte, Allegro, Cog, WanDB, LTX
- PyramidAttentionBroadcast: support for Hunyuan, Mochi, Latte, Allegro, Cog, WanDB, LTX
Interpolation
All video modules support optional frame interpolation for smoother output.
When enabled, interpolation uses RIFE (Real-Time Intermediate Flow Estimation).
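To estimate how interpolation changes clip length, the sketch below counts output frames under one simple assumption: a fixed number of new frames is inserted between every pair of consecutive source frames. The function name is hypothetical, and how SD.Next actually counts or trims interpolated frames may differ.

```python
def interpolated_frame_count(frames: int, inserted_per_gap: int) -> int:
    """Frames after inserting `inserted_per_gap` new frames into each gap.

    A clip with N frames has N - 1 gaps between consecutive frames.
    Assumption for illustration only; not taken from SD.Next.
    """
    if frames < 1 or inserted_per_gap < 0:
        raise ValueError("frames must be >= 1 and inserted_per_gap >= 0")
    return frames + (frames - 1) * inserted_per_gap


# One inserted frame per gap roughly doubles an 81-frame clip.
print(interpolated_frame_count(81, 1))
```

Under this assumption, playing the interpolated clip at a proportionally higher frame rate keeps its duration about the same while making motion smoother.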
Issues/Limitations
See TODO for known issues and limitations.