Control Guide

screenshot-control

SD.Next's Control tab combines ControlNet, IP-Adapter, T2I-Adapter, ControlNet-XS, and ControlNet-LLLite in one place.

This guide explains the available settings in practical terms.

Controls

Input

Input controls define which images or videos are used for generation. By default, only the image in the Control input pane is used. If you select Separate init image, an additional Init input pane appears.

Note: If both Control input and Init input are used, Init input dominates. Setting denoising strength to 0.9 or higher rebalances influence toward Control input, and higher values increase that influence further.

Input control

Show Preview toggles the preview pane on the right side. Keep it enabled when doing masking or other pre-generation adjustments.

There are 3 input types:

  • Control only: Uses only Control input as the source for ControlNet and IP-Adapter workflows.
  • Init image same as control: Reuses Control input as an img2img init image.
  • Separate init image: Adds a separate Init input pane so control guidance and init image can differ.

Denoising strength works the same as in img2img. Higher values apply more denoising and allow stronger changes to source images.

Size

The Size section has two subtabs, Before and After. They control resizing before and after inference.

Size control

The Before subtab does 2 things:

  • If no Resize method is selected, it only sets output width and height, like standard text2img or img2img.

  • If you select a Resize method (for example, Nearest), you can upscale or downscale Control input before other operations. That resized image is then used by later steps. Second Pass is not fully functional yet.

For example, if you start with a large image (such as 2048x3072), you may want to reduce it before Canny or Depth processing to lower memory use and avoid OOM.

Select a resize method (typically Nearest or Lanczos), then either:

  • set target width or height under Fixed, or
  • switch to Scale and use a value below 1.

For example, scale 0.5 changes a 2048x3072 input to 1024x1536 for later processing.
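The scale arithmetic above can be sketched in a few lines of Python. This is an illustrative helper, not SD.Next's actual implementation; the rounding to multiples of 8 is an assumption based on the common requirement of Stable Diffusion pipelines:

```python
def scaled_size(width, height, scale):
    """Compute the pre-processing size for a given Scale value.

    Dimensions are snapped to multiples of 8, which most Stable
    Diffusion pipelines expect (an assumption, not SD.Next behavior).
    """
    snap = lambda v: max(8, int(round(v * scale / 8)) * 8)
    return snap(width), snap(height)

print(scaled_size(2048, 3072, 0.5))  # -> (1024, 1536)
```

With scale 0.5, a 2048x3072 input becomes 1024x1536, matching the example above.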

The After subtab controls upscaling or downscaling at the end of generation. Typical choices are latent upscaling, ESRGAN models such as 4x Ultrasharp, or chaiNNer models. This matches standard text2img/img2img upscaling behavior.

Mask

Mask controls handle masking and segmentation behavior, plus preview display options.

Mask controls

  • Live update: Updates mask preview as you edit. If disabled, use Refresh. If preview gets out of sync during processing, refresh again.

  • Inpaint masked only: When checked, inpainting applies only to the areas you have masked. You must actually mask something; otherwise the result is plain img2img.

  • Invert mask: Inverts the masking: areas you mark with the brush are excluded from a full mask of the image.

  • Auto-mask: Three options are available: Threshold, Edge, and Greyscale.

  • Auto-segment: Provides multiple Auto-segmentation models. These models do not require ControlNet, but can take a few seconds depending on GPU.

  • Preview: Select preview mode: Masked, Binary, Greyscale, Color, or Composite (default).

  • Colormap: Select the preview color scheme (22 options).

  • Blur: Softens mask edges.

  • Erode: Shrinks auto-mask or auto-segmentation borders.

  • Dilate: Expands auto-mask or auto-segmentation borders.

Video

Video controls let you process video frame by frame. Output formats are GIF, PNG, and MP4. You must select one to generate video output.

Some output modes expose additional options.

Video controls

  • Skip input frames: Controls sampling interval. 0 processes every frame, 1 every other frame, 2 every third frame, and so on.
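The sampling interval described above can be expressed as a short sketch. This helper is hypothetical, written only to illustrate the frame-selection arithmetic:

```python
def sample_frames(total_frames, skip):
    """With Skip input frames = skip, every (skip + 1)-th frame
    of the input video is processed."""
    return list(range(0, total_frames, skip + 1))

print(sample_frames(10, 2))  # -> [0, 3, 6, 9]: every third frame
```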

  • Video file: Select output type: animated GIF, animated PNG, or MP4 (via FFMPEG).

  • Duration: Target length of the output video, in seconds.

  • Pad frames: Determines how many frames to add to the beginning and end of the video. This is particularly useful in combination with interpolation.

  • Interpolate frames: Number of RIFE-interpolated frames inserted between processed frames. Useful to smooth motion when skipping input frames.

  • Loop: Applies only to animated GIF and PNG output; it enables the classic looping playback you would expect.

When interpolation is enabled, scene changes are also detected. If a scene changes significantly, pad frames are inserted instead of interpolating between unrelated frames.
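Combining Skip input frames with Interpolate frames, the output frame count can be estimated as follows. This is a rough back-of-the-envelope sketch under simplifying assumptions (no scene cuts, no pad frames), not SD.Next's actual accounting:

```python
def output_frame_count(input_frames, skip, interpolate):
    """Estimate output frames: the processed frames plus `interpolate`
    RIFE-generated frames inserted between each consecutive pair.
    Ignores scene-cut handling and pad frames (simplifying assumption)."""
    processed = (input_frames + skip) // (skip + 1)
    return processed + (processed - 1) * interpolate

# Skipping 2 of every 3 input frames and interpolating 2 frames back
# between each pair roughly restores the original frame count:
print(output_frame_count(10, 2, 2))  # -> 10
```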

Extensions

These built-in integrations are available without separate extension installs.

IP-Adapter

Built-in IP-Adapter includes 10 models for style and face guidance workflows.

IP-Adapter controls

Image Panes

Small icons above image panes are Interrogate buttons. The left button runs BLIP and the right button runs DeepBooru. Click one to analyze the image in that pane. Results appear in the prompt area.

Interrogate buttons

Control Input

Control input is the primary source pane. You can load an image or video here for processing by scripts, extensions, and control modules. If an image is present, the system assumes an img2img-style workflow. If a video is uploaded, input and output playback are available. Batch and folder processing work as expected.

Note that there are two additional buttons below the pane: Inpainting and Outpainting.

Control input

ControlNet+

At the bottom of the Control page, full ControlNet support is available for SD and SD-XL.

Example workflows are still being expanded, but tooltips are available throughout the UI. If you need help, ask on the SD.Next Discord server.