AMD ROCm

To use AMD ROCm with SD.Next:

  1. Install ROCm libraries.
  2. Run SD.Next with --use-rocm so it installs a compatible torch build.

[!IMPORTANT]
AMD ROCm is officially supported for specific AMD GPUs.

[!IMPORTANT]
Currently, PyTorch support for ROCm on Windows is not officially maintained by the PyTorch team.
See AMD's announcement for more information.

[!WARNING]
Unofficial support for other platforms is provided by the community and SD.Next does not guarantee it will work.
Use of any third-party libraries is at your own risk.

  • For preview support on Windows, see the ROCm on Windows section.
  • For unofficial support on Windows, see the ZLUDA page.

ROCm on Linux

Install Guide for Ubuntu 24.04

Install ROCm:

sudo apt update
wget https://repo.radeon.com/amdgpu-install/6.4.3/ubuntu/noble/amdgpu-install_6.4.60403-1_all.deb
sudo apt install ./amdgpu-install_6.4.60403-1_all.deb
sudo amdgpu-install --usecase=rocm
sudo usermod -a -G render,video $LOGNAME
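
As a quick sanity check after the steps above, the following sketch reports whether the current session is already in the render and video groups and whether rocminfo (a tool shipped with ROCm) is on the PATH. The has_group helper is a hypothetical name introduced here for illustration. Group changes made by usermod only apply to new login sessions, so a re-login or reboot may be needed first.

```shell
# has_group is an illustrative helper, not part of ROCm or SD.Next.
# $1: space-separated list of group names, $2: group to look for.
has_group() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Groups of the current login session (not the /etc/group database).
session_groups="$(id -nG)"

for group in render video; do
  if has_group "$session_groups" "$group"; then
    echo "$group: ok"
  else
    echo "$group: missing (log out and back in, or reboot)"
  fi
done

# rocminfo lists the agents (CPU/GPU) that the ROCm runtime can see.
# On some installs it may live under /opt/rocm/bin instead of being on PATH.
if command -v rocminfo >/dev/null 2>&1; then
  echo "rocminfo: found"
else
  echo "rocminfo: not found on PATH"
fi
```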

Install git and python:

sudo apt install git python3 python3-dev python3-venv python3-pip

Install Guide for Ubuntu 22.04

Simply change the wget line from "noble" to "jammy" if using Ubuntu 22.04.
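
For illustration, the substitution described above gives the following download URL. This sketch only builds and prints the command; the codename variable is a name introduced here, and the version numbers are copied unchanged from the Ubuntu 24.04 commands.

```shell
# Ubuntu release codename: "noble" for 24.04, "jammy" for 22.04.
codename=jammy

# Same URL as in the Ubuntu 24.04 guide, with only the codename swapped.
url="https://repo.radeon.com/amdgpu-install/6.4.3/ubuntu/${codename}/amdgpu-install_6.4.60403-1_all.deb"

# Print the command to run instead of the original wget line.
echo "wget ${url}"
```

The remaining commands (apt install, amdgpu-install, usermod) stay the same.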

Install Guide for openSUSE Tumbleweed

Install prerequisites:

sudo zypper in python312-devel python312-virtualenv python312-pip patterns-devel-base-devel_basis

Add the ROCm repository (not official, but maintained by AMD employees):

sudo zypper ar obs://science:GPU:ROCm/openSUSE_Factory ROCm
sudo zypper ref # Answer "ultimately trust"

Install the relevant packages (there is no pattern so you must install them all manually):

sudo zypper in rocm-runtime \
    miopen rccl rocblas amdsmi \
    hipblaslt hiprand hipcub \
    hipsolver hipfft rocm-cmake \
    rocm-compilersupport \
    rocm-llvm-filesystem \
    rocm-clang-runtime-devel \
    hipcub-devel rocm-hip-devel \
    libhipfft0-devel libhipsolver0-devel \
    libhipsparse1-devel rocthrust-devel \
    librocfft0 rocm-core rocrand rocsolver

This procedure should also work for Leap-based distributions and Slowroll (adjust the distro-specific lines), but it is not tested.

[!NOTE] This also installs build dependencies for flash-attention.

Install Guide for Arch Linux

Install ROCm and git:

sudo pacman -S rocm-hip-runtime git

Install Python 3.12 (or anything between 3.10 and 3.13):

sudo pacman -S base-devel python-pip python-virtualenv
git clone https://aur.archlinux.org/python312.git
cd python312
makepkg -si
cd ..
export PYTHON=python3.12

# remove the package builder residuals:
# rm -rf python312
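
To confirm that the interpreter selected through PYTHON falls in the range mentioned above, a small check like the following can help. It is a sketch: py_supported is a hypothetical helper, and it assumes the 3.10 to 3.13 range is inclusive.

```shell
# py_supported is an illustrative helper; it assumes 3.10 through 3.13
# inclusive, based on the range stated in this guide.
# $1: version as "major.minor"
py_supported() {
  case "$1" in
    3.10|3.11|3.12|3.13) return 0 ;;
    *) return 1 ;;
  esac
}

# Fall back to python3 when PYTHON is not set.
py="${PYTHON:-python3}"

if command -v "$py" >/dev/null 2>&1; then
  ver="$("$py" -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
  if py_supported "$ver"; then
    echo "$py ($ver): supported"
  else
    echo "$py ($ver): outside the 3.10-3.13 range"
  fi
else
  echo "$py: not found"
fi
```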

Install ROCm SDK:

[!NOTE] ROCm SDK is optional. It is only required for building flash attention or similar custom kernels.
ROCm SDK uses 26 GB of disk space.

sudo pacman -S rocm-hip-sdk libxml2-legacy gcc14 gcc14-libs

Running SD.Next with ROCm locally

Open a terminal in the folder where you want to install SD.Next, then run:

git clone https://github.com/vladmandic/sdnext

Then enter the sdnext folder:

cd sdnext

Run SD.Next with:

./webui.sh --use-rocm

[!NOTE] SD.Next installs the necessary libraries on the first run, which can take a while depending on your internet connection.

Running SD.Next with Docker for ROCm

See Docker if you want to build a custom image.

[!NOTE] You do not need to install ROCm on your system when using Docker, as Docker has no access to it anyway.

To run a prebuilt Docker image:

export SDNEXT_DOCKER_ROOT_FOLDER=~/sdnext
sudo docker run -it \
  --name sdnext-rocm \
  --device /dev/dri \
  --device /dev/kfd \
  -p 7860:7860 \
  -v $SDNEXT_DOCKER_ROOT_FOLDER/app:/app \
  -v $SDNEXT_DOCKER_ROOT_FOLDER/python:/mnt/python \
  -v $SDNEXT_DOCKER_ROOT_FOLDER/data:/mnt/data \
  -v $SDNEXT_DOCKER_ROOT_FOLDER/models:/mnt/models \
  -v $SDNEXT_DOCKER_ROOT_FOLDER/huggingface:/root/.cache/huggingface \
  disty0/sdnext-rocm:latest

[!NOTE] The necessary libraries are installed on the first run, which can take a while depending on your internet connection.
The Docker image uses 3.2 GB of disk space (uncompressed), and the venv uses a further 20 GB.
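
Docker typically creates missing bind-mount folders on the host automatically, but they then end up owned by root. As an optional sketch, the folders used in the command above can be created beforehand as the current user. The make_sdnext_dirs function name is hypothetical; the subfolder names are taken from the -v options above.

```shell
# make_sdnext_dirs is an illustrative helper, not part of SD.Next.
# $1: root folder that holds the bind-mounted subfolders.
make_sdnext_dirs() {
  for sub in app python data models huggingface; do
    mkdir -p "$1/$sub" || return 1
  done
}

# Example (uncomment to run), using the variable from the command above:
# make_sdnext_dirs "$SDNEXT_DOCKER_ROOT_FOLDER"
```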

ROCm performance tuning on Linux

For details, see AMD-MIOpen Guide.

MIOpen database tuning

On first use, the first time a new resolution is used, or after a PyTorch upgrade, ROCm runs benchmarks to pick efficient kernels.
This can make startup slow (up to 5-8 minutes), especially with high-resolution refine passes, but it usually happens once per resolution.

If startup time is the priority, set MIOPEN_FIND_MODE=FAST.
If generation performance is the priority, set MIOPEN_FIND_ENFORCE=SEARCH and accept slower first-time startup.
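
As a sketch, either variable can be exported in the shell before launching SD.Next. The variable names and values are the ones given above; the example picks the fast-startup option and leaves the other commented out.

```shell
# Option 1: prioritize startup time.
export MIOPEN_FIND_MODE=FAST

# Option 2: prioritize generation performance (slower first-time startup).
# export MIOPEN_FIND_ENFORCE=SEARCH

# Show what SD.Next will inherit from this shell.
echo "MIOPEN_FIND_MODE=${MIOPEN_FIND_MODE:-unset}"
echo "MIOPEN_FIND_ENFORCE=${MIOPEN_FIND_ENFORCE:-unset}"

# Then start SD.Next from the same shell:
# ./webui.sh --use-rocm
```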

Reduce VRAM consumption

If you use bf16 (Settings > Compute Settings > Execution Precision > Device precision type), which is auto-detected on RDNA3 and newer cards, VRAM usage can become very high (16+ GB) during final decode and non-latent upscaling.
To reduce usage, set Device precision type to fp16 and disable VAE upcasting in Variational Auto Encoder > VAE upcasting.

Using fp16 can also improve performance.

Composable Kernel (CK) Flash attention

On RDNA3 hardware (RX 7000 series), you can enable CK Flash Attention in Compute Settings > Cross Attention > SDP Options by toggling CK Flash attention and restarting SD.Next.
This requires rocm-hip-sdk, because SD.Next downloads and compiles an additional Python package at startup.

To install it manually, activate the virtual environment, then run pip:

pip install --no-build-isolation git+https://github.com/Disty0/flash-attention@navi_rotary_fix

ROCm on Windows

  1. Install Git and Python 3.12.
  2. Open a terminal in the folder where you want to install SD.Next, then clone it from GitHub:
git clone https://github.com/vladmandic/sdnext
  3. Enter the sdnext folder:
cd sdnext
  4. Make sure that you are up to date:
git pull
  5. Run SD.Next:
.\webui.bat --use-rocm