Memory Allocator
The combination of the OS default allocator (glibc malloc) and Python's own object allocator (pymalloc) can be slow to return freed system memory to the OS.
In practice, memory can stay reserved long after it is no longer needed, even when garbage collection is triggered explicitly.
This can look like a memory leak, because process memory usage grows over time. It is most noticeable when frequently loading and unloading large objects such as models or LoRAs.
Note
This applies to system memory only and has no impact on GPU memory management.
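The symptom can be sketched with a short Python snippet. This is illustrative only: it is Linux-specific (it reads `/proc` and calls glibc's `malloc_trim`), and the allocation sizes are arbitrary; actual retention depends on the allocator, chunk sizes, and glibc version.

```python
import ctypes
import gc

def rss_mb() -> float:
    # Linux-only: read the resident set size of this process from /proc.
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024  # kB -> MB
    return 0.0

# Simulate loading a large object (e.g. model weights) as many small
# allocations, which glibc serves from its heap arenas rather than mmap.
blob = [bytearray(64 * 1024) for _ in range(1024)]  # ~64 MB
print(f"after load:   {rss_mb():.0f} MB")

del blob
gc.collect()  # Python frees the objects...
print(f"after unload: {rss_mb():.0f} MB")

# ...but the allocator may keep the freed pages reserved for reuse, so RSS
# does not necessarily drop. With glibc you can explicitly ask it to return
# free heap memory to the OS:
try:
    ctypes.CDLL("libc.so.6").malloc_trim(0)
except OSError:
    pass  # not glibc (e.g. musl)
print(f"after trim:   {rss_mb():.0f} MB")
```

Alternative allocators such as tcmalloc, jemalloc, and mimalloc perform this kind of release more aggressively on their own, which is why preloading them can reduce long-run reserved memory.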
Linux
Tip
On Linux, you can switch to tcmalloc, jemalloc, or mimalloc.
These allocators are often more efficient and release memory more effectively.
tcmalloc
```shell
sudo apt install google-perftools
sudo ldconfig
export LD_PRELOAD=libtcmalloc.so.4
./webui.sh --debug
```
jemalloc
```shell
sudo apt install libjemalloc2
sudo ldconfig
export LD_PRELOAD=libjemalloc.so.2
./webui.sh --debug
```
mimalloc
```shell
sudo apt install libmimalloc2.0
sudo ldconfig
export LD_PRELOAD=libmimalloc.so.2
./webui.sh --debug
```
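To confirm that the preloaded allocator is actually in use, you can inspect the memory maps of a process started under it. The snippet below is a sketch: it launches a throwaway `python3` process as a stand-in for the webui; with `LD_PRELOAD` set, the chosen library (e.g. `libtcmalloc`) should appear in the output instead of only the default libc malloc.

```shell
# List the malloc-related libraries mapped into a fresh process.
# Run this with your LD_PRELOAD exported to verify the override took effect.
python3 -c 'print(open("/proc/self/maps").read())' | grep -E 'libc|malloc' | sort -u | head -n 5
```

If the alternative allocator does not appear, check that the library name in `LD_PRELOAD` matches the installed file (for example via `ldconfig -p | grep tcmalloc`).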
Windows
Tip
On Windows, you can switch to mimalloc, which may improve memory handling.
mimalloc
mimalloc on Windows requires a custom-built library.
Full instructions: https://microsoft.github.io/mimalloc/index.html
Comparison
Ubuntu 24.04 results after running 10 batches of 1024px images with an SDXL model:
| allocator | reserved memory | note |
|---|---|---|
| malloc | 9345 MB | baseline |
| tcmalloc | 8423 MB | very performant and stable over long runs |
| jemalloc | 5468 MB | best memory savings |
| mimalloc | 6132 MB | new contender |
Note
Results vary by system and usage pattern. The key behavior is whether the allocator returns memory that SD.Next has released back to the OS over time, while keeping fragmentation and allocation latency low. This is especially visible when unloading LoRAs or switching models.
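To reproduce this kind of comparison on your own system, you can sample a process's resident memory while loading and unloading models. A minimal sketch, using this shell's own PID (`$$`) as a stand-in for the real SD.Next process ID:

```shell
# Sample resident memory of a process a few times to watch whether the
# allocator releases memory after an unload. Replace $$ with the webui PID.
pid=$$
for i in 1 2 3; do
  awk -v n="$i" '/^VmRSS:/ {printf "sample %d: %.1f MB\n", n, $2/1024}' "/proc/$pid/status"
  sleep 1
done
```

Sampling before and after an unload, under each allocator, gives numbers directly comparable to the table above.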