Memory Allocator
The combination of the OS default allocator (glibc malloc) and Python's own object allocator (pymalloc) can be slow to return freed system memory to the OS.
In practice, memory can stay reserved long after it is no longer needed, even when garbage collection is triggered explicitly.
This can look like a memory leak, because process memory usage grows over time. It is most noticeable when frequently loading and unloading large objects such as models or LoRAs.
Note
This applies to system memory only and has no impact on GPU memory management.
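The symptom can be sketched with a short Python snippet. This is illustrative only: it is Linux-specific (it reads `/proc` and calls glibc's `malloc_trim`), and the allocation sizes are arbitrary; actual retention depends on the allocator, chunk sizes, and glibc version.

```python
import ctypes
import gc

def rss_mb() -> float:
    # Linux-only: read the resident set size of this process from /proc.
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024  # kB -> MB
    return 0.0

# Simulate loading a large object (e.g. model weights) as many small
# allocations, which glibc serves from its heap arenas rather than mmap.
blob = [bytearray(64 * 1024) for _ in range(1024)]  # ~64 MB
print(f"after load:   {rss_mb():.0f} MB")

del blob
gc.collect()  # Python frees the objects...
print(f"after unload: {rss_mb():.0f} MB")

# ...but the allocator may keep the freed pages reserved for reuse, so RSS
# does not necessarily drop. With glibc you can explicitly ask it to return
# free heap memory to the OS:
try:
    ctypes.CDLL("libc.so.6").malloc_trim(0)
except OSError:
    pass  # not glibc (e.g. musl)
print(f"after trim:   {rss_mb():.0f} MB")
```

Alternative allocators such as tcmalloc, jemalloc, and mimalloc perform this kind of release more aggressively on their own, which is why preloading them can reduce long-run reserved memory.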
Linux
Tip
On Linux, you can switch to tcmalloc, jemalloc, or mimalloc.
These allocators are often more efficient and release memory more effectively.
tcmalloc
```shell
sudo apt install google-perftools
sudo ldconfig
export LD_PRELOAD=libtcmalloc.so.4
./webui.sh --debug
```
jemalloc
```shell
sudo apt install libjemalloc2
sudo ldconfig
export LD_PRELOAD=libjemalloc.so.2
./webui.sh --debug
```
mimalloc
```shell
sudo apt install libmimalloc2.0
sudo ldconfig
export LD_PRELOAD=libmimalloc.so.2
./webui.sh --debug
```
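To confirm that the preloaded allocator is actually in use, you can inspect the memory maps of a process started under it. The snippet below is a sketch: it launches a throwaway `python3` process as a stand-in for the webui; with `LD_PRELOAD` set, the chosen library (e.g. `libtcmalloc`) should appear in the output instead of only the default libc malloc.

```shell
# List the malloc-related libraries mapped into a fresh process.
# Run this with your LD_PRELOAD exported to verify the override took effect.
python3 -c 'print(open("/proc/self/maps").read())' | grep -E 'libc|malloc' | sort -u | head -n 5
```

If the alternative allocator does not appear, check that the library name in `LD_PRELOAD` matches the installed file (for example via `ldconfig -p | grep tcmalloc`).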
Windows
Tip
On Windows, you can switch to mimalloc, which may improve memory handling.
mimalloc
mimalloc on Windows requires a custom-built library.
Full instructions: https://microsoft.github.io/mimalloc/index.html
Comparison
Ubuntu 24.04 results after running 10 batches of 1024px images with an SDXL model:
| allocator | reserved memory | note |
|---|---|---|
| malloc | 9345 MB | baseline |
| tcmalloc | 8423 MB | very performant and stable over long runs |
| jemalloc | 5468 MB | best memory savings |
| mimalloc | 6132 MB | new contender |
Note
Results vary by system and usage pattern. The key behavior is whether the allocator returns memory that SD.Next has released back to the OS over time, while keeping fragmentation and allocation latency low. This is especially visible when unloading LoRAs or switching models.
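To reproduce this kind of comparison on your own system, you can sample a process's resident memory while loading and unloading models. A minimal sketch, using this shell's own PID (`$$`) as a stand-in for the real SD.Next process ID:

```shell
# Sample resident memory of a process a few times to watch whether the
# allocator releases memory after an unload. Replace $$ with the webui PID.
pid=$$
for i in 1 2 3; do
  awk -v n="$i" '/^VmRSS:/ {printf "sample %d: %.1f MB\n", n, $2/1024}' "/proc/$pid/status"
  sleep 1
done
```

Sampling before and after an unload, under each allocator, gives numbers directly comparable to the table above.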