# Memory Allocator
The combination of the OS default memory allocator (glibc `malloc`) with Python's default memory allocator is pessimistic about returning memory to the system: it will sometimes hold on to allocated memory longer than necessary, even if garbage collection is triggered explicitly.
To the user this appears as a memory leak, since process memory usage grows over time.
It is especially noticeable when frequently loading and unloading large objects such as models or LoRAs.
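The effect can be observed directly. The sketch below (assuming Linux with `/proc` available and `python3` on the PATH) allocates roughly 256 MB in a Python process, frees it, forces garbage collection, and prints the resident set size before and after; how much of the difference is actually returned to the OS depends on the allocator in use and the allocation sizes.

```shell
# Sketch: observe how much freed Python memory is returned to the OS.
# Assumes Linux (/proc) and python3 on PATH.
python3 - <<'EOF'
import gc, os

def rss_mb():
    # Resident set size of this process, in MB, read from /proc
    with open(f"/proc/{os.getpid()}/status") as f:
        for line in f:
            if line.startswith("VmRSS"):
                return int(line.split()[1]) // 1024

data = [bytearray(1024 * 1024) for _ in range(256)]  # ~256 MB
print("allocated:", rss_mb(), "MB")
del data
gc.collect()
print("after free:", rss_mb(), "MB")
EOF
```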
> **Note**: This applies to system memory only and has no impact on GPU memory management.
## Linux
> **Tip**: For Linux deployments you can switch the memory allocator to `tcmalloc`, `jemalloc`, or `mimalloc`, which are more efficient and have better memory management.
### tcmalloc

```shell
sudo apt install google-perftools
sudo ldconfig
export LD_PRELOAD=libtcmalloc.so.4
./webui.sh --debug
```
### jemalloc

```shell
sudo apt install libjemalloc2
sudo ldconfig
export LD_PRELOAD=libjemalloc.so.2
./webui.sh --debug
```
### mimalloc

```shell
sudo apt install libmimalloc2.0
sudo ldconfig
export LD_PRELOAD=libmimalloc.so.2
./webui.sh --debug
```
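With `LD_PRELOAD` set, you can confirm the loader actually mapped the replacement allocator by inspecting the process memory map. A sketch, assuming Linux: `$$` below checks the current shell; substitute the PID of your running SD.Next process instead.

```shell
# Check which allocator is mapped into a process; $$ is this shell's PID,
# replace it with the PID of the running SD.Next process.
grep -m1 -oE 'tcmalloc|jemalloc|mimalloc' "/proc/$$/maps" \
  || echo "default malloc (glibc) in use"
```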
## Windows
> **Tip**: For Windows deployments you can switch the memory allocator to `mimalloc`, which is more efficient and has better memory management.
### mimalloc

`mimalloc` for Windows requires a custom-compiled library.
Full instructions can be found here: https://microsoft.github.io/mimalloc/index.html
## Comparison

Using Ubuntu 24.04, after executing 10 batches of 1024px images with the SDXL model:
| manager  | reserved memory | note |
|----------|-----------------|------|
| malloc   | 9345 MB | baseline |
| tcmalloc | 8423 MB | very performant and stable over long runs |
| jemalloc | 5468 MB | best memory savings |
| mimalloc | 6132 MB | new contender |
> **Note**: Results will vary based on system and usage pattern.
> What matters is that the memory allocator can intelligently free memory that SD.Next marks as available over time, for example when unloading LoRAs or switching models, while maintaining low memory fragmentation and allowing low-latency allocations.
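To observe this on your own system, you can watch the resident set size of the SD.Next process while loading and unloading models. A sketch: the `webui.sh` process name in the usage comment is taken from the launch commands above; adjust it if you start SD.Next differently.

```shell
# Report resident memory (in MB) of a process; defaults to this shell.
# Pass the PID of the SD.Next process to watch that instead.
rss_mb() {
  ps -o rss= -p "${1:-$$}" | awk '{printf "%.0f MB\n", $1/1024}'
}

# Example: sample every 5 seconds while loading/unloading models:
#   while sleep 5; do rss_mb "$(pgrep -f webui.sh | head -n1)"; done
rss_mb
```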