Prevent multiple concurrent loads on the same GPUs
While a model is loading, the VRAM metrics for its GPUs are still changing. Try to load on a GPU that doesn't have a model actively loading, or wait for the in-flight load to finish, to avoid races that lead to OOMs.
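
The selection rule described above could be sketched roughly as follows. This is an illustration only, not the code in this change: the `Gpu` record, its `loading` flag, and the `pick_gpu` helper are hypothetical names, and it assumes the scheduler tracks which GPUs have a load in flight.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Gpu:
    # Hypothetical record; field names are illustrative only.
    id: str
    free_vram: int          # free VRAM reported at the last refresh
    loading: bool = False   # True while a model load is in flight on this GPU


def pick_gpu(gpus: Sequence[Gpu], required: int) -> Optional[Gpu]:
    """Return a settled GPU with enough free VRAM, or None if the caller should wait."""
    # Skip GPUs with a load in flight: their free_vram reading is still changing.
    candidates = [g for g in gpus if not g.loading and g.free_vram >= required]
    if not candidates:
        return None
    # Among settled GPUs, prefer the one with the most free VRAM.
    return max(candidates, key=lambda g: g.free_vram)
```

A `None` result means no settled GPU fits, so the caller would wait for the in-flight load to finish and retry rather than trust a stale VRAM reading.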