Feat: Clear cache during weight loading to prevent OOM on GPUs with <=8GB VRAM
This change explicitly clears the CUDA cache during weight loading to mitigate memory fragmentation, which is particularly beneficial for low-VRAM GPUs (8 GB or less).
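
The diff itself is not shown here, so the following is only a minimal sketch of the general technique, assuming a PyTorch-based loader. The names `free_cached_memory`, `load_weights_in_chunks`, `load_chunk`, and `clear_every` are hypothetical and do not come from this change; `torch.cuda.empty_cache()` is the only CUDA-specific call used, and the sketch falls back to a no-op when PyTorch or CUDA is unavailable.

```python
import gc

try:
    import torch  # optional: without PyTorch the cache clear is a no-op
except ImportError:
    torch = None


def free_cached_memory():
    """Collect unreferenced Python objects, then release cached CUDA blocks."""
    gc.collect()
    if torch is not None and torch.cuda.is_available():
        # Returns unused cached blocks held by PyTorch's allocator to the driver.
        torch.cuda.empty_cache()


def load_weights_in_chunks(weight_chunks, load_chunk, clear_every=1):
    """Load chunks one at a time, clearing the cache every `clear_every` chunks.

    `load_chunk` is a hypothetical callback that moves one chunk onto the
    device. Returns the number of cache clears performed.
    """
    clears = 0
    for i, chunk in enumerate(weight_chunks, start=1):
        load_chunk(chunk)
        if i % clear_every == 0:
            free_cached_memory()
            clears += 1
    return clears
```

Clearing after every chunk costs some load time, since the allocator has to request memory from the driver again, so a larger `clear_every` may be a reasonable trade-off on GPUs that are not memory-constrained.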