llama/patches/0029-NVML-fallback-for-unified-memory-GPUs.patch · 8fafc8af77105030ce485c96c355dafce316ec24 · OpenDAS / ollama

ml/backend/ggml: NVML fallback for unified memory GPUs (#12619) · 8fafc8af

Santosh Bhavani authored Oct 15, 2025

* Simplify NVML fallback for unified memory GPUs

Remove the device-specific checks and the environment variable
dependency from the NVML_ERROR_NOT_SUPPORTED fallback. When NVML does
not support memory queries, unconditionally read /proc/meminfo instead
of checking device names or the OLLAMA_UNIFIED_MEMORY environment
variable.

This improves memory reporting by using MemAvailable, which accounts
for reclaimable memory, and so avoids the underreporting issue
described in NVIDIA support article a_id/5728.
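
For illustration only, the /proc/meminfo lookup described above could be
sketched roughly as follows. This is not the code in the patch: the
MemInfo struct and the parse_meminfo and read_proc_meminfo helpers are
hypothetical names, and the sketch assumes C++17 and the usual
"Key:   value kB" line layout of /proc/meminfo.

```cpp
#include <cstdint>
#include <fstream>
#include <istream>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical container for the two fields the fallback cares about.
struct MemInfo {
    uint64_t total_bytes = 0;      // from the MemTotal line
    uint64_t available_bytes = 0;  // from the MemAvailable line
};

// Parse meminfo-formatted text ("Key:   <number> kB" per line).
// Returns std::nullopt unless both MemTotal and MemAvailable are found.
std::optional<MemInfo> parse_meminfo(std::istream& in) {
    MemInfo info;
    bool have_total = false;
    bool have_avail = false;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string key;
        uint64_t kib = 0;
        if (!(fields >> key >> kib)) {
            continue;  // skip lines that are not "key value"
        }
        if (key == "MemTotal:") {
            info.total_bytes = kib * 1024;  // values are reported in kB
            have_total = true;
        } else if (key == "MemAvailable:") {
            info.available_bytes = kib * 1024;
            have_avail = true;
        }
    }
    if (!have_total || !have_avail) {
        return std::nullopt;
    }
    return info;
}

// Convenience wrapper that reads the real file; returns std::nullopt
// when the file is missing (for example on non-Linux systems).
std::optional<MemInfo> read_proc_meminfo() {
    std::ifstream f("/proc/meminfo");
    if (!f) {
        return std::nullopt;
    }
    return parse_meminfo(f);
}
```

In a fallback of this shape, a caller would use these values only after
the NVML memory query returns NVML_ERROR_NOT_SUPPORTED, and would keep
the NVML result in every other case.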

Tested on an NVIDIA GB10 unified memory iGPU: memory reporting stayed
consistent and accurate across multiple model load/unload cycles.

* Add NVML fallback patch for unified memory GPUs


0029-NVML-fallback-for-unified-memory-GPUs.patch 5.9 KB
