    ml/backend/ggml: NVML fallback for unified memory GPUs (#12619) · 8fafc8af
    Santosh Bhavani authored
    * Simplify NVML fallback for unified memory GPUs
    
    Remove device-specific checks and environment variable dependency for
    NVML_ERROR_NOT_SUPPORTED fallback. When NVML doesn't support memory
    queries, unconditionally use /proc/meminfo instead of checking device
    names or OLLAMA_UNIFIED_MEMORY environment variable.
    
    This improves memory reporting by reading MemAvailable, which
    accounts for reclaimable memory, and avoids the underreporting issue
    described in NVIDIA support article a_id/5728.
    
    Tested on NVIDIA GB10 unified memory iGPU with consistent and accurate
    memory reporting across multiple model load/unload cycles.
    
    * Add NVML fallback patch for unified memory GPUs
0029-NVML-fallback-for-unified-memory-GPUs.patch 5.9 KB
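
The patch body is not reproduced here. As a rough illustration of the fallback the commit message describes, the sketch below reads MemTotal and MemAvailable from /proc/meminfo. It is a minimal sketch under stated assumptions, not the code from the patch: the function names `parse_meminfo_kb` and `read_host_memory` are hypothetical, and the NVML call is only mentioned in a comment so that the example builds without NVML headers.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from the patch): find "<key>:" at the
 * start of a line in meminfo-formatted text and store its value in kB.
 * Returns 0 on success, -1 if the key is missing or malformed.
 */
int parse_meminfo_kb(const char *text, const char *key, uint64_t *out_kb) {
    size_t key_len = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        /* Require the colon so that "MemFree" does not match "MemFreeFoo". */
        if (strncmp(line, key, key_len) == 0 && line[key_len] == ':') {
            unsigned long long value = 0;
            if (sscanf(line + key_len + 1, "%llu", &value) == 1) {
                *out_kb = (uint64_t)value;
                return 0;
            }
            return -1;
        }
        const char *newline = strchr(line, '\n');
        line = newline ? newline + 1 : NULL;
    }
    return -1;
}

/*
 * Hypothetical fallback (assumption): intended to be called when the NVML
 * memory query, e.g. nvmlDeviceGetMemoryInfo, reports
 * NVML_ERROR_NOT_SUPPORTED. Reports host memory in bytes, using
 * MemAvailable as the "free" figure because it includes reclaimable memory.
 * Returns 0 on success, -1 on failure.
 */
int read_host_memory(uint64_t *free_bytes, uint64_t *total_bytes) {
    char buf[4096];
    FILE *f = fopen("/proc/meminfo", "r");
    if (f == NULL) {
        return -1;
    }
    /* MemTotal and MemAvailable appear near the top of the file. */
    size_t n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';

    uint64_t total_kb = 0;
    uint64_t avail_kb = 0;
    if (parse_meminfo_kb(buf, "MemTotal", &total_kb) != 0 ||
        parse_meminfo_kb(buf, "MemAvailable", &avail_kb) != 0) {
        return -1;
    }
    /* /proc/meminfo reports these values in kB (units of 1024 bytes). */
    *total_bytes = total_kb * 1024;
    *free_bytes = avail_kb * 1024;
    return 0;
}
```

Keeping the parser separate from the file read makes it possible to exercise the parsing logic on a fixed string, without depending on the host's actual /proc/meminfo contents.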