1. 15 Oct, 2025 1 commit
    • ml/backend/ggml: NVML fallback for unified memory GPUs (#12619) · 8fafc8af
      Santosh Bhavani authored
      * Simplify NVML fallback for unified memory GPUs
      
      Remove device-specific checks and environment variable dependency for
      NVML_ERROR_NOT_SUPPORTED fallback. When NVML doesn't support memory
      queries, unconditionally use /proc/meminfo instead of checking device
      names or OLLAMA_UNIFIED_MEMORY environment variable.
      
      This provides better memory reporting by using MemAvailable, which
      accounts for reclaimable memory, avoiding the underreporting issue
      described in NVIDIA support article a_id/5728.
      
      Tested on NVIDIA GB10 unified memory iGPU with consistent and accurate
      memory reporting across multiple model load/unload cycles.
      
      * Add NVML fallback patch for unified memory GPUs
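The fallback described in the commit message can be sketched as follows. This is a minimal illustration in Python, not the actual ggml patch: `NvmlNotSupported`, `parse_meminfo`, `gpu_memory`, and `read_proc_meminfo` are hypothetical names, and the NVML query is passed in as a plain callable rather than a real NVML binding. It assumes the fallback reports `MemTotal` as total memory and `MemAvailable` as free memory.

```python
from typing import Callable, Dict, Tuple


class NvmlNotSupported(Exception):
    """Hypothetical stand-in for an NVML_ERROR_NOT_SUPPORTED result."""


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: value}.

    Fields reported with a "kB" unit are converted to bytes;
    unitless fields (plain counts) are kept as-is.
    """
    fields: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # skip blank or malformed lines
        value = int(parts[0])
        if len(parts) > 1 and parts[1] == "kB":
            value *= 1024
        fields[name.strip()] = value
    return fields


def read_proc_meminfo() -> str:
    """Read the live /proc/meminfo (Linux only)."""
    with open("/proc/meminfo") as f:
        return f.read()


def gpu_memory(
    nvml_query: Callable[[], Tuple[int, int]],
    read_meminfo: Callable[[], str] = read_proc_meminfo,
) -> Tuple[int, int]:
    """Return (total_bytes, free_bytes).

    Try the NVML query first. If it reports "not supported", fall back
    unconditionally to system memory, with no device-name check and no
    environment variable, using MemAvailable as the free figure.
    """
    try:
        return nvml_query()
    except NvmlNotSupported:
        info = parse_meminfo(read_meminfo())
        return info["MemTotal"], info["MemAvailable"]
```

Passing the meminfo reader as a parameter keeps the sketch testable on systems without /proc/meminfo. On a unified memory device, `MemAvailable` is typically larger than `MemFree` because it includes memory the kernel can reclaim, which is the underreporting the commit aims to avoid.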