1. 23 Jun, 2025 1 commit
    • Re-remove cuda v11 (#10694) · 1c6669e6
      Daniel Hiltgen authored
      * Re-remove cuda v11
      
      Revert the revert: drop v11 support, which means drivers newer than Feb 2023 are required.
      
      This reverts commit c6bcdc42.
      
      * Simplify layout
      
      With only one version of the GPU libraries, we can simplify the layout somewhat. (Jetsons still require special handling.)
      
      * distinct sbsa variant for linux arm64
      
      This avoids accidentally trying to load the sbsa cuda libraries on
      a Jetson system, which results in crashes.
      
      * temporarily prevent rocm+cuda mixed loading
  2. 13 May, 2025 1 commit
  3. 07 May, 2025 1 commit
    • remove cuda v11 (#10569) · fa393554
      Daniel Hiltgen authored
      This reduces the size of our Windows installer payloads by ~256M by dropping
      support for nvidia drivers older than Feb 2023.  Hardware support is unchanged.
      
      Linux default bundle sizes are reduced by ~600M to 1G.
  4. 25 Feb, 2025 1 commit
  5. 17 Oct, 2024 1 commit
  6. 04 Sep, 2024 1 commit
  7. 19 Aug, 2024 2 commits
  8. 04 Jun, 2024 1 commit
  9. 23 Apr, 2024 1 commit
    • Request and model concurrency · 34b9db5a
      Daniel Hiltgen authored
      This change adds support for multiple concurrent requests, as well as
      loading multiple models by spawning multiple runners. The default
      settings are currently set at 1 concurrent request per model and only 1
      loaded model at a time, but these can be adjusted by setting
      OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS.
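
The two settings named in commit 34b9db5a (OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS, both defaulting to 1 according to the commit message) could be read roughly as follows. This is only an illustrative sketch, not the project's actual code: the `Settings` type and the `positiveIntOr` and `settingsFromEnv` helpers are hypothetical names, and falling back to the default on an invalid or non-positive value is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Settings is a hypothetical container for the two limits named in the commit message.
type Settings struct {
	NumParallel     int // concurrent requests per model
	MaxLoadedModels int // models kept loaded at the same time
}

// positiveIntOr returns the value of key parsed as a positive integer.
// It returns def when the variable is unset, not a number, or less than 1
// (the fallback behavior is an assumption made for this sketch).
func positiveIntOr(lookup func(string) (string, bool), key string, def int) int {
	raw, ok := lookup(key)
	if !ok {
		return def
	}
	n, err := strconv.Atoi(raw)
	if err != nil || n < 1 {
		return def
	}
	return n
}

// settingsFromEnv builds Settings from a lookup function such as os.LookupEnv.
// Both defaults are 1, matching the defaults stated in the commit message.
func settingsFromEnv(lookup func(string) (string, bool)) Settings {
	return Settings{
		NumParallel:     positiveIntOr(lookup, "OLLAMA_NUM_PARALLEL", 1),
		MaxLoadedModels: positiveIntOr(lookup, "OLLAMA_MAX_LOADED_MODELS", 1),
	}
}

func main() {
	s := settingsFromEnv(os.LookupEnv)
	fmt.Printf("parallel requests per model: %d, max loaded models: %d\n",
		s.NumParallel, s.MaxLoadedModels)
}
```

Passing the lookup function as a parameter keeps the parsing logic testable without touching the real process environment.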