Commits · f63dc2db5c00de0f6b0b5ea9b53bb20e83513cea · OpenDAS / ollama

23 Jan, 2024 1 commit

Report more information about GPUs in verbose mode · 987c16b2

Daniel Hiltgen authored Jan 22, 2024

This adds calls to both the CUDA and ROCm management libraries to discover
more attributes of the GPU(s) detected in the system, and wires up runtime
verbosity selection. When users hit problems with GPUs, we can ask them to
run with `OLLAMA_DEBUG=1 ollama serve` and share the results.
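
A minimal sketch of what runtime verbosity selection could look like, not the code from this commit. It assumes any non-empty `OLLAMA_DEBUG` value turns verbose output on; the `gpuAttrs` type, its fields, and the `describe` helper are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// gpuAttrs is a hypothetical container for attributes a management
// library might report; the fields are illustrative only.
type gpuAttrs struct {
	Name     string
	Driver   string
	TotalMiB uint64
}

// debugEnabled reports whether verbose output was requested.
// Assumption: any non-empty OLLAMA_DEBUG value enables it.
// getenv is injected so the check can be exercised without touching
// the real environment.
func debugEnabled(getenv func(string) string) bool {
	return getenv("OLLAMA_DEBUG") != ""
}

// describe returns a terse line by default and a detailed line when
// verbose is true.
func describe(g gpuAttrs, verbose bool) string {
	if !verbose {
		return fmt.Sprintf("gpu: %s", g.Name)
	}
	return fmt.Sprintf("gpu: %s driver=%s total=%dMiB", g.Name, g.Driver, g.TotalMiB)
}

func main() {
	g := gpuAttrs{Name: "example-gpu", Driver: "0.0", TotalMiB: 8192}
	fmt.Println(describe(g, debugEnabled(os.Getenv)))
}
```

Reading the variable once at startup and passing the resulting flag down keeps the detailed attribute queries out of the default code path.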

10 Jan, 2024 1 commit

Harden GPU mgmt library lookup · 3c49c3ab

Daniel Hiltgen authored Jan 10, 2024

When multiple management libraries are installed on a system, not every one
will be compatible with the current driver. This change improves our
management library lookup algorithm: it builds up a set of discovered
libraries based on glob patterns, then tries each of them until one loads
without error.

07 Jan, 2024 1 commit

Detect very old CUDA GPUs and fall back to CPU · d74ce6bd

Daniel Hiltgen authored Jan 06, 2024

If we try to load the CUDA library on an old GPU, it panics and crashes
the server. This checks the compute capability before we load the
library so we can gracefully fall back to CPU mode.
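
As a rough illustration of the check described above (not the commit's implementation), the decision can be made from compute capability values alone, before any CUDA library is loaded. The `computeCapability` type, the `chooseBackend` helper, and the `minMajor` threshold are assumptions for this sketch; the commit message does not state the actual cutoff.

```go
package main

import "fmt"

// computeCapability is a hypothetical pair of major/minor version numbers
// as reported for a CUDA device.
type computeCapability struct {
	Major, Minor int
}

// minMajor is an assumed threshold chosen for illustration only.
const minMajor = 5

// supported reports whether a device meets the assumed minimum.
func supported(cc computeCapability) bool {
	return cc.Major >= minMajor
}

// chooseBackend returns "cuda" only when at least one device is present
// and every device is supported; otherwise it falls back to "cpu".
func chooseBackend(devices []computeCapability) string {
	if len(devices) == 0 {
		return "cpu"
	}
	for _, cc := range devices {
		if !supported(cc) {
			return "cpu"
		}
	}
	return "cuda"
}

func main() {
	fmt.Println(chooseBackend([]computeCapability{{Major: 3, Minor: 0}}))
	fmt.Println(chooseBackend([]computeCapability{{Major: 8, Minor: 6}}))
}
```

Whether a single old device should force CPU mode for the whole system is a design choice; this sketch takes the conservative option.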

05 Jan, 2024 1 commit

gpu: read memory info from all cuda devices (#1802) · df325373

Jeffrey Morgan authored Jan 05, 2024

* gpu: read memory info from all cuda devices

* add `LOOKUP_SIZE` constant

* better constant name

* address comments

19 Dec, 2023 1 commit

Adapted rocm support to cgo based llama.cpp · 35934b2e

Daniel Hiltgen authored Nov 29, 2023