Commits · 28a64e23ca320861a68ca89773bf7b41d965fbb2 · OpenDAS / ollama

25 Mar, 2024 1 commit
- add support for libcudart.so for CUDA devices (adds Jetson support) · dfc6721b
  Jeremy authored Mar 25, 2024
12 Mar, 2024 1 commit
- fix gpu_info_cuda.c compile warning (#3077) · 51578d85
  mofanke authored Mar 13, 2024
07 Mar, 2024 1 commit

Daniel Hiltgen authored Feb 15, 2024

This refines where we extract the LLM libraries to by adding a new
OLLAMA_HOME env var, which defaults to `~/.ollama`. The logic was already
idempotent, so this should speed up startups after the first time a
new release is deployed. It also cleans up after itself.
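
The default described above could be resolved roughly as follows. This is a minimal sketch, not the code from the commit: `resolveHome` is a hypothetical helper, and the only facts taken from the message are the `OLLAMA_HOME` name and the `~/.ollama` default.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveHome returns the directory to extract libraries into.
// envValue is the value of OLLAMA_HOME (empty if unset) and userHome is the
// user's home directory. Both are parameters so the logic is easy to test.
// Hypothetical helper: the commit does not show how the lookup is written.
func resolveHome(envValue, userHome string) string {
	if envValue != "" {
		return envValue
	}
	return filepath.Join(userHome, ".ollama")
}

func main() {
	userHome, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine home directory:", err)
		os.Exit(1)
	}
	fmt.Println(resolveHome(os.Getenv("OLLAMA_HOME"), userHome))
}
```

Because the same directory is returned on every start, a later start can reuse what an earlier one extracted.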

We now build only a single ROCm version (latest major) on both Windows
and Linux. Given the large size of ROCm's tensor files, we split the
dependency out. It's bundled into the installer on Windows, and is a
separate download on Linux. The Linux install script now detects the
presence of AMD GPUs, checks whether ROCm v6 is already present, and if
not, downloads our dependency tar file.

For Linux discovery, we now use sysfs and check each GPU against what
ROCm supports so we can degrade to CPU gracefully instead of having
llama.cpp+rocm assert/crash on us. For Windows, we now use Go's Windows
dynamic library loading logic to access the amdhip64.dll APIs to query
the GPU information.
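
A sysfs-based check of this kind might look like the sketch below. The commit message does not say which sysfs paths are read, so scanning `/sys/class/drm/card*/device/vendor` is an assumption, and `isAMDVendor` and `findAMDCards` are hypothetical names. The sketch only finds AMD devices by PCI vendor ID (`0x1002`); it does not perform the per-GPU ROCm support check the message mentions.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// amdVendorID is the PCI vendor ID assigned to AMD.
const amdVendorID = "0x1002"

// isAMDVendor reports whether the contents of a sysfs "vendor" file
// identify an AMD device.
func isAMDVendor(contents string) bool {
	return strings.EqualFold(strings.TrimSpace(contents), amdVendorID)
}

// findAMDCards scans drmRoot (assumed to be /sys/class/drm) and returns the
// device directories whose vendor file matches AMD.
func findAMDCards(drmRoot string) ([]string, error) {
	vendorFiles, err := filepath.Glob(filepath.Join(drmRoot, "card*", "device", "vendor"))
	if err != nil {
		return nil, err
	}
	var cards []string
	for _, vf := range vendorFiles {
		data, err := os.ReadFile(vf)
		if err != nil {
			continue // unreadable entry: skip it rather than fail discovery
		}
		if isAMDVendor(string(data)) {
			cards = append(cards, filepath.Dir(vf))
		}
	}
	return cards, nil
}

func main() {
	cards, err := findAMDCards("/sys/class/drm")
	if err != nil {
		fmt.Fprintln(os.Stderr, "discovery failed:", err)
		os.Exit(1)
	}
	if len(cards) == 0 {
		fmt.Println("no AMD GPU found, using CPU")
		return
	}
	fmt.Println("AMD GPU candidates:", cards)
}
```

Reading plain files keeps discovery independent of the ROCm libraries, so a missing or unsupported GPU leads to a CPU fallback instead of a crash.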

6c5ccb11

29 Feb, 2024 1 commit
- fix: print usedMemory size right (#2827) · fa2f2b35
  tylinux authored Mar 01, 2024
26 Jan, 2024 1 commit
- Fix crash on cuda ml init failure · 5d9c4a5f
  Daniel Hiltgen authored Jan 26, 2024

  The new driver lookup code was triggering after init failure due to a missing return.
24 Jan, 2024 1 commit

More logging for gpu management · 013fd071

Daniel Hiltgen authored Jan 24, 2024

Fix an ordering glitch of dlerr/dlclose and add more logging to help
root cause some crashes users are hitting. This also refines the
function pointer names to use the underlying function names instead
of simplified names for readability.

23 Jan, 2024 1 commit

Report more information about GPUs in verbose mode · 987c16b2

Daniel Hiltgen authored Jan 22, 2024

This adds additional calls to both CUDA and ROCm management libraries to
discover additional attributes about the GPU(s) detected in the system, and
wires up runtime verbosity selection. When users hit problems with GPUs we can
ask them to run with `OLLAMA_DEBUG=1 ollama serve` and share the results.
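
Runtime verbosity selection of this kind can be sketched with Go's standard `log/slog` package. Only the `OLLAMA_DEBUG` variable name comes from the message above. The rule that any non-empty value enables debug output is an assumption, `debugEnabled` is a hypothetical helper, and the logged GPU attributes are placeholder values.

```go
package main

import (
	"log/slog"
	"os"
)

// debugEnabled maps the value of OLLAMA_DEBUG to a boolean.
// Assumption: any non-empty value turns debug output on.
func debugEnabled(envValue string) bool {
	return envValue != ""
}

func main() {
	level := slog.LevelInfo
	if debugEnabled(os.Getenv("OLLAMA_DEBUG")) {
		level = slog.LevelDebug
	}
	handler := slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{Level: level})
	logger := slog.New(handler)

	// Placeholder values, not real discovery results.
	logger.Info("detected GPU", "name", "example-gpu")      // always shown
	logger.Debug("extra GPU attributes", "compute", "8.6") // shown only in debug mode
}
```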

10 Jan, 2024 1 commit

Harden GPU mgmt library lookup · 3c49c3ab

Daniel Hiltgen authored Jan 10, 2024

When there are multiple management libraries installed on a system,
not every one will be compatible with the current driver. This change
improves our management library lookup algorithm to build up a set of
discovered libraries based on glob patterns, and then try each of them
until we're able to load one without error.
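
The lookup described above can be sketched in two steps: expand the glob patterns into a candidate list, then try each candidate until one loads. Everything named here is hypothetical (`discover`, `loadFirst`, `errNoLibrary`, and the example search paths), and the loader is a stand-in; a real one would open the shared library and initialise it.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// errNoLibrary is returned when no candidate could be loaded.
var errNoLibrary = errors.New("no compatible management library found")

// discover expands each glob pattern and returns the de-duplicated matches
// in the order they were found.
func discover(patterns []string) []string {
	seen := map[string]bool{}
	var candidates []string
	for _, p := range patterns {
		matches, err := filepath.Glob(p)
		if err != nil {
			continue // malformed pattern: ignore it
		}
		for _, m := range matches {
			if !seen[m] {
				seen[m] = true
				candidates = append(candidates, m)
			}
		}
	}
	return candidates
}

// loadFirst tries each candidate with load and returns the first path that
// loads without error.
func loadFirst(candidates []string, load func(path string) error) (string, error) {
	for _, c := range candidates {
		if err := load(c); err != nil {
			fmt.Fprintf(os.Stderr, "skipping %s: %v\n", c, err)
			continue
		}
		return c, nil
	}
	return "", errNoLibrary
}

func main() {
	// Hypothetical patterns: the real search paths are not listed in the commit message.
	patterns := []string{
		"/usr/lib/x86_64-linux-gnu/libnvidia-ml.so*",
		"/usr/lib/wsl/lib/libnvidia-ml.so*",
	}
	// Stand-in loader that only checks the file exists.
	load := func(path string) error {
		_, err := os.Stat(path)
		return err
	}
	path, err := loadFirst(discover(patterns), load)
	if err != nil {
		fmt.Println("falling back:", err)
		return
	}
	fmt.Println("loaded", path)
}
```

Passing the loader in as a function keeps the retry loop separate from the platform-specific loading code.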

09 Jan, 2024 1 commit
- calculate overhead based on number of gpu devices (#1875) · c336693f
  Jeffrey Morgan authored Jan 09, 2024
07 Jan, 2024 1 commit

Detect very old CUDA GPUs and fall back to CPU · d74ce6bd

Daniel Hiltgen authored Jan 06, 2024

If we try to load the CUDA library on an old GPU, it panics and crashes
the server. This checks the compute capability before we load the
library so we can gracefully fall back to CPU mode.
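
The check-before-load idea can be sketched as below. The commit message does not state the minimum compute capability, so `minMajor` and `minMinor` are placeholder values, and `supported` and `chooseBackend` are hypothetical names.

```go
package main

import "fmt"

// Hypothetical minimum: the commit message does not state the real threshold.
const (
	minMajor = 5
	minMinor = 0
)

// supported reports whether compute capability major.minor meets the minimum.
func supported(major, minor int) bool {
	if major != minMajor {
		return major > minMajor
	}
	return minor >= minMinor
}

// chooseBackend returns "cuda" for supported GPUs and "cpu" otherwise, so
// the CUDA library is never loaded on hardware that is too old.
func chooseBackend(major, minor int) string {
	if supported(major, minor) {
		return "cuda"
	}
	return "cpu"
}

func main() {
	fmt.Println(chooseBackend(8, 6))
	fmt.Println(chooseBackend(3, 0))
}
```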

06 Jan, 2024 1 commit
- add cuda lib path for nvidia container toolkit · 1caa5612
  Jeffrey Morgan authored Jan 05, 2024
05 Jan, 2024 1 commit

gpu: read memory info from all cuda devices (#1802) · df325373

Jeffrey Morgan authored Jan 05, 2024

* gpu: read memory info from all cuda devices

* add `LOOKUP_SIZE` constant

* better constant name

* address comments
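
One way to combine memory information from several devices is to sum the per-device figures, as sketched below. Summing is an assumption here, since the commit message only says that all CUDA devices are read. `memInfo` and `sumMemory` are hypothetical names, and the device sizes are made-up sample data.

```go
package main

import "fmt"

// memInfo holds memory figures in bytes for one device or for a group of devices.
type memInfo struct {
	Total uint64
	Free  uint64
}

// sumMemory adds up total and free memory across all devices.
func sumMemory(devices []memInfo) memInfo {
	var sum memInfo
	for _, d := range devices {
		sum.Total += d.Total
		sum.Free += d.Free
	}
	return sum
}

func main() {
	const gib = 1 << 30
	// Sample data only, not values read from real hardware.
	devices := []memInfo{
		{Total: 24 * gib, Free: 20 * gib},
		{Total: 8 * gib, Free: 6 * gib},
	}
	all := sumMemory(devices)
	fmt.Printf("total=%d GiB free=%d GiB\n", all.Total/gib, all.Free/gib)
}
```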

03 Jan, 2024 1 commit

Fix windows system memory lookup · a2ad9524

Daniel Hiltgen authored Dec 22, 2023

This refines the gpu package error handling and fixes a bug with the
system memory lookup on windows.

19 Dec, 2023 4 commits
- Additional nvidia-ml path to check · 1d1eb168
  Daniel Hiltgen authored Dec 19, 2023
- Add WSL2 path to nvidia-ml.so library · 5646826a
  Daniel Hiltgen authored Dec 15, 2023
- Refine build to support CPU only · 1b991d0b
  Daniel Hiltgen authored Dec 13, 2023

  If someone checks out the ollama repo and doesn't install the CUDA
  library, this will ensure they can build a CPU only version.
- Adapted rocm support to cgo based llama.cpp · 35934b2e
  Daniel Hiltgen authored Nov 29, 2023