Commits · de2fbdec991ac52ff015818b19482fdff22e2deb · OpenDAS / ollama

11 Jan, 2024 3 commits

Always dynamically load the llm server library · 39928a42

Daniel Hiltgen authored Jan 09, 2024

This switches darwin to dynamic loading and refactors the code, now that no
platform links the library statically.

39928a42

Build multiple CPU variants and pick the best · d88c527b

Daniel Hiltgen authored Jan 07, 2024

This builds the built-in Linux version without any vector extensions,
which lets the resulting builds run under Rosetta on macOS in
Docker. At runtime, it then checks for the CPU's actual vector
extensions and loads the best CPU library available.

d88c527b
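
The runtime selection described in this commit can be sketched roughly as below. This is a minimal illustration, not the code from the commit: the `cpuFeatures` type, the `pickCPUVariant` function, and the variant names `cpu`, `cpu_avx`, and `cpu_avx2` are all assumptions made for the example, and real feature detection (for instance via a CPU-detection package) is left out so the sketch stays self-contained.

```go
package main

// cpuFeatures is a hypothetical summary of the vector extensions
// detected on the host CPU at runtime.
type cpuFeatures struct {
	AVX  bool
	AVX2 bool
}

// pickCPUVariant returns the name of the most capable CPU library
// variant that the detected features allow. The variant names are
// illustrative assumptions, not taken from the commit.
func pickCPUVariant(f cpuFeatures) string {
	switch {
	case f.AVX2:
		return "cpu_avx2"
	case f.AVX:
		return "cpu_avx"
	default:
		// Baseline build without vector extensions, which is what
		// would run under emulation such as Rosetta.
		return "cpu"
	}
}
```

For example, `pickCPUVariant(cpuFeatures{AVX: true})` would select `cpu_avx`, while a host reporting no vector extensions falls back to the baseline `cpu` build.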

Support multiple variants for a given llm lib type · 8da7bef0

Daniel Hiltgen authored Jan 05, 2024

In some cases we may want multiple variants for a given GPU type or CPU.
This adds logic to have an optional Variant which we can use to select
an optimal library, but also allows us to try multiple variants in case
some fail to load.

This can be useful for scenarios such as ROCm v5 vs v6 incompatibility
or potentially CPU features.

8da7bef0
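
The "try multiple variants in case some fail to load" idea can be sketched as an ordered fallback loop. This is only an illustration under stated assumptions: `loadFirst`, the `load` callback, and `errNoVariants` are hypothetical names, and the actual commit may structure this differently.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoVariants is returned when the candidate list is empty.
var errNoVariants = errors.New("no variants to try")

// loadFirst tries each variant in preference order and returns the
// first one that loads. The load callback is a hypothetical stand-in
// for whatever actually opens the dynamic library.
func loadFirst(variants []string, load func(variant string) error) (string, error) {
	var errs []error
	for _, v := range variants {
		if err := load(v); err != nil {
			// Remember the failure and fall through to the next variant.
			errs = append(errs, fmt.Errorf("variant %q: %w", v, err))
			continue
		}
		return v, nil
	}
	if len(errs) == 0 {
		return "", errNoVariants
	}
	// Report every failure so the caller can see why nothing loaded.
	return "", errors.Join(errs...)
}
```

Called with a list such as `[]string{"rocm_v6", "rocm_v5"}` (hypothetical names), the function would return the second entry if the first fails to load, which matches the ROCm v5 vs v6 scenario the message mentions.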

04 Jan, 2024 3 commits

Code shuffle to clean up the llm dir · 77d96da9

Daniel Hiltgen authored Jan 04, 2024

77d96da9

update cmake flags for `amd64` macOS (#1780) · 29340c2e

Jeffrey Morgan authored Jan 03, 2024

* update cmake flags for intel macOS

* remove `LLAMA_K_QUANTS`

* put back `CMAKE_OSX_DEPLOYMENT_TARGET` and disable `LLAMA_F16C`

29340c2e

Fix CPU only builds · ddbfa6fe

Daniel Hiltgen authored Jan 03, 2024

Go embed fails when a pattern matches no files, so this adds
a dummy placeholder to allow building without any GPU support.
If no "server" library is found, it is safely ignored at runtime.

ddbfa6fe

03 Jan, 2024 1 commit

Improve maintainability of Radeon card list · 16f4603b

Daniel Hiltgen authored Jan 03, 2024

This moves the AMD GPU entries into an easier-to-maintain list, which
should simplify updates over time.

16f4603b

02 Jan, 2024 2 commits

Switch windows build to fully dynamic · d966b730

Daniel Hiltgen authored Dec 23, 2023

Refactor where we store build outputs, and support a fully dynamic loading
model on Windows so the base executable has no special dependencies and thus
doesn't require a special PATH.

d966b730

Refactor how we augment llama.cpp · 9a70aecc

Daniel Hiltgen authored Dec 22, 2023

This changes the model for llama.cpp inclusion so we're not applying a patch,
but instead have the C++ code directly in the ollama tree, which should make it
easier to refine and update over time.

9a70aecc

22 Dec, 2023 1 commit

Remove CPU build, fixup linux build script · fa24e73b

Daniel Hiltgen authored Dec 21, 2023

fa24e73b

20 Dec, 2023 1 commit

Revamp the dynamic library shim · 7555ea44

Daniel Hiltgen authored Dec 20, 2023

This switches the default llama.cpp to be CPU based, and builds the GPU variants
as dynamically loaded libraries which we can select at runtime.

This also bumps the ROCm library to version 6 given 5.7 builds don't work
on the latest ROCm library that just shipped.

7555ea44

19 Dec, 2023 3 commits

Refine build to support CPU only · 1b991d0b

Daniel Hiltgen authored Dec 13, 2023

If someone checks out the ollama repo and doesn't install the CUDA
library, this will ensure they can build a CPU only version

1b991d0b

Adapted rocm support to cgo based llama.cpp · 35934b2e

Daniel Hiltgen authored Nov 29, 2023

35934b2e

Add cgo implementation for llama.cpp · d4cd6957

Daniel Hiltgen authored Nov 13, 2023

Run the server.cpp directly inside the Go runtime via cgo
while retaining the LLM Go abstractions.

d4cd6957