Commits · 76b8728f0cbb630ebd3df2cee4abe8b66786bb1f · OpenDAS / ollama

12 Feb, 2024 1 commit

Detect AMD GPU info via sysfs and block old cards · 6d84f075

Daniel Hiltgen authored Feb 11, 2024

This wires up new logic that uses sysfs to discover AMD GPU information
and detects old cards we can't yet support, so we can fall back to CPU mode.
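The detection step described above might look roughly like the following sketch. It is not the code from this commit. It assumes the GPU generation is read from the `gfx_target_version` line of a KFD topology `properties` file (assumed path `/sys/class/kfd/kfd/topology/nodes/<n>/properties`), that the value is encoded as `major*10000 + minor*100 + patch`, and that a hypothetical cutoff `minGfxMajor` decides which cards are too old. The function names are illustrative only.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// minGfxMajor is a hypothetical cutoff used only for illustration; the
// commit message does not say which generations are blocked.
const minGfxMajor = 9

// parseGfxTargetVersion scans the text of a "properties" file for the
// gfx_target_version line and splits the value into major, minor, patch,
// assuming the encoding major*10000 + minor*100 + patch.
func parseGfxTargetVersion(properties string) (int, int, int, error) {
	sc := bufio.NewScanner(strings.NewReader(properties))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 || fields[0] != "gfx_target_version" {
			continue
		}
		v, err := strconv.Atoi(fields[1])
		if err != nil {
			return 0, 0, 0, fmt.Errorf("bad gfx_target_version %q: %w", fields[1], err)
		}
		return v / 10000, (v / 100) % 100, v % 100, nil
	}
	return 0, 0, 0, fmt.Errorf("gfx_target_version not found")
}

// supported reports whether the node looks like a GPU new enough to use.
// Nodes without a usable version (for example CPU nodes) are rejected.
func supported(properties string) bool {
	major, _, _, err := parseGfxTargetVersion(properties)
	return err == nil && major >= minGfxMajor
}

func main() {
	// Sample file content; on a real system this would be read from sysfs.
	sample := "simd_count 256\ngfx_target_version 100300\nvendor_id 4098\n"
	if supported(sample) {
		fmt.Println("use GPU")
	} else {
		fmt.Println("fall back to CPU")
	}
}
```

Parsing from a string rather than opening the file directly keeps the sketch testable on machines without an AMD GPU.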

02 Feb, 2024 1 commit

Harden generate patching model · e1f50377

Daniel Hiltgen authored Feb 01, 2024

Only apply patches if we have any, and make sure to clean up
every file we patched at the end to leave the tree clean.

25 Jan, 2024 2 commits
- Fix clearing kv cache between requests with the same prompt (#2186) · a64570dc
  Jeffrey Morgan authored Jan 25, 2024
```
* Fix clearing kv cache between requests with the same prompt

* fix powershell script
```
- Update gen_linux.sh to find libcudart in separate directory · a4564232
  mraiser authored Jan 25, 2024
  
23 Jan, 2024 1 commit

Refine Accelerate usage on mac · 0f5b8433

Daniel Hiltgen authored Jan 22, 2024

For old Macs, Accelerate seems to cause crashes, but for
AVX2-capable Macs, it does not.

21 Jan, 2024 1 commit

Make CPU builds parallel and customizable AMD GPUs · df54c723

Daniel Hiltgen authored Jan 21, 2024

The Linux build now supports parallel CPU builds to speed things up.
This also exposes AMD GPU targets as an optional setting for advanced
users who want to alter our default set.

20 Jan, 2024 3 commits
- Add compute capability 5.0, 7.5, and 8.0 · a447a083
  Daniel Hiltgen authored Jan 20, 2024
  
- Add support for CUDA 5.2 cards · 681a9149
  Daniel Hiltgen authored Jan 20, 2024
  
- sign dylibs on macOS (#2101) · 4c54f0dd
  Jeffrey Morgan authored Jan 19, 2024
  
19 Jan, 2024 1 commit
- use `gzip` for runner embedding (#2067) · dc88cc39
  Jeffrey Morgan authored Jan 19, 2024
  
17 Jan, 2024 1 commit
- Add multiple CPU variants for Intel Mac · 1b249748
  Daniel Hiltgen authored Jan 12, 2024
```
This also refines the build process for the ext_server build.
```
16 Jan, 2024 1 commit

Bump llama.cpp to b1842 and add new cuda lib dep · 795674dd

Daniel Hiltgen authored Jan 10, 2024

Upstream llama.cpp has added a new dependency on the
NVIDIA CUDA Driver Library (libcuda.so), which is part of the
driver distribution, not the general CUDA libraries, and is not
available as an archive, so we cannot statically link it. This may
introduce some additional compatibility challenges which we'll
need to keep an eye on.

14 Jan, 2024 2 commits
- Fix typo in arm mac arch script · 3ca5f69c
  Daniel Hiltgen authored Jan 14, 2024
  
- Let gpu.go and gen_linux.sh also find CUDA on Arch Linux · f4bf1d51
  Alexander F. Rødseth authored Jan 14, 2024
  
13 Jan, 2024 3 commits
- Fix intel mac build · 2ecb2472
  Daniel Hiltgen authored Jan 13, 2024
```
Make sure we're building an x86 ext_server lib when cross-compiling
```
- add `gcc -lstdc++` flag for linux cpu (#1974) · 288ef8ff
  Jeffrey Morgan authored Jan 13, 2024
  
- use g++ to build `libext_server.so` on linux (#1972) · 4cf17990
  Jeffrey Morgan authored Jan 13, 2024
  
12 Jan, 2024 1 commit
- improve cuda detection (rel. issue #1704) · 905862e1
  Fabian Preiss authored Jan 09, 2024
  
11 Jan, 2024 3 commits

Always dynamically load the llm server library · 39928a42

Daniel Hiltgen authored Jan 09, 2024

This switches darwin to dynamic loading, and refactors the code now that no
static linking of the library is used on any platform.

Build multiple CPU variants and pick the best · d88c527b

Daniel Hiltgen authored Jan 07, 2024

This reduces the built-in Linux version to not use any vector extensions,
which enables the resulting builds to run under Rosetta on macOS in
Docker. Then at runtime it checks for the actual CPU vector
extensions and loads the best CPU library available.
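The runtime selection described above can be sketched as a pure function over detected CPU features. This is an illustration, not the code from this commit: the `cpuFeatures` type, the `pickCPUVariant` function, and the variant names `cpu`, `cpu_avx`, and `cpu_avx2` are assumptions. In a real program the feature flags could come from something like the `golang.org/x/sys/cpu` package (`cpu.X86.HasAVX`, `cpu.X86.HasAVX2`).

```go
package main

import "fmt"

// cpuFeatures holds the vector extensions detected at runtime.
// The type is hypothetical and only models the two flags used here.
type cpuFeatures struct {
	AVX  bool
	AVX2 bool
}

// pickCPUVariant returns the most capable library variant the CPU can run.
// The baseline "cpu" variant uses no vector extensions, so it always works.
func pickCPUVariant(f cpuFeatures) string {
	switch {
	case f.AVX2:
		return "cpu_avx2"
	case f.AVX:
		return "cpu_avx"
	default:
		return "cpu"
	}
}

func main() {
	// A CPU (or emulation layer) that exposes no AVX support gets the
	// baseline build.
	fmt.Println(pickCPUVariant(cpuFeatures{}))
	// A CPU with AVX2 gets the fastest variant.
	fmt.Println(pickCPUVariant(cpuFeatures{AVX: true, AVX2: true}))
}
```

Keeping the default build free of vector extensions means the fallback branch is always safe to load, which is the property the commit message relies on.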

Support multiple variants for a given llm lib type · 8da7bef0

Daniel Hiltgen authored Jan 05, 2024

In some cases we may want multiple variants for a given GPU type or CPU.
This adds logic to have an optional Variant which we can use to select
an optimal library, but also allows us to try multiple variants in case
some fail to load.

This can be useful for scenarios such as ROCm v5 vs v6 incompatibility
or potentially CPU features.

09 Jan, 2024 2 commits
- clean up cmake `build` directory when cross compiling macOS builds · 34344d80
  Jeffrey Morgan authored Jan 09, 2024
  
- only build for metal on `arm64` · 8a8c7e7f
  Jeffrey Morgan authored Jan 09, 2024
  
07 Jan, 2024 1 commit
- add `-DCMAKE_SYSTEM_NAME=Darwin` cmake flag (#1832) · dbdd50b2
  Jeffrey Morgan authored Jan 07, 2024
  
05 Jan, 2024 1 commit
- remove unused generate patches (#1810) · 3367b5f3
  Bruce MacDonald authored Jan 05, 2024
  
04 Jan, 2024 2 commits
- Clean up stale submodule · 9983fa5f
  Daniel Hiltgen authored Jan 04, 2024
```
If the tree has a stale submodule, make sure we clean it up first
```
- Code shuffle to clean up the llm dir · 77d96da9
  Daniel Hiltgen authored Jan 04, 2024
  