Commits · 63efa075a0e82688e4fb49fa0bcc081db5f2a5b7 · OpenDAS / ollama

07 Apr, 2024 1 commit
- update generate scripts with new `LLAMA_CUDA` variable, set `HIP_PLATFORM` to... · 63efa075
  Jeffrey Morgan authored Apr 07, 2024
```
update generate scripts with new `LLAMA_CUDA` variable, set `HIP_PLATFORM` to avoid compiler errors (#3528)
```
  63efa075
04 Apr, 2024 2 commits
- Fail fast if mingw missing on windows · 36bd9677
  Daniel Hiltgen authored Apr 04, 2024
  
  36bd9677
- fix dll compress in windows building · 4de01267
  mofanke authored Apr 04, 2024
  
  4de01267
03 Apr, 2024 2 commits
- Fix CI release glitches · e4a7e5b2
  Daniel Hiltgen authored Apr 03, 2024
```
The subprocess change moved the build directory
arm64 builds weren't setting cross-compilation flags when building on x86
```
  e4a7e5b2
- Fix macOS builds on older SDKs (#3467) · cd135317
  Jeffrey Morgan authored Apr 03, 2024
  
  cd135317
01 Apr, 2024 1 commit

Switch back to subprocessing for llama.cpp · 58d95cc9

Daniel Hiltgen authored Mar 14, 2024

This should resolve a number of memory leak and stability defects by allowing
us to isolate llama.cpp in a separate process and shut down when idle, and
gracefully restart if it has problems. This also serves as a first step to be
able to run multiple copies to support multiple models concurrently.

58d95cc9

26 Mar, 2024 1 commit
- remove need for `$VSINSTALLDIR` since build will fail if `ninja` cannot be found (#3350) · 856b8ec1
  Jeffrey Morgan authored Mar 26, 2024
  
  856b8ec1
25 Mar, 2024 1 commit
- add support for libcudart.so for CUDA devices (adds Jetson support) · dfc6721b
  Jeremy authored Mar 25, 2024
  
  dfc6721b
15 Mar, 2024 2 commits
- Add Radeon gfx940-942 GPU support · d4c10df2
  Daniel Hiltgen authored Mar 15, 2024
  
  d4c10df2
- Wire up more complete CI for releases · 540f4af4
  Daniel Hiltgen authored Mar 07, 2024
```
Flesh out our GitHub Actions CI so we can build official releases.
```
  540f4af4
12 Mar, 2024 1 commit
- Adapt our build for imported server.cpp · 85129d3a
  Daniel Hiltgen authored Mar 12, 2024
  
  85129d3a
11 Mar, 2024 2 commits
- update llama.cpp submodule to `ceca1ae` (#3064) · 369eda65
  Jeffrey Morgan authored Mar 11, 2024
  
  369eda65
- Avoid rocm runner and dependency clash · bc13da2b
  Daniel Hiltgen authored Mar 11, 2024
```
Putting the rocm symlink next to the runners is risky.  This moves
the payloads into a subdir to avoid potential clashes.
```
  bc13da2b
10 Mar, 2024 2 commits
- Harden for deps file being empty (or short) · 3dc1bb6a
  Daniel Hiltgen authored Mar 10, 2024
  
  3dc1bb6a
- add `bundle_metal` and `cleanup_metal` functions to `gen_darwin.sh` · e11668aa
  Jeffrey Morgan authored Mar 09, 2024
  
  e11668aa
09 Mar, 2024 1 commit
- update llama.cpp submodule to `77d1ac7` (#3030) · 1ffb1e28
  Jeffrey Morgan authored Mar 09, 2024
  
  1ffb1e28
07 Mar, 2024 2 commits

Revamp ROCm support · 6c5ccb11

Daniel Hiltgen authored Feb 15, 2024

This refines where we extract the LLM libraries to by adding a new
OLLAMA_HOME env var, that defaults to `~/.ollama`. The logic was already
idempotent, so this should speed up startups after the first time a
new release is deployed. It also cleans up after itself.

We now build only a single ROCm version (latest major) on both windows
and linux. Given the large size of ROCm's tensor files, we split the
dependency out. It's bundled into the installer on windows, and a
separate download on linux. The linux install script is now smart and
detects the presence of AMD GPUs and looks to see if rocm v6 is already
present, and if not, then downloads our dependency tar file.

For Linux discovery, we now use sysfs and check each GPU against what
ROCm supports so we can degrade to CPU gracefully instead of having
llama.cpp+rocm assert/crash on us. For Windows, we now use Go's windows
dynamic library loading logic to access the amdhip64.dll APIs to query
the GPU information.

6c5ccb11

fix some typos (#2973) · 23ebe8fe
John authored Mar 07, 2024
```
Signed-off-by: hishope <csqiye@126.com>
```
23ebe8fe

29 Feb, 2024 1 commit

Omit build date from gzip headers · 76e5d9ec

Bernhard M. Wiedemann authored Feb 29, 2024

See https://reproducible-builds.org/ for why this is good.

This patch was done while working on reproducible builds for openSUSE.

76e5d9ec

21 Feb, 2024 1 commit
- reset with `init_vars` ahead of each cpu build in `gen_windows.ps1` (#2654) · efe040f8
  Jeffrey Morgan authored Feb 21, 2024
  
  efe040f8
16 Feb, 2024 1 commit
- Fix duplicate menus on update and exit on signals · df6dc4fd
  Daniel Hiltgen authored Feb 16, 2024
```
Also fixes a few fit-and-finish items for better developer experience
```
  df6dc4fd
15 Feb, 2024 2 commits

Explicitly disable AVX2 on GPU builds · db2a9ad1

Daniel Hiltgen authored Feb 15, 2024

Even though we weren't setting it to on, somewhere in the cmake config
it was getting toggled on.  By explicitly setting it to off, we get `/arch:AVX`
as intended.

db2a9ad1

Implement new Go based Desktop app · 29e90cc1

Daniel Hiltgen authored Dec 26, 2023

This focuses on Windows first, but could be used for Mac
and possibly linux in the future.

29e90cc1

12 Feb, 2024 1 commit

Detect AMD GPU info via sysfs and block old cards · 6d84f075

Daniel Hiltgen authored Feb 11, 2024

This wires up some new logic to start using sysfs to discover AMD GPU
information and detects old cards we can't yet support so we can fall back to CPU mode.

6d84f075

02 Feb, 2024 1 commit

Harden generate patching model · e1f50377

Daniel Hiltgen authored Feb 01, 2024

Only apply patches if we have any, and make sure to clean up
every file we patched at the end to leave the tree clean.

e1f50377

25 Jan, 2024 2 commits
- Fix clearing kv cache between requests with the same prompt (#2186) · a64570dc
  Jeffrey Morgan authored Jan 25, 2024
```
* Fix clearing kv cache between requests with the same prompt

* fix powershell script
```
  a64570dc
- Update gen_linux.sh to find libcudart in separate directory · a4564232
  mraiser authored Jan 25, 2024
  
  a4564232
23 Jan, 2024 1 commit

Refine Accelerate usage on mac · 0f5b8433

Daniel Hiltgen authored Jan 22, 2024

For old Macs, Accelerate seems to cause crashes, but for
AVX2 capable Macs, it does not.

0f5b8433

21 Jan, 2024 1 commit

Make CPU builds parallel and customizable AMD GPUs · df54c723

Daniel Hiltgen authored Jan 21, 2024

The linux build now supports parallel CPU builds to speed things up.
This also exposes AMD GPU targets as an optional setting for advanced
users who want to alter our default set.

df54c723

20 Jan, 2024 3 commits
- Add compute capability 5.0, 7.5, and 8.0 · a447a083
  Daniel Hiltgen authored Jan 20, 2024
  
  a447a083
- Add support for CUDA 5.2 cards · 681a9149
  Daniel Hiltgen authored Jan 20, 2024
  
  681a9149
- sign dylibs on macOS (#2101) · 4c54f0dd
  Jeffrey Morgan authored Jan 19, 2024
  
  4c54f0dd
19 Jan, 2024 1 commit
- use `gzip` for runner embedding (#2067) · dc88cc39
  Jeffrey Morgan authored Jan 19, 2024
  
  dc88cc39
17 Jan, 2024 1 commit
- Add multiple CPU variants for Intel Mac · 1b249748
  Daniel Hiltgen authored Jan 12, 2024
```
This also refines the build process for the ext_server build.
```
  1b249748
16 Jan, 2024 1 commit

Bump llama.cpp to b1842 and add new cuda lib dep · 795674dd

Daniel Hiltgen authored Jan 10, 2024

Upstream llama.cpp has added a new dependency with the
NVIDIA CUDA Driver Libraries (libcuda.so) which is part of the
driver distribution, not the general cuda libraries, and is not
available as an archive, so we cannot statically link it.  This may
introduce some additional compatibility challenges which we'll
need to keep an eye on.

795674dd

14 Jan, 2024 2 commits
- Fix typo in arm mac arch script · 3ca5f69c
  Daniel Hiltgen authored Jan 14, 2024
  
  3ca5f69c
- Let gpu.go and gen_linux.sh also find CUDA on Arch Linux · f4bf1d51
  Alexander F. Rødseth authored Jan 14, 2024
  
  f4bf1d51
13 Jan, 2024 3 commits
- Fix intel mac build · 2ecb2472
  Daniel Hiltgen authored Jan 13, 2024
```
Make sure we're building an x86 ext_server lib when cross-compiling
```
  2ecb2472
- add `gcc -lstdc++` flag for linux cpu (#1974) · 288ef8ff
  Jeffrey Morgan authored Jan 13, 2024
  
  288ef8ff
- use g++ to build `libext_server.so` on linux (#1972) · 4cf17990
  Jeffrey Morgan authored Jan 13, 2024
  
  4cf17990