Commits · b80661e8c78e115ed9b41391c87fdb7f1a7f69ec · OpenDAS / ollama

11 Mar, 2024 3 commits
- relay load model errors to the client (#3065) · b80661e8
  Bruce MacDonald authored Mar 11, 2024
  
  b80661e8
- update llama.cpp submodule to `ceca1ae` (#3064) · 369eda65
  Jeffrey Morgan authored Mar 11, 2024
  
  369eda65
- Avoid rocm runner and dependency clash · bc13da2b
  Daniel Hiltgen authored Mar 11, 2024
```
Putting the rocm symlink next to the runners is risky.  This moves
the payloads into a subdir to avoid potential clashes.
```
  bc13da2b
10 Mar, 2024 4 commits
- fix `03-locale.diff` · 41b00b98
  Jeffrey Morgan authored Mar 10, 2024
  
  41b00b98
- Harden for deps file being empty (or short) · 3dc1bb6a
  Daniel Hiltgen authored Mar 10, 2024
  
  3dc1bb6a
- patch: use default locale in wpm tokenizer (#3034) · 908005d9
  Jeffrey Morgan authored Mar 09, 2024
  
  908005d9
- add `bundle_metal` and `cleanup_metal` functions to `gen_darwin.sh` · e11668aa
  Jeffrey Morgan authored Mar 09, 2024
  
  e11668aa
09 Mar, 2024 4 commits
- update llama.cpp submodule to `77d1ac7` (#3030) · 1ffb1e28
  Jeffrey Morgan authored Mar 09, 2024
  
  1ffb1e28
- disable gpu for certain model architectures and fix divide-by-zero on memory estimation · f9cd55c7
  Jeffrey Morgan authored Mar 09, 2024
  
  f9cd55c7
- Finish unwinding idempotent payload logic · 4a5c9b80
  Daniel Hiltgen authored Mar 08, 2024
```
The recent ROCm change partially removed idempotent
payloads, but the ggml-metal.metal file for mac was still
idempotent.  This finishes switching to always extract
the payloads, and now that idempotency is gone, the
version directory is no longer useful.
```
  4a5c9b80
- update llama.cpp submodule to `c2101a2` (#3020) · efe5617b
  Jeffrey Morgan authored Mar 09, 2024
  
  efe5617b
08 Mar, 2024 2 commits
- decode ggla · 76bdebba
  Michael Yang authored Mar 08, 2024
  
  76bdebba
- update llama.cpp submodule to `6cdabe6` (#2999) · 0e4669b0
  Jeffrey Morgan authored Mar 08, 2024
  
  0e4669b0
07 Mar, 2024 3 commits

Daniel Hiltgen authored Feb 15, 2024

This refines where we extract the LLM libraries to by adding a new
OLLAMA_HOME env var that defaults to `~/.ollama`. The logic was already
idempotent, so this should speed up startups after the first time a
new release is deployed. It also cleans up after itself.

We now build only a single ROCm version (latest major) on both Windows
and Linux. Given the large size of ROCm's tensor files, we split the
dependency out. It's bundled into the installer on Windows, and a
separate download on Linux. The Linux install script is now smart: it
detects the presence of AMD GPUs, looks to see if ROCm v6 is already
present, and if not, downloads our dependency tar file.

For Linux discovery, we now use sysfs and check each GPU against what
ROCm supports so we can degrade to CPU gracefully instead of having
llama.cpp+rocm assert/crash on us. For Windows, we now use Go's Windows
dynamic library loading logic to access the amdhip64.dll APIs to query
the GPU information.

6c5ccb11
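
The `OLLAMA_HOME` lookup described in this commit message could be sketched in Go roughly as follows. This is a minimal illustration, assuming only what the message states (an env var that defaults to `~/.ollama`). The function names `resolveHome` and `ollamaHome` are hypothetical and are not taken from the ollama source.

```go
package main

import (
	"os"
	"path/filepath"
)

// resolveHome picks the extraction directory: the OLLAMA_HOME value when it
// is set, otherwise a ".ollama" directory under the user's home directory.
// (Hypothetical helper, named for illustration only.)
func resolveHome(envValue, userHome string) string {
	if envValue != "" {
		return envValue
	}
	return filepath.Join(userHome, ".ollama")
}

// ollamaHome reads the real environment and applies resolveHome.
// (Hypothetical helper, named for illustration only.)
func ollamaHome() (string, error) {
	env := os.Getenv("OLLAMA_HOME")
	if env != "" {
		return env, nil
	}
	home, err := os.UserHomeDir()
	if err != nil {
		return "", err
	}
	return resolveHome(env, home), nil
}
```

Keeping the decision in a pure function such as `resolveHome` makes the default easy to check without touching the real environment.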

fix some typos (#2973) · 23ebe8fe
John authored Mar 07, 2024
```
Signed-off-by: hishope <csqiye@126.com>
```
23ebe8fe
Convert Safetensors to an Ollama model (#2824) · 2c017ca4
Patrick Devine authored Mar 06, 2024

2c017ca4

01 Mar, 2024 1 commit
- update llama.cpp submodule to `c29af7e` (#2868) · 21347e1e
  Jeffrey Morgan authored Mar 01, 2024
  
  21347e1e
29 Feb, 2024 2 commits
- bump submodule to `87c91c07663b707e831c59ec373b5e665ff9d64a` (#2828) · cbf4970e
  Jeffrey Morgan authored Feb 29, 2024
  
  cbf4970e
- Omit build date from gzip headers · 76e5d9ec
  Bernhard M. Wiedemann authored Feb 29, 2024
```
See https://reproducible-builds.org/ for why this is good.

This patch was done while working on reproducible builds for openSUSE.
```
  76e5d9ec
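
To illustrate why a timestamp in the gzip header hurts reproducibility, here is a small Go sketch using the standard `compress/gzip` package. The commit message does not say how the build omits the date, so this shows the general idea rather than the actual patch. The helper name `gzipBytes` and its `unixSeconds` parameter are assumptions made for the example.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"time"
)

// gzipBytes compresses data. When unixSeconds is 0 the header's ModTime is
// left at its zero value, so the MTIME field (header bytes 4..7) is written
// as zeros; otherwise the given timestamp is stamped into the header.
// (Hypothetical helper, named for illustration only.)
func gzipBytes(data []byte, unixSeconds int64) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if unixSeconds > 0 {
		zw.ModTime = time.Unix(unixSeconds, 0)
	}
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	// Close flushes the compressed stream and writes the gzip trailer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}
```

Calling `gzipBytes(data, 0)` twice on the same input yields byte-identical archives, while stamping a build time makes the output depend on when the build ran.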
27 Feb, 2024 1 commit
- Bump llama.cpp to b2276 · 061e8f6a
  Daniel Hiltgen authored Feb 26, 2024
  
  061e8f6a
22 Feb, 2024 1 commit
- update llama.cpp submodule to `96633eeca1265ed03e57230de54032041c58f9cd` · 11bfff8e
  Jeffrey Morgan authored Feb 22, 2024
  
  11bfff8e
21 Feb, 2024 4 commits
- reset with `init_vars` ahead of each cpu build in `gen_windows.ps1` (#2654) · efe040f8
  Jeffrey Morgan authored Feb 21, 2024
  
  efe040f8
- update llama.cpp submodule to `c14f72d` · 2a7553ce
  Jeffrey Morgan authored Feb 21, 2024
  
  2a7553ce
- update llama.cpp submodule to `f0d1fafc029a056cd765bdae58dcaa12312e9879` · b3eac61c
  Jeffrey Morgan authored Feb 20, 2024
  
  b3eac61c
- add gguf file types (#2532) · 949d7b1c
  Michael Yang authored Feb 20, 2024
  
  949d7b1c
20 Feb, 2024 2 commits
- update llama.cpp submodule to `66c1968f7` (#2618) · 4613a080
  Jeffrey Morgan authored Feb 20, 2024
  
  4613a080
- [nit] Remove unused msg local var. (#2511) · 01ff2e14
  Taras Tsugrii authored Feb 20, 2024
  
  01ff2e14
19 Feb, 2024 1 commit

Fix cuda leaks · fc39a6cd

Daniel Hiltgen authored Feb 18, 2024

This should resolve the problem where we don't fully unload from the GPU
when we go idle.

fc39a6cd

16 Feb, 2024 1 commit
- Fix duplicate menus on update and exit on signals · df6dc4fd
  Daniel Hiltgen authored Feb 16, 2024
```
Also fixes a few fit-and-finish items for better developer experience
```
  df6dc4fd
15 Feb, 2024 2 commits

Explicitly disable AVX2 on GPU builds · db2a9ad1

Daniel Hiltgen authored Feb 15, 2024

Even though we weren't setting it to on, somewhere in the cmake config
it was getting toggled on.  By explicitly setting it to off, we get `/arch:AVX`
as intended.

db2a9ad1

Implement new Go based Desktop app · 29e90cc1

Daniel Hiltgen authored Dec 26, 2023

This focuses on Windows first, but could be used for Mac
and possibly Linux in the future.

29e90cc1

14 Feb, 2024 4 commits
- Revert "Revert "bump submodule to `6c00a06` (#2479)"" (#2485) · 9241a293
  Jeffrey Morgan authored Feb 13, 2024
```
This reverts commit 6920964b.
```
  9241a293
- set `shutting_down` to `false` once shutdown is complete (#2484) · f7231ad9
  Jeffrey Morgan authored Feb 13, 2024
  
  f7231ad9
- Revert "bump submodule to `6c00a06` (#2479)" · 6920964b
  Jeffrey Morgan authored Feb 13, 2024
```
This reverts commit 2f9ed52b.
```
  6920964b
- bump submodule to `6c00a06` (#2479) · 2f9ed52b
  Jeffrey Morgan authored Feb 13, 2024
  
  2f9ed52b
12 Feb, 2024 3 commits
- update submodule to `099afc6` (#2468) · f76ca04f
  Jeffrey Morgan authored Feb 12, 2024
  
  f76ca04f
- Detect AMD GPU info via sysfs and block old cards · 6d84f075
  Daniel Hiltgen authored Feb 11, 2024
```
This wires up some new logic to start using sysfs to discover AMD GPU
information and detects old cards we can't yet support so we can fall back to CPU mode.
```
  6d84f075
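
As a rough sketch of sysfs-based discovery, the Go code below scans the DRM class directory for devices whose PCI vendor ID is AMD's (`0x1002`). It is an assumption-laden illustration: the commit does not say which sysfs paths it reads, and the check for unsupported older cards is not shown. The names `isAMDVendor` and `amdCards` are hypothetical.

```go
package main

import (
	"os"
	"path/filepath"
	"strings"
)

// amdVendorID is the PCI vendor ID assigned to AMD.
const amdVendorID = "0x1002"

// isAMDVendor reports whether the contents of a sysfs "vendor" file
// identify an AMD device. (Hypothetical helper.)
func isAMDVendor(contents string) bool {
	return strings.EqualFold(strings.TrimSpace(contents), amdVendorID)
}

// amdCards returns the card directories under <sysfsRoot>/class/drm whose
// device vendor is AMD. The path layout is an assumption for this sketch;
// the real implementation may read different sysfs nodes.
func amdCards(sysfsRoot string) ([]string, error) {
	pattern := filepath.Join(sysfsRoot, "class", "drm", "card*", "device", "vendor")
	matches, err := filepath.Glob(pattern)
	if err != nil {
		return nil, err
	}
	var cards []string
	for _, m := range matches {
		data, err := os.ReadFile(m)
		if err != nil {
			continue // unreadable entries are skipped rather than treated as fatal
		}
		if isAMDVendor(string(data)) {
			// m is <card>/device/vendor, so go up two levels to the card directory.
			cards = append(cards, filepath.Dir(filepath.Dir(m)))
		}
	}
	return cards, nil
}
```

On a typical Linux host this would be called as `amdCards("/sys")`. An empty result is the signal to fall back to CPU mode.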
- patch: always add token to cache_tokens (#2459) · 26b13fc3
  Jeffrey Morgan authored Feb 12, 2024
  
  26b13fc3
09 Feb, 2024 1 commit

Shutdown faster · 66807615

Daniel Hiltgen authored Feb 08, 2024

Make sure that when a shutdown signal comes, we shut down quickly instead
of waiting for a potentially long exchange to wrap up.

66807615

08 Feb, 2024 1 commit

Ensure the libraries are present · a1dfab43

Daniel Hiltgen authored Feb 07, 2024

When we store our libraries in a temp dir, a reaper might clean
them when we are idle, so make sure to check for them before
we reload.

a1dfab43
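
The check described in this commit message could look something like the following Go sketch: confirm the library file still exists before reloading, and re-extract it if a temp-dir reaper removed it. The function name `ensurePresent` and the `extract` callback are hypothetical and only illustrate the pattern.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// ensurePresent verifies that path exists before a reload. If the file has
// disappeared, it calls extract to restore it. Any other stat error is
// returned to the caller. (Hypothetical helper, named for illustration only.)
func ensurePresent(path string, extract func() error) error {
	_, err := os.Stat(path)
	if err == nil {
		return nil // still on disk, nothing to do
	}
	if !errors.Is(err, fs.ErrNotExist) {
		return fmt.Errorf("stat %s: %w", path, err)
	}
	// The file is gone (for example, cleaned by a temp-dir reaper): restore it.
	return extract()
}
```

Passing the extraction step as a callback keeps the existence check independent of how the payloads are actually unpacked.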