  1. 17 Apr, 2024 1 commit
  2. 16 Apr, 2024 1 commit
  3. 01 Apr, 2024 2 commits
    • Apply 01-cache.diff · 0a0e9f3e
      Daniel Hiltgen authored
    • Switch back to subprocessing for llama.cpp · 58d95cc9
      Daniel Hiltgen authored
      This should resolve a number of memory-leak and stability defects by
      isolating llama.cpp in a separate process that we can shut down when
      idle and restart gracefully if it misbehaves. It also serves as a
      first step toward running multiple copies to support multiple models
      concurrently. A sketch of the lifecycle follows below.
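      A minimal sketch of the subprocess lifecycle this commit describes,
      assuming hypothetical names (llamaServer, start, reapWhenIdle); this is
      not Ollama's actual API, just an illustration of the isolate-and-reap
      pattern.

      ```go
      // Hypothetical sketch: run llama.cpp as a child process so crashes and
      // leaks are isolated, and reap it once it has been idle for a while.
      package llm

      import (
      	"context"
      	"os/exec"
      	"sync"
      	"time"
      )

      type llamaServer struct {
      	mu       sync.Mutex
      	cmd      *exec.Cmd
      	lastUsed time.Time
      }

      // start launches the llama.cpp binary as a subprocess; if it dies, only
      // that process is lost and the caller can start a fresh one.
      func start(ctx context.Context, bin string, args ...string) (*llamaServer, error) {
      	cmd := exec.CommandContext(ctx, bin, args...)
      	if err := cmd.Start(); err != nil {
      		return nil, err
      	}
      	return &llamaServer{cmd: cmd, lastUsed: time.Now()}, nil
      }

      // touch records activity so the idle reaper does not kill a busy server.
      func (s *llamaServer) touch() {
      	s.mu.Lock()
      	s.lastUsed = time.Now()
      	s.mu.Unlock()
      }

      // reapWhenIdle polls and shuts the subprocess down after idleTimeout of
      // inactivity; the next request simply starts a new one.
      func (s *llamaServer) reapWhenIdle(idleTimeout time.Duration) {
      	for {
      		time.Sleep(idleTimeout / 4)
      		s.mu.Lock()
      		idle := time.Since(s.lastUsed) > idleTimeout
      		s.mu.Unlock()
      		if idle {
      			s.cmd.Process.Kill()
      			s.cmd.Wait()
      			return
      		}
      	}
      }
      ```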
  4. 26 Mar, 2024 1 commit
  5. 23 Mar, 2024 1 commit
  6. 16 Mar, 2024 1 commit
  7. 12 Mar, 2024 3 commits
  8. 11 Mar, 2024 2 commits
  9. 09 Mar, 2024 1 commit
  10. 08 Mar, 2024 1 commit
  11. 01 Mar, 2024 1 commit
  12. 20 Feb, 2024 2 commits
  13. 14 Feb, 2024 1 commit
  14. 09 Feb, 2024 1 commit
    • Shutdown faster · 66807615
      Daniel Hiltgen authored
      Make sure that when a shutdown signal arrives, we shut down promptly
      instead of waiting for a potentially long exchange to wrap up. A sketch
      of the signal handling follows below.
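      A minimal sketch of prompt shutdown on a signal, assuming a stand-in for
      the long-running exchange; the structure (signal.NotifyContext plus a
      select on ctx.Done) is standard Go, not Ollama's actual handler.

      ```go
      // Hypothetical sketch: cancel in-flight work as soon as a shutdown
      // signal arrives rather than letting a long exchange run to completion.
      package main

      import (
      	"context"
      	"fmt"
      	"os"
      	"os/signal"
      	"syscall"
      	"time"
      )

      func main() {
      	// ctx is cancelled the moment SIGINT or SIGTERM is received.
      	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
      	defer stop()

      	// A long exchange races against cancellation instead of blocking it.
      	select {
      	case <-time.After(10 * time.Minute): // stands in for a slow generation
      		fmt.Println("exchange finished")
      	case <-ctx.Done():
      		fmt.Println("shutdown signal received, aborting exchange")
      	}
      }
      ```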
  15. 31 Jan, 2024 1 commit
  16. 22 Jan, 2024 1 commit
  17. 21 Jan, 2024 1 commit
  18. 17 Jan, 2024 1 commit
  19. 14 Jan, 2024 1 commit
  20. 11 Jan, 2024 1 commit
    • Support multiple variants for a given llm lib type · 8da7bef0
      Daniel Hiltgen authored
      In some cases we may want multiple variants for a given GPU type or
      CPU. This adds an optional Variant that we can use to select the
      optimal library, and it also lets us try the variants in turn when
      some fail to load.

      This is useful for scenarios such as the ROCm v5 vs. v6
      incompatibility, and potentially for CPU feature levels. A sketch of
      the fallback follows below.
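      A minimal sketch of variant fallback, assuming hypothetical names
      (tryVariants, loadLibrary) and an illustrative preference list; the
      real selection logic lives in Ollama's llm package.

      ```go
      // Hypothetical sketch: try library variants in preference order and
      // keep the first one that loads, e.g. rocm_v6 before rocm_v5 before
      // CPU feature levels.
      package llm

      import (
      	"fmt"
      	"log"
      )

      // tryVariants returns the first variant whose library loads, falling
      // through to the next candidate when one fails (e.g. a ROCm version
      // mismatch or a missing CPU feature).
      func tryVariants(variants []string) (string, error) {
      	for _, v := range variants {
      		if err := loadLibrary(v); err != nil {
      			log.Printf("variant %s failed to load: %v", v, err)
      			continue
      		}
      		return v, nil
      	}
      	return "", fmt.Errorf("no usable llm library variant found")
      }

      // loadLibrary is a stub standing in for dlopen-style loading of the
      // variant's shared library and verification of its symbols.
      func loadLibrary(variant string) error {
      	return nil
      }
      ```

      Usage would look like tryVariants([]string{"rocm_v6", "rocm_v5",
      "cpu_avx2", "cpu"}), with the list ordered from most to least
      preferred.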
  21. 10 Jan, 2024 1 commit
  22. 07 Jan, 2024 1 commit
  23. 04 Jan, 2024 1 commit