- 27 Oct, 2023 3 commits
  - Jeffrey Morgan authored
  - Bruce MacDonald authored
  - Bruce MacDonald authored
- 24 Oct, 2023 3 commits
  - Jeffrey Morgan authored
  - Jeffrey Morgan authored
  - Jeffrey Morgan authored
- 23 Oct, 2023 3 commits
  - Michael Yang authored: Pin to 9e70cc03229df19ca2d28ce23cc817198f897278 for now, since 438c2ca83045a00ef244093d27e9ed41a8cb4ea9 is breaking.
  - Michael Yang authored
  - Michael Yang authored: GGUF v3 adds support for big endianness, mainly for the s390x architecture. While that's not currently supported in Ollama, the change is simple: loosen the version check to be more forward compatible. Unless specified, GGUF versions other than v1 will be decoded as v2 (see the sketch below).
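A minimal sketch of that loosened check, assuming a `binary.Read`-style decoder; the `containerGGUF` type and its `decodeV1`/`decodeV2` helpers are illustrative stand-ins, not Ollama's actual code:

```go
package gguf

import (
	"encoding/binary"
	"io"
)

// containerGGUF is an illustrative container type.
type containerGGUF struct {
	Version uint32
}

// Decode loosens the version check: v1 keeps its legacy layout, and any
// other version (v2, v3, ...) is decoded with the v2 layout, so newer
// versions such as v3 with big-endian support still parse.
func (c *containerGGUF) Decode(r io.Reader) error {
	if err := binary.Read(r, binary.LittleEndian, &c.Version); err != nil {
		return err
	}
	switch c.Version {
	case 1:
		return c.decodeV1(r)
	default:
		// Forward compatible: unless specified, decode as v2.
		return c.decodeV2(r)
	}
}

func (c *containerGGUF) decodeV1(r io.Reader) error { /* illustrative stub */ return nil }
func (c *containerGGUF) decodeV2(r io.Reader) error { /* illustrative stub */ return nil }
```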
- 19 Oct, 2023 2 commits
  - Jeffrey Morgan authored
  - Jeffrey Morgan authored: Add an error for Falcon and StarCoder vocab compatibility (see the sketch below). Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
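The shape of such a guard might look like the following; the function name and architecture strings are assumptions for illustration:

```go
package convert

import "fmt"

// checkVocabCompat returns a clear error for model families whose
// vocabularies are not yet convertible, rather than producing a model
// with a broken tokenizer. Names here are hypothetical.
func checkVocabCompat(arch string) error {
	switch arch {
	case "falcon", "starcoder":
		return fmt.Errorf("%s: vocabulary is not yet compatible", arch)
	}
	return nil
}
```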
- 18 Oct, 2023 6 commits
  - Arne Müller authored
  - Arne Müller authored
  - Arne Müller authored
  - Bruce MacDonald authored
  - Arne Müller authored
  - Arne Müller authored
- 17 Oct, 2023 5 commits
  - Michael Yang authored
  - Michael Yang authored
  - Bruce MacDonald authored
  - Bruce MacDonald authored
  - Arne Müller authored
- 16 Oct, 2023 3 commits
  - Michael Yang authored
  - Michael Yang authored: Omitting `--n-gpu-layers` means Metal is used on macOS, which isn't correct, since Ollama uses `num_gpu=0` to explicitly disable the GPU for file types that are not implemented in Metal (a sketch follows this list).
  - Arne Müller authored
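In other words, the flag should always be passed explicitly so that zero actually means zero. A sketch under that assumption; the binary name and the argument-building helper are illustrative, not Ollama's actual launch code:

```go
package llm

import (
	"os/exec"
	"strconv"
)

// runnerArgs always includes --n-gpu-layers, even when numGPU is 0.
// Leaving the flag off would let llama.cpp fall back to its default,
// which enables Metal on macOS.
func runnerArgs(model string, numGPU int) []string {
	return []string{
		"--model", model,
		"--n-gpu-layers", strconv.Itoa(numGPU),
	}
}

func startRunner(model string, numGPU int) (*exec.Cmd, error) {
	cmd := exec.Command("./llama-server", runnerArgs(model, numGPU)...) // binary name is illustrative
	return cmd, cmd.Start()
}
```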
- 13 Oct, 2023 8 commits
  - Michael Yang authored
  - Michael Yang authored
  - Michael Yang authored
  - Michael Yang authored
  - Michael Yang authored
  - Michael Yang authored
  - Michael Yang authored
  - Bruce MacDonald authored:
    - Remove newlines from llama.cpp error messages relayed to the client
    - Check API option types and return an error on a wrong type (see the sketch after this list)
    - Change the num-layers estimate from 95% of VRAM to 92% of VRAM
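A sketch of what type-checked options and the tightened VRAM budget could look like; the option names, the float64 convention (encoding/json decodes numbers to float64 in Go), and both helper signatures are assumptions:

```go
package api

import "fmt"

// setOption validates an option's type instead of silently coercing it,
// returning an error that names the offending option.
func setOption(opts map[string]any, key string, val any) error {
	switch key {
	case "temperature", "num_gpu":
		f, ok := val.(float64) // encoding/json decodes numbers as float64
		if !ok {
			return fmt.Errorf("option %q must be a number", key)
		}
		if key == "num_gpu" {
			opts[key] = int(f)
		} else {
			opts[key] = f
		}
	default:
		return fmt.Errorf("unknown option %q", key)
	}
	return nil
}

// layersForVRAM mirrors the 92% budget: estimate how many layers fit in
// 92% of available VRAM (bytesPerLayer is assumed to be nonzero).
func layersForVRAM(vramBytes, bytesPerLayer uint64) int {
	return int(vramBytes * 92 / 100 / bytesPerLayer)
}
```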
- 12 Oct, 2023 1 commit
  - Bruce MacDonald authored:
    - Give the user direction when the runner fails
    - Also relay errors from the timeout
    - Increase the timeout to 3 minutes (see the sketch after this list)
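A sketch of a 3-minute startup wait that surfaces a user-facing hint on failure; the health-check URL, the poll interval, and the error wording are assumptions:

```go
package llm

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

// waitForRunner polls the runner's health endpoint until it answers or
// the 3-minute deadline passes; the timeout error tells the user what
// to try next instead of failing silently.
func waitForRunner(url string) error {
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Minute)
	defer cancel()
	for {
		select {
		case <-ctx.Done():
			return fmt.Errorf("timed out waiting for llama runner to start: check the server logs or try a smaller model")
		case <-time.After(200 * time.Millisecond):
			resp, err := http.Get(url)
			if err != nil {
				continue // runner not up yet; keep polling
			}
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return nil
			}
		}
	}
}
```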
- 11 Oct, 2023 2 commits
  - Michael Yang authored
  - Bruce MacDonald authored:
    - Prevent waiting on an already-exited command
    - Close the llama runner only once (see the sketch after this list)
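`sync.Once` is the idiomatic way to guarantee a single close; this struct shape is illustrative, not the actual runner type:

```go
package llm

import (
	"os/exec"
	"sync"
)

// llamaRunner shuts its subprocess down exactly once; repeated Close
// calls are no-ops, so nothing ends up waiting on a command that has
// already exited.
type llamaRunner struct {
	cmd  *exec.Cmd
	once sync.Once
}

func (r *llamaRunner) Close() {
	r.once.Do(func() {
		if r.cmd != nil && r.cmd.Process != nil {
			r.cmd.Process.Kill() // best effort; error ignored in this sketch
			r.cmd.Wait()         // reap the process so it doesn't linger as a zombie
		}
	})
}
```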
- 10 Oct, 2023 1 commit
  - Bruce MacDonald authored:
    - Check free memory, not total memory (see the sketch after this list)
    - Wait for the subprocess to exit
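The point of the check is that a model can "fit" in total RAM yet still not be loadable on a busy machine. A sketch, with `freeMemory` as a hypothetical platform-specific helper:

```go
package llm

import "fmt"

// freeMemory is a hypothetical stand-in for a platform-specific query
// of currently free (not total) memory, in bytes.
func freeMemory() uint64 { /* platform-specific */ return 0 }

// checkMemory refuses to load a model that doesn't fit in free memory;
// comparing against total memory would pass even under memory pressure.
func checkMemory(required uint64) error {
	if free := freeMemory(); free < required {
		return fmt.Errorf("model needs %d bytes but only %d bytes are free", required, free)
	}
	return nil
}
```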
- 06 Oct, 2023 2 commits
  - Jeffrey Morgan authored
  - Bruce MacDonald authored: This makes it easier to see that the subprocess is associated with Ollama.
- 05 Oct, 2023 1 commit
  - Bruce MacDonald authored