- 23 Oct, 2023 1 commit
  - Michael Yang authored
- 19 Oct, 2023 2 commits
  - Jeffrey Morgan authored
  - Jeffrey Morgan authored
    Add error for falcon and starcoder vocab compatibility.
    Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
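A minimal Go sketch of that kind of up-front guard. The function name and error wording here are illustrative assumptions, not the project's actual code; only the falcon/starcoder incompatibility is taken from the commit:

```go
package main

import "fmt"

// checkVocabCompatibility is a hypothetical pre-flight check: falcon and
// starcoder vocabularies are rejected up front with a descriptive error
// instead of failing later during inference.
func checkVocabCompatibility(arch string) error {
	switch arch {
	case "falcon", "starcoder":
		return fmt.Errorf("%s vocabulary is not compatible with this runner", arch)
	}
	return nil
}

func main() {
	if err := checkVocabCompatibility("falcon"); err != nil {
		fmt.Println("error:", err)
	}
}
```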
- 18 Oct, 2023 6 commits
  - Arne Müller authored
  - Arne Müller authored
  - Arne Müller authored
  - Bruce MacDonald authored
  - Arne Müller authored
  - Arne Müller authored
- 17 Oct, 2023 5 commits
  - Michael Yang authored
  - Michael Yang authored
  - Bruce MacDonald authored
  - Bruce MacDonald authored
  - Arne Müller authored
- 16 Oct, 2023 3 commits
  - Michael Yang authored
  - Michael Yang authored
    Omitting `--n-gpu-layers` means "use Metal" on macOS, which isn't correct: ollama uses `num_gpu=0` to explicitly disable the GPU for file types that are not implemented in Metal (see the sketch after this list).
  - Arne Müller authored
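A sketch of why the flag must always be passed explicitly. The `buildRunnerArgs` helper is a hypothetical name; only the `--n-gpu-layers` flag and the Metal-default behavior come from the commit:

```go
package main

import (
	"fmt"
	"strconv"
)

// buildRunnerArgs always passes --n-gpu-layers explicitly. On macOS,
// omitting the flag lets llama.cpp default to Metal, whereas an explicit
// 0 forces CPU-only inference for file types Metal can't handle.
func buildRunnerArgs(numGPU int) []string {
	return []string{"--n-gpu-layers", strconv.Itoa(numGPU)}
}

func main() {
	fmt.Println(buildRunnerArgs(0)) // [--n-gpu-layers 0] disables Metal
}
```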
- 13 Oct, 2023 8 commits
  - Michael Yang authored
  - Michael Yang authored
  - Michael Yang authored
  - Michael Yang authored
  - Michael Yang authored
  - Michael Yang authored
  - Michael Yang authored
  - Bruce MacDonald authored
    - Remove newlines from llama.cpp error messages relayed to the client.
    - Check API option types and return an error on a wrong type.
    - Change the GPU layer budget from 95% of VRAM to 92% of VRAM.
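A rough sketch of the 92% budgeting in the last bullet. The helper name and per-layer arithmetic are assumptions for illustration; only the change from 95% to 92% comes from the commit:

```go
package main

import "fmt"

// estimateGPULayers budgets 92% of available VRAM (down from 95%) for model
// layers, leaving headroom for scratch buffers and other allocations.
func estimateGPULayers(vramBytes, bytesPerLayer uint64, totalLayers int) int {
	budget := vramBytes * 92 / 100
	layers := int(budget / bytesPerLayer)
	if layers > totalLayers {
		layers = totalLayers
	}
	return layers
}

func main() {
	// hypothetical numbers: 8 GiB of VRAM, ~200 MiB per layer, 40-layer model
	fmt.Println(estimateGPULayers(8<<30, 200<<20, 40)) // prints 37
}
```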
- 12 Oct, 2023 1 commit
  - Bruce MacDonald authored
    * Give direction to the user when the runner fails.
    * Also relay errors from the timeout.
    * Increase the timeout to 3 minutes.
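A hedged sketch of that startup-timeout behavior. The `waitForRunner` name, polling interval, and error text are illustrative; only the 3-minute deadline and the idea of pointing the user somewhere actionable come from the commit:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitForRunner polls until the runner reports ready or a 3-minute deadline
// passes; on timeout it returns an error that gives the user direction
// instead of failing silently.
func waitForRunner(ready func() bool) error {
	deadline := time.Now().Add(3 * time.Minute)
	for time.Now().Before(deadline) {
		if ready() {
			return nil
		}
		time.Sleep(200 * time.Millisecond)
	}
	return errors.New("timed out waiting for llama runner to start; check the server log for details")
}

func main() {
	start := time.Now()
	// toy readiness check that succeeds after one second
	err := waitForRunner(func() bool { return time.Since(start) > time.Second })
	fmt.Println(err) // <nil>
}
```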
- 11 Oct, 2023 2 commits
  - Michael Yang authored
  - Bruce MacDonald authored
    * Prevent waiting on an exited command.
    * Close the llama runner only once.
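One common way to "close once" in Go is to guard shutdown with `sync.Once`; this sketch assumes that approach, which the commit message itself doesn't confirm:

```go
package main

import (
	"fmt"
	"os/exec"
	"sync"
)

// llamaRunner wraps a llama.cpp subprocess; Close is guarded by sync.Once
// so concurrent callers can't Wait on an already-exited command.
type llamaRunner struct {
	cmd  *exec.Cmd
	once sync.Once
	err  error
}

func (r *llamaRunner) Close() error {
	r.once.Do(func() {
		if r.cmd.Process != nil {
			_ = r.cmd.Process.Kill()
		}
		r.err = r.cmd.Wait() // Wait runs exactly once; later calls reuse r.err
	})
	return r.err
}

func main() {
	r := &llamaRunner{cmd: exec.Command("sleep", "60")}
	if err := r.cmd.Start(); err != nil {
		fmt.Println("start:", err)
		return
	}
	fmt.Println(r.Close()) // safe to call repeatedly
	fmt.Println(r.Close())
}
```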
- 10 Oct, 2023 1 commit
  - Bruce MacDonald authored
    * Check free memory, not total memory.
    * Wait for the subprocess to exit.
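A small sketch of the free-versus-total distinction in the first bullet; the function name and error formatting are illustrative assumptions:

```go
package main

import "fmt"

// canLoadModel checks the model's memory requirement against free system
// memory rather than total memory, since other processes may already be
// using a large share of RAM.
func canLoadModel(required, freeMemory uint64) error {
	if required > freeMemory {
		return fmt.Errorf("model requires %d MiB but only %d MiB is available",
			required>>20, freeMemory>>20)
	}
	return nil
}

func main() {
	fmt.Println(canLoadModel(7<<30, 4<<30)) // 7 GiB model, 4 GiB free
}
```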
- 06 Oct, 2023 2 commits
  - Jeffrey Morgan authored
  - Bruce MacDonald authored
    This makes it easier to see that the subprocess is associated with ollama.
- 05 Oct, 2023 1 commit
  - Bruce MacDonald authored
- 04 Oct, 2023 1 commit
  - Bruce MacDonald authored
- 03 Oct, 2023 1 commit
  - Michael Yang authored
- 02 Oct, 2023 2 commits
  - Bruce MacDonald authored
  - Bruce MacDonald authored
    * Include the seed in params for the llama.cpp server and remove the empty filter for temp.
    * Relay default predict options to llama.cpp; reorganize options to match the predict request for readability.
    * Omit empty stop.
    Co-authored-by: hallh <hallh@users.noreply.github.com>
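A sketch of the omit-empty behavior from the last bullet, using Go's `encoding/json` tags. The struct here is a reduced stand-in for ollama's actual request type, though `prompt`, `seed`, `temperature`, and `stop` are real llama.cpp server fields:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// predictRequest is a cut-down stand-in for the llama.cpp server's
// /completion body: Seed is always sent, while Stop uses omitempty so an
// empty stop list is left out of the JSON entirely.
type predictRequest struct {
	Prompt      string   `json:"prompt"`
	Seed        int      `json:"seed"`
	Temperature float32  `json:"temperature"`
	Stop        []string `json:"stop,omitempty"`
}

func main() {
	b, _ := json.Marshal(predictRequest{Prompt: "hi", Seed: 42, Temperature: 0.8})
	fmt.Println(string(b)) // no "stop" key when the list is empty
}
```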
- 29 Sep, 2023 1 commit
  - Bruce MacDonald authored
- 28 Sep, 2023 1 commit
  - Michael Yang authored
- 25 Sep, 2023 1 commit
  - Bruce MacDonald authored
    Co-authored-by: Michael Yang <mxyng@pm.me>
-
- 21 Sep, 2023 1 commit
  - Michael Yang authored