Commits · 77295f716ef3f4cade6d93232d3b933db7c57dd7 · OpenDAS / ollama

11 Oct, 2023 1 commit
- prevent waiting on exited command (#752) · 77295f71
  Bruce MacDonald authored Oct 11, 2023
```
* prevent waiting on exited command
* close llama runner once
```
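
The "close llama runner once" item describes a common shutdown pattern. The commit's own code is not shown in this log, so the following is only a minimal sketch of one way to make a close operation idempotent in Go, using `sync.Once`. The `runner` type and its fields are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// runner is a hypothetical stand-in for a handle to a runner subprocess.
type runner struct {
	closeOnce sync.Once
	closed    int // counts how many times the shutdown body actually ran
}

// Close may be called from several code paths; the shutdown body runs only once.
func (r *runner) Close() {
	r.closeOnce.Do(func() {
		// Real code would signal the subprocess and release resources here.
		r.closed++
	})
}

func main() {
	r := &runner{}
	r.Close()
	r.Close() // second call is a no-op
	fmt.Println("shutdown body ran", r.closed, "time(s)")
}
```

Guarding the shutdown body this way means callers do not need to track whether the process has already exited before calling `Close`.
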
10 Oct, 2023 1 commit
- improve vram safety with 5% vram memory buffer (#724) · f2ba1311
  Bruce MacDonald authored Oct 10, 2023
```
* check free memory not total
* wait for subprocess to exit
```
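
As a rough illustration of the 5% buffer described above, the sketch below reserves 5% of the free (not total) VRAM before anything is allocated. The function name `usableVRAM` and the integer arithmetic are assumptions. How the commit applies the buffer is not shown in this log.

```go
package main

import "fmt"

// usableVRAM returns the portion of free VRAM left after reserving
// a 5% safety buffer. The input is free memory, not total memory.
func usableVRAM(freeBytes int64) int64 {
	return freeBytes * 95 / 100
}

func main() {
	free := int64(8000000000) // hypothetical reading of free VRAM in bytes
	fmt.Println(usableVRAM(free))
}
```
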
06 Oct, 2023 1 commit
- rename server subprocess (#700) · 5d22319a
  Bruce MacDonald authored Oct 06, 2023
```
- this makes it easier to see that the subprocess is associated with ollama
```
04 Oct, 2023 1 commit
- increase streaming buffer size (#692) · 9e2de1bd
  Bruce MacDonald authored Oct 04, 2023
  
03 Oct, 2023 1 commit
- starcoder · c02c0cd4
  Michael Yang authored Oct 02, 2023
  
02 Oct, 2023 2 commits

- clean up num_gpu calculation code (#673) · b1f71233
  Bruce MacDonald authored Oct 02, 2023
- Relay default values to llama runner (#672) · 1fbf3585
  Bruce MacDonald authored Oct 02, 2023

  * include seed in params for llama.cpp server and remove empty filter for temp
  * relay default predict options to llama.cpp
    - reorganize options to match predict request for readability
  * omit empty stop

  Co-authored-by: hallh <hallh@users.noreply.github.com>

29 Sep, 2023 1 commit
- windows runner fixes (#637) · 9771b1ec
  Bruce MacDonald authored Sep 29, 2023
  
28 Sep, 2023 1 commit
- use int64 consistently · f40b3de7
  Michael Yang authored Sep 28, 2023
  
25 Sep, 2023 1 commit
- unbound max num gpu layers (#591) · 86279f4a
  Bruce MacDonald authored Sep 25, 2023
```
---------
Co-authored-by: Michael Yang <mxyng@pm.me>
```
21 Sep, 2023 1 commit

- remove tmp directories created by previous servers (#559) · 4cba75ef
  Bruce MacDonald authored Sep 21, 2023

  * remove tmp directories created by previous servers
  * clean up on server stop
  * Update routes.go
  * Update server/routes.go
    Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
  * create top-level temp ollama dir
  * check file exists before creating

  Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
  Co-authored-by: Michael Yang <mxyng@pm.me>

20 Sep, 2023 2 commits
- only package 11.8 runner · 1255bc9b
  Bruce MacDonald authored Sep 20, 2023
  
- pack in cuda libs · 4e8be787
  Bruce MacDonald authored Sep 20, 2023
  
18 Sep, 2023 1 commit

- subprocess improvements (#524) · 66003e1d
  Bruce MacDonald authored Sep 18, 2023

  * subprocess improvements
    - increase start-up timeout
    - when runner fails to start fail rather than timing out
    - try runners in order rather than choosing 1 runner
    - embed metal runner in metal dir rather than gpu
    - refactor logging and error messages
  * Update llama.go
  * Update llama.go
  * simplify by using glob
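
Two of the items above, trying runners in order and failing as soon as a runner fails to start, can be sketched roughly as follows. This is not the code from the commit. `startFirst`, `failing`, and `working` are hypothetical names, and each candidate is modelled as a plain function that returns an error when it cannot start.

```go
package main

import (
	"errors"
	"fmt"
)

// errStart is a hypothetical start-up failure used by the example candidates.
var errStart = errors.New("start failed")

// failing and working are hypothetical runner start functions.
func failing() error { return errStart }
func working() error { return nil }

// startFirst tries each candidate in order and returns the index of the
// first one that starts. A failed start is reported immediately and the
// next candidate is tried; if none start, the last error is returned.
func startFirst(candidates []func() error) (int, error) {
	err := errors.New("no runner candidates")
	for i, start := range candidates {
		if startErr := start(); startErr != nil {
			err = fmt.Errorf("runner %d failed to start: %w", i, startErr)
			continue
		}
		return i, nil
	}
	return -1, err
}

func main() {
	idx, err := startFirst([]func() error{failing, working})
	fmt.Println(idx, err)
}
```

Returning the wrapped start-up error, rather than waiting for a timeout, keeps the underlying cause visible to the caller.
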

14 Sep, 2023 1 commit

- support for packaging in multiple cuda runners (#509) · 2540c918
  Bruce MacDonald authored Sep 14, 2023

  * enable packaging multiple cuda versions
  * use nvcc cuda version if available

  Co-authored-by: Michael Yang <mxyng@pm.me>

12 Sep, 2023 2 commits

- fix falcon decode · 7dee25a0
  Michael Yang authored Sep 12, 2023
```
get model and file type from bin file
```
- first pass at linux gpu support (#454) · f2216370
  Bruce MacDonald authored Sep 12, 2023

  * linux gpu support
  * handle multiple gpus
  * add cuda docker image (#488)

  Co-authored-by: Michael Yang <mxyng@pm.me>

07 Sep, 2023 1 commit
- GGUF support (#441) · 09dd2aef
  Bruce MacDonald authored Sep 07, 2023
  
06 Sep, 2023 2 commits
- use `osPath` in gpu check · 7de30085
  Jeffrey Morgan authored Sep 05, 2023
  
- macos `amd64` compatibility fixes · 213ffdb5
  Jeffrey Morgan authored Sep 05, 2023
  
05 Sep, 2023 1 commit
- fix empty response · 2bc06565
  Michael Yang authored Sep 05, 2023
  
03 Sep, 2023 2 commits
- fix not forwarding last token · 59a70552
  Michael Yang authored Sep 03, 2023
  
- remove marshalPrompt which is no longer needed · 5d3f314b
  Michael Yang authored Sep 03, 2023
  
30 Aug, 2023 1 commit

- subprocess llama.cpp server (#401) · 42998d79
  Bruce MacDonald authored Aug 30, 2023

  * remove c code
  * pack llama.cpp
  * use request context for llama_cpp
  * let llama_cpp decide the number of threads to use
  * stop llama runner when app stops
  * remove sample count and duration metrics
  * use go generate to get libraries
  * tmp dir for running llm