- 13 Oct, 2023 (8 commits)
  - Michael Yang authored
  - Michael Yang authored
  - Michael Yang authored
  - Michael Yang authored
  - Michael Yang authored
  - Michael Yang authored
  - Michael Yang authored
  - Bruce MacDonald authored
    - remove new lines from llama.cpp error messages relayed to client
    - check api option types and return error on wrong type
    - change num layers from 95% VRAM to 92% VRAM
- 12 Oct, 2023 (1 commit)
  - Bruce MacDonald authored
    - give direction to user when runner fails
    - also relay errors from timeout
    - increase timeout to 3 minutes
- 11 Oct, 2023 (2 commits)
  - Michael Yang authored
  - Bruce MacDonald authored
    - prevent waiting on exited command
    - close llama runner once
- 10 Oct, 2023 (1 commit)
  - Bruce MacDonald authored
    - check free memory not total
    - wait for subprocess to exit
- 06 Oct, 2023 (2 commits)
  - Jeffrey Morgan authored
  - Bruce MacDonald authored
    - this makes it easier to see that the subprocess is associated with ollama
- 05 Oct, 2023 (1 commit)
  - Bruce MacDonald authored
- 04 Oct, 2023 (1 commit)
  - Bruce MacDonald authored
- 03 Oct, 2023 (1 commit)
  - Michael Yang authored
- 02 Oct, 2023 (2 commits)
  - Bruce MacDonald authored
  - Bruce MacDonald authored
    - include seed in params for llama.cpp server and remove empty filter for temp
    - relay default predict options to llama.cpp; reorganize options to match predict request for readability
    - omit empty stop
    - Co-authored-by: hallh <hallh@users.noreply.github.com>
- 29 Sep, 2023 (1 commit)
  - Bruce MacDonald authored
- 28 Sep, 2023 (1 commit)
  - Michael Yang authored
- 25 Sep, 2023 (1 commit)
  - Bruce MacDonald authored
    - Co-authored-by: Michael Yang <mxyng@pm.me>
- 21 Sep, 2023 (3 commits)
  - Michael Yang authored
  - Michael Yang authored
  - Bruce MacDonald authored
    - remove tmp directories created by previous servers
    - clean up on server stop
    - Update routes.go
    - Update server/routes.go
    - create top-level temp ollama dir
    - check file exists before creating
    - Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
    - Co-authored-by: Michael Yang <mxyng@pm.me>
- 20 Sep, 2023 (6 commits)
  - Michael Yang authored
  - Michael Yang authored
  - Bruce MacDonald authored
  - Bruce MacDonald authored
  - Bruce MacDonald authored
  - Bruce MacDonald authored
- 18 Sep, 2023 (1 commit)
  - Bruce MacDonald authored
    - subprocess improvements:
      - increase start-up timeout
      - when runner fails to start, fail rather than timing out
      - try runners in order rather than choosing 1 runner
      - embed metal runner in metal dir rather than gpu
      - refactor logging and error messages
    - Update llama.go
    - Update llama.go
    - simplify by using glob
- 14 Sep, 2023 (1 commit)
  - Bruce MacDonald authored
    - enable packaging multiple cuda versions
    - use nvcc cuda version if available
    - Co-authored-by: Michael Yang <mxyng@pm.me>
- 13 Sep, 2023 (1 commit)
  - Michael Yang authored
- 12 Sep, 2023 (4 commits)
  - Michael Yang authored
  - Bruce MacDonald authored
  - Michael Yang authored
    - get model and file type from bin file
  - Bruce MacDonald authored
    - linux gpu support
    - handle multiple gpus
    - add cuda docker image (#488)
    - Co-authored-by: Michael Yang <mxyng@pm.me>
- 07 Sep, 2023 (1 commit)
  - Bruce MacDonald authored
- 06 Sep, 2023 (1 commit)
  - Jeffrey Morgan authored