Commits · 0a9d34802375641d72279fd9870e2ddfd4120fbf · OpenDAS / ollama

12 Dec, 2023 2 commits
- exponential back-off (#1484) · 3144e2a4
  Bruce MacDonald authored Dec 12, 2023
  
  3144e2a4
- retry on concurrent request failure (#1483) · c0960e29
  Bruce MacDonald authored Dec 12, 2023
```
- remove parallel
```
  c0960e29
11 Dec, 2023 1 commit

Patrick Devine authored Dec 11, 2023

---------
Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local>

910e9401

10 Dec, 2023 4 commits
- fix model name returned by `/api/generate` being different than the model name provided · fa2f095b
  Jeffrey Morgan authored Dec 10, 2023
  
  fa2f095b
- seek to end of file when decoding older model formats · d9a250e9
  Jeffrey Morgan authored Dec 09, 2023
  
  d9a250e9
- seek to eof for older model binaries · 944519ed
  Jeffrey Morgan authored Dec 09, 2023
  
  944519ed
- do not use `--parallel 2` for old runners · 2dd040d0
  Jeffrey Morgan authored Dec 09, 2023
  
  2dd040d0
09 Dec, 2023 1 commit

fix: parallel queueing race condition caused silent failure (#1445) · bbe41ce4

Bruce MacDonald authored Dec 09, 2023

* fix: queued request failures

- increase parallel requests to 2 to complete queued requests; queueing is managed in ollama

* log stream errors

bbe41ce4

05 Dec, 2023 7 commits
- load projectors · b9495ea1
  Michael Yang authored Nov 30, 2023
  
  b9495ea1
- chat api endpoint (#1392) · 195e3d9d
  Bruce MacDonald authored Dec 05, 2023
  
  195e3d9d
- Revert "chat api (#991)" while context variable is fixed · 00d06619
  Jeffrey Morgan authored Dec 04, 2023
```
This reverts commit 7a0899d6.
```
  00d06619
- comments · 5a5dca13
  Michael Yang authored Nov 29, 2023
  
  5a5dca13
- seek instead of copyn · 72e7a49a
  Michael Yang authored Nov 29, 2023
  
  72e7a49a
- split from into one or more models · 2cb0fa7d
  Michael Yang authored Nov 24, 2023
  
  2cb0fa7d
- unnecessary ReadSeeker for DecodeGGML · b2816bca
  Michael Yang authored Nov 22, 2023
  
  b2816bca
04 Dec, 2023 2 commits

chat api (#991) · 7a0899d6

Bruce MacDonald authored Dec 04, 2023

- update chat docs
- add messages chat endpoint
- remove deprecated context and template generate parameters from docs
- context and template are still supported for the time being and will continue to work as expected
- add partial response to chat history

7a0899d6
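
The commit body above describes a messages-based chat endpoint and appending a partial response to the chat history. As a rough illustration, here is a sketch of how a client might keep such a history. The `message` type, its `role` and `content` fields, and the `appendTurn` helper are assumptions for this example, not code from the commit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// message is a hypothetical chat message shape used only in this sketch.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// appendTurn records the user's prompt and, if any reply text was
// received (even an incomplete one), stores it as an assistant message
// so the next request carries the full conversation so far.
func appendTurn(history []message, user, reply string) []message {
	history = append(history, message{Role: "user", Content: user})
	if reply != "" {
		history = append(history, message{Role: "assistant", Content: reply})
	}
	return history
}

func main() {
	var history []message
	// The reply here stands in for a response that was cut off mid-stream.
	history = appendTurn(history, "Why is the sky blue?", "Because of Rayleigh sc")
	out, err := json.Marshal(history)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```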

update for qwen · 6deebf24
Michael Yang authored Dec 04, 2023

6deebf24

26 Nov, 2023 2 commits
- add back `f16c` instructions on intel mac · 16a90063
  Jeffrey Morgan authored Nov 26, 2023
  
  16a90063
- update submodule commit · 9e4a3164
  Jeffrey Morgan authored Nov 26, 2023
  
  9e4a3164
24 Nov, 2023 2 commits

windows CUDA support (#1262) · 82b9b329

Jing Zhang authored Nov 24, 2023

* Support cuda build in Windows
* Enable dynamic NumGPU allocation for Windows

82b9b329

Disable CUDA peer access as a workaround for multi-gpu inference bug (#1261) · 12e8c12d

Jongwook Choi authored Nov 24, 2023

When CUDA peer access is enabled, multi-gpu inference will produce
garbage output. This is a known bug of llama.cpp (or nvidia). Until the
upstream bug is fixed, we can disable CUDA peer access temporarily
to ensure correct output.

See #961.

12e8c12d

22 Nov, 2023 2 commits
- consistent cpu instructions on macos and linux · d77dde12
  Jeffrey Morgan authored Nov 22, 2023
  
  d77dde12
- fix: gguf int type · 199941cd
  Michael Yang authored Nov 22, 2023
  
  199941cd
21 Nov, 2023 2 commits
- update llama.cpp · a00fac4e
  Michael Yang authored Nov 21, 2023
  
  a00fac4e
- only set `main_gpu` if value > 0 is provided · a3fcecf9
  Jeffrey Morgan authored Nov 20, 2023
  
  a3fcecf9
20 Nov, 2023 3 commits
- recent llama.cpp update added kernels for fp32, q5_0, and q5_1 · 19b7a4d7
  Michael Yang authored Nov 20, 2023
  
  19b7a4d7
- fix main-gpu argument not being passed to llama.cpp (#1192) · be61a817
  Purinda Gunasekara authored Nov 21, 2023
  
  be61a817
- enable cpu instructions on intel macs · 13ba6df5
  Jeffrey Morgan authored Nov 19, 2023
  
  13ba6df5
19 Nov, 2023 2 commits
- Update llm/llama.go · 36a3bbf6
  Jeffrey Morgan authored Nov 18, 2023
  
  36a3bbf6
- fix potentially inaccurate error message · 43a72614
  Bruce MacDonald authored Nov 17, 2023
  
  43a72614
17 Nov, 2023 1 commit
- build intel mac with correct binary and compile flags · 41434a7c
  Jeffrey Morgan authored Nov 16, 2023
  
  41434a7c
10 Nov, 2023 1 commit

JSON mode: add `"format"` as an API parameter (#1051) · 5cba29b9

Jeffrey Morgan authored Nov 09, 2023

* add `"format": "json"` as an API parameter
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>

5cba29b9
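
To make the `"format": "json"` parameter concrete, the sketch below builds a request body that includes it. The `generateRequest` struct and `buildBody` helper are hypothetical, the `model` and `prompt` fields are assumed alongside `format`, and `llama2` is only a placeholder model name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// generateRequest holds only the fields this sketch needs; it is not
// the project's own request type.
type generateRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
	Format string `json:"format,omitempty"`
}

// buildBody serializes a request and sets "format" to "json" only when
// JSON mode is requested, so other requests omit the field entirely.
func buildBody(model, prompt string, jsonMode bool) ([]byte, error) {
	req := generateRequest{Model: model, Prompt: prompt}
	if jsonMode {
		req.Format = "json"
	}
	return json.Marshal(req)
}

func main() {
	body, err := buildBody("llama2", "List three colors as JSON.", true)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
	// prints {"model":"llama2","prompt":"List three colors as JSON.","format":"json"}
}
```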

09 Nov, 2023 2 commits
- skip gpu if less than 2GB VRAM are available (#1059) · 1ae84bc2
  Bruce MacDonald authored Nov 09, 2023
  
  1ae84bc2
- instead of static number of parameters for each model family, get the real... · c5e1bbab
  Michael Yang authored Nov 08, 2023
```
instead of static number of parameters for each model family, get the real number from the tensors (#1022)

* parse tensor info

* refactor decoder

* return actual parameter count

* explicit rounding

* s/Human/HumanNumber/
```
  c5e1bbab
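
The first commit in this group skips the GPU when less than 2GB of VRAM is available. A check of that kind might look like the sketch below. The `useGPU` function, the `minVRAM` constant, and the reading of "2GB" as 2 GiB are assumptions, and the log does not say how free VRAM is actually measured.

```go
package main

import "fmt"

// minVRAM is the assumed threshold: 2 GiB expressed in bytes.
const minVRAM = 2 * 1024 * 1024 * 1024

// useGPU reports whether enough free VRAM is available to offload work
// to the GPU. Below the threshold the caller would fall back to the CPU.
func useGPU(freeVRAMBytes uint64) bool {
	return freeVRAMBytes >= minVRAM
}

func main() {
	for _, free := range []uint64{512 << 20, 2 << 30, 8 << 30} {
		fmt.Printf("free=%d bytes useGPU=%v\n", free, useGPU(free))
	}
}
```
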
04 Nov, 2023 1 commit
- remove unused `fmt.Println` · c44b6194
  Jeffrey Morgan authored Nov 03, 2023
  
  c44b6194
03 Nov, 2023 1 commit
- Restore system prompt on requests and default `num_keep` to `0` · 17678b72
  Jeffrey Morgan authored Nov 03, 2023
  
  17678b72
02 Nov, 2023 1 commit
- default rope params to 0 for new models (#968) · 2e537046
  Jeffrey Morgan authored Nov 02, 2023
  
  2e537046
31 Oct, 2023 1 commit
- append LD_LIBRARY_PATH · 642128b7
  Michael Yang authored Oct 31, 2023
  
  642128b7
27 Oct, 2023 2 commits
- restore building runner with `AVX` on by default (#900) · 3a1ed9ff
  Jeffrey Morgan authored Oct 27, 2023
  
  3a1ed9ff
- catch insufficient permissions nvidia err (#934) · 6d283882
  Bruce MacDonald authored Oct 27, 2023
  
  6d283882