- 19 Dec, 2023 (2 commits)

Daniel Hiltgen authored
Run server.cpp directly inside the Go runtime via cgo while retaining the LLM Go abstractions.
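As a rough sketch of what calling C++ inference code from the Go runtime via cgo can look like (illustrative only: the `server_start` entry point and build setup are assumptions, not the actual Ollama bindings, and the snippet will not link without the corresponding C++ object code):

```go
package llm

/*
// Hypothetical preamble: the real change compiles server.cpp and its llama.cpp
// dependencies into the binary; this only shows the cgo calling shape.
#include <stdlib.h>
extern int server_start(const char* model_path);
*/
import "C"

import (
	"fmt"
	"unsafe"
)

// startServer hands a model path to the embedded C++ server and reports
// failure via the C return code.
func startServer(modelPath string) error {
	cPath := C.CString(modelPath)
	defer C.free(unsafe.Pointer(cPath))
	if rc := C.server_start(cPath); rc != 0 {
		return fmt.Errorf("server_start failed with code %d", int(rc))
	}
	return nil
}
```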

Bruce MacDonald authored
- remove the ggml runner
- automatically pull gguf models when ggml is detected
- tell users to update to gguf if the automatic pull fails
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>

- 18 Dec, 2023 (3 commits)

Jeffrey Morgan authored

Jeffrey Morgan authored

Jeffrey Morgan authored

- 14 Dec, 2023 (1 commit)

Bruce MacDonald authored
* restore model load duration on generate response
  - set model load duration on the generate and chat done responses
  - calculate the createdAt time when the response is created
* remove checkpoints predict opts
* Update routes.go
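A minimal sketch of the timing idea, assuming illustrative types and a hypothetical `loadModel` helper rather than the actual Ollama route handlers:

```go
package llm

import "time"

// GenerateResponse sketches only the fields relevant here; real response types differ.
type GenerateResponse struct {
	CreatedAt    time.Time     `json:"created_at"`
	Done         bool          `json:"done"`
	LoadDuration time.Duration `json:"load_duration,omitempty"`
}

// loadModel is a stub standing in for the actual model loading path.
func loadModel(path string) error { return nil }

func generate(modelPath string) (*GenerateResponse, error) {
	loadStart := time.Now()
	if err := loadModel(modelPath); err != nil {
		return nil, err
	}
	// ... run the prediction ...
	return &GenerateResponse{
		CreatedAt:    time.Now(),             // stamped when the response is created
		Done:         true,
		LoadDuration: time.Since(loadStart), // load duration reported on the done response
	}, nil
}
```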

- 13 Dec, 2023 (1 commit)

Jeffrey Morgan authored

- 12 Dec, 2023 (2 commits)

Bruce MacDonald authored

Bruce MacDonald authored
- remove parallel

- 11 Dec, 2023 (2 commits)

Patrick Devine authored
Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local>

Michael Yang authored
Mostly replaced by decoding tensors, except for ggml models, which only support llama.

- 10 Dec, 2023 (4 commits)

Jeffrey Morgan authored

Jeffrey Morgan authored

Jeffrey Morgan authored

Jeffrey Morgan authored

- 09 Dec, 2023 (1 commit)

Bruce MacDonald authored
* fix: queued request failures
  - increase parallel requests to 2 so queued requests can complete; queueing is managed in ollama
* log stream errors

- 05 Dec, 2023 (7 commits)

Michael Yang authored

Bruce MacDonald authored

Jeffrey Morgan authored
This reverts commit 7a0899d6.

Michael Yang authored

Michael Yang authored

Michael Yang authored

Michael Yang authored

- 04 Dec, 2023 (2 commits)

Bruce MacDonald authored
- update chat docs
- add messages chat endpoint
- remove deprecated context and template generate parameters from docs
  - context and template are still supported for the time being and will continue to work as expected
- add partial response to chat history
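For context, a request to a messages-based chat endpoint looks roughly like the sketch below (assumes a local server on Ollama's default port 11434; the model name is only an example):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Illustrative chat request built around the messages field.
	body, _ := json.Marshal(map[string]any{
		"model": "llama2", // example model name
		"messages": []map[string]string{
			{"role": "user", "content": "Why is the sky blue?"},
		},
	})
	resp, err := http.Post("http://localhost:11434/api/chat", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out)) // streamed responses arrive as one JSON object per line
}
```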

Michael Yang authored

- 26 Nov, 2023 (2 commits)

Jeffrey Morgan authored

Jeffrey Morgan authored

- 24 Nov, 2023 (2 commits)

Jing Zhang authored
* Support CUDA builds on Windows
* Enable dynamic NumGPU allocation for Windows

Jongwook Choi authored
When CUDA peer access is enabled, multi-GPU inference produces garbage output. This is a known bug in llama.cpp (or NVIDIA). Until the upstream bug is fixed, we can disable CUDA peer access temporarily to ensure correct output. See #961.

- 22 Nov, 2023 (2 commits)

Jeffrey Morgan authored

Michael Yang authored

- 21 Nov, 2023 (2 commits)

Michael Yang authored

Jeffrey Morgan authored

- 20 Nov, 2023 (3 commits)

Michael Yang authored

Purinda Gunasekara authored

Jeffrey Morgan authored

- 19 Nov, 2023 (2 commits)

Jeffrey Morgan authored

Bruce MacDonald authored

- 17 Nov, 2023 (1 commit)

Jeffrey Morgan authored

- 10 Nov, 2023 (1 commit)

Jeffrey Morgan authored
* add `"format": "json"` as an API parameter
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>