- 19 Dec, 2023 2 commits
Daniel Hiltgen authored
Run server.cpp directly inside the Go runtime via cgo, while retaining the LLM Go abstractions.
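A minimal sketch of the cgo pattern this describes, under stated assumptions: llama_server_start is a hypothetical stand-in for whatever C-linkage entry point the compiled server.cpp actually exports, stubbed in the preamble so the sketch builds on its own.

```go
package llm

/*
#include <stdlib.h>
// Stand-in for the C-linkage entry point the compiled server.cpp would
// export; stubbed here so the sketch compiles without the C++ sources.
static int llama_server_start(const char *model_path) { return 0; }
*/
import "C"

import (
	"fmt"
	"unsafe"
)

// startServer runs the embedded C++ server in-process instead of spawning
// it as a child process, so the Go LLM abstractions can call it directly.
func startServer(modelPath string) error {
	cPath := C.CString(modelPath)
	defer C.free(unsafe.Pointer(cPath))
	if rc := C.llama_server_start(cPath); rc != 0 {
		return fmt.Errorf("llama server failed to start: status %d", int(rc))
	}
	return nil
}
```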
Bruce MacDonald authored
- remove the ggml runner
- automatically pull gguf models when ggml is detected
- tell users to update to gguf if the automatic pull fails

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
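A sketch of what the detection step can look like, relying only on the published file magics: GGUF files begin with the ASCII bytes "GGUF", while legacy ggml-era files begin with "ggml", "ggmf", or "ggjt". The function name is illustrative, not ollama's actual code.

```go
package llm

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// isLegacyGGML reports whether a model file uses the old ggml format and
// therefore needs the automatic gguf pull described above.
func isLegacyGGML(path string) (bool, error) {
	f, err := os.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()

	magic := make([]byte, 4)
	if _, err := io.ReadFull(f, magic); err != nil {
		return false, err
	}
	if bytes.Equal(magic, []byte("GGUF")) {
		return false, nil // already the new format
	}
	for _, m := range []string{"ggml", "ggmf", "ggjt"} {
		if bytes.Equal(magic, []byte(m)) {
			return true, nil // legacy format: pull the gguf model instead
		}
	}
	return false, fmt.Errorf("unrecognized model magic %q", magic)
}
```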
- 18 Dec, 2023 3 commits
Jeffrey Morgan authored
Jeffrey Morgan authored
Jeffrey Morgan authored
- 13 Dec, 2023 1 commit
Jeffrey Morgan authored
- 04 Dec, 2023 1 commit
Michael Yang authored
- 26 Nov, 2023 2 commits
Jeffrey Morgan authored
Jeffrey Morgan authored
- 24 Nov, 2023 2 commits
Jing Zhang authored
* Support CUDA builds on Windows
* Enable dynamic NumGPU allocation for Windows
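"Dynamic NumGPU allocation" suggests sizing the number of offloaded layers to measured free VRAM instead of using a fixed count. A rough sketch of that heuristic follows; the function and parameter names, and the uniform per-layer sizing, are assumptions, not ollama's actual code.

```go
package llm

// dynamicNumGPU picks how many layers to offload to the GPU from measured
// free VRAM, assuming roughly uniform layer sizes (an illustrative
// simplification).
func dynamicNumGPU(totalLayers int, freeVRAMBytes, layerSizeBytes uint64) int {
	if layerSizeBytes == 0 {
		return 0
	}
	n := int(freeVRAMBytes / layerSizeBytes)
	if n > totalLayers {
		n = totalLayers // the whole model fits; offload every layer
	}
	return n
}
```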
Jongwook Choi authored
When CUDA peer access is enabled, multi-GPU inference produces garbage output. This is a known bug in llama.cpp (or in NVIDIA's driver). Until the upstream bug is fixed, we can temporarily disable CUDA peer access to ensure correct output. See #961.
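The actual workaround lives in llama.cpp's CUDA backend, but the runtime calls involved look roughly like this cgo sketch against the CUDA runtime API (error handling trimmed; a sketch of the mechanism, not the shipped patch):

```go
package llm

/*
#cgo LDFLAGS: -lcudart
#include <cuda_runtime.h>
*/
import "C"

// disablePeerAccess turns off CUDA peer-to-peer access between every pair
// of visible devices, working around the garbage-output bug noted above.
func disablePeerAccess() {
	var count C.int
	if C.cudaGetDeviceCount(&count) != C.cudaSuccess {
		return
	}
	for dev := C.int(0); dev < count; dev++ {
		C.cudaSetDevice(dev)
		for peer := C.int(0); peer < count; peer++ {
			if peer == dev {
				continue
			}
			var can C.int
			C.cudaDeviceCanAccessPeer(&can, dev, peer)
			if can != 0 {
				// Returns cudaErrorPeerAccessNotEnabled if access was
				// never enabled for this pair; safe to ignore here.
				C.cudaDeviceDisablePeerAccess(peer)
			}
		}
	}
}
```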
- 22 Nov, 2023 1 commit
Jeffrey Morgan authored
- 21 Nov, 2023 1 commit
Michael Yang authored
- 20 Nov, 2023 1 commit
Jeffrey Morgan authored
- 17 Nov, 2023 1 commit
Jeffrey Morgan authored
- 27 Oct, 2023 1 commit
Jeffrey Morgan authored
- 24 Oct, 2023 3 commits
Jeffrey Morgan authored
Jeffrey Morgan authored
Jeffrey Morgan authored
- 23 Oct, 2023 2 commits
Michael Yang authored
Pin to 9e70cc03229df19ca2d28ce23cc817198f897278 for now, since 438c2ca83045a00ef244093d27e9ed41a8cb4ea9 is breaking.
Michael Yang authored
- 17 Oct, 2023 1 commit
Bruce MacDonald authored
- 06 Oct, 2023 2 commits
Jeffrey Morgan authored
Bruce MacDonald authored
- This makes it easier to see that the subprocess is associated with ollama.
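A sketch of the idea: launch the runner under an "ollama-"-prefixed binary name so the child process is easy to spot in ps output. The path and flag below are illustrative assumptions, not the actual layout.

```go
package llm

import (
	"os/exec"
	"path/filepath"
)

// launchRunner starts the runner under an "ollama-"-prefixed binary name
// (illustrative) so the subprocess is clearly associated with ollama.
func launchRunner(dir, port string) (*exec.Cmd, error) {
	cmd := exec.Command(filepath.Join(dir, "ollama-runner"), "--port", port)
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return cmd, nil
}
```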
- 21 Sep, 2023 2 commits
Michael Yang authored
Michael Yang authored
- 20 Sep, 2023 6 commits
Michael Yang authored
Michael Yang authored
Bruce MacDonald authored
Bruce MacDonald authored
Bruce MacDonald authored
Bruce MacDonald authored
- 18 Sep, 2023 1 commit
Bruce MacDonald authored
* subprocess improvements:
  - increase the start-up timeout
  - when a runner fails to start, fail immediately rather than timing out
  - try runners in order rather than choosing a single runner
  - embed the metal runner in the metal dir rather than gpu
  - refactor logging and error messages
* Update llama.go
* Update llama.go
* simplify by using glob
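A simplified sketch of the "try runners in order" and glob-discovery changes taken together; the directory layout and binary name are illustrative, not ollama's actual paths.

```go
package llm

import (
	"fmt"
	"os/exec"
	"path/filepath"
)

// startFirstWorkingRunner discovers runner binaries with a glob and tries
// them in order, failing fast on a runner that cannot start instead of
// waiting for a timeout.
func startFirstWorkingRunner(baseDir string, args ...string) (*exec.Cmd, error) {
	runners, err := filepath.Glob(filepath.Join(baseDir, "*", "ollama-runner"))
	if err != nil {
		return nil, err
	}
	for _, runner := range runners {
		cmd := exec.Command(runner, args...)
		if err := cmd.Start(); err != nil {
			continue // this runner cannot start; try the next candidate
		}
		return cmd, nil
	}
	return nil, fmt.Errorf("no usable runner found under %s", baseDir)
}
```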
- 14 Sep, 2023 1 commit
Bruce MacDonald authored
* enable packaging multiple CUDA versions
* use the nvcc CUDA version if available

Co-authored-by: Michael Yang <mxyng@pm.me>
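A sketch of the nvcc probe, relying only on the "release X.Y" line that nvcc --version prints; falling back to a packaged default when nvcc is absent is an assumption about the build flow.

```go
package main

import (
	"fmt"
	"os/exec"
	"regexp"
)

// nvccCudaVersion parses the toolkit version out of `nvcc --version`,
// which prints a line like "Cuda compilation tools, release 11.8, V11.8.89".
func nvccCudaVersion() (string, error) {
	out, err := exec.Command("nvcc", "--version").Output()
	if err != nil {
		return "", err // nvcc unavailable: caller falls back to a default
	}
	m := regexp.MustCompile(`release (\d+\.\d+)`).FindSubmatch(out)
	if m == nil {
		return "", fmt.Errorf("could not parse nvcc version output")
	}
	return string(m[1]), nil
}
```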
- 12 Sep, 2023 2 commits
Bruce MacDonald authored
Bruce MacDonald authored
* Linux GPU support
* handle multiple GPUs
* add CUDA Docker image (#488)

Co-authored-by: Michael Yang <mxyng@pm.me>
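One way to handle multiple GPUs is to size offloading against the total free VRAM across devices. The sketch below uses a standard nvidia-smi query; whether ollama computes it exactly this way is an assumption.

```go
package llm

import (
	"os/exec"
	"strconv"
	"strings"
)

// totalFreeVRAM sums free memory across all visible GPUs; with these
// flags, nvidia-smi prints one MiB value per device, one per line.
func totalFreeVRAM() (uint64, error) {
	out, err := exec.Command("nvidia-smi",
		"--query-gpu=memory.free", "--format=csv,noheader,nounits").Output()
	if err != nil {
		return 0, err
	}
	var total uint64
	for _, field := range strings.Fields(strings.TrimSpace(string(out))) {
		mib, err := strconv.ParseUint(field, 10, 64)
		if err != nil {
			continue
		}
		total += mib * 1024 * 1024 // convert MiB to bytes
	}
	return total, nil
}
```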
- 07 Sep, 2023 1 commit
Bruce MacDonald authored
- 06 Sep, 2023 2 commits
Jeffrey Morgan authored
Jeffrey Morgan authored
- 05 Sep, 2023 1 commit
Bruce MacDonald authored