- 09 Apr, 2024 (2 commits)

Blake Mizerany authored

Blake Mizerany authored
This commit introduces a friendlier way to build the Ollama dependencies and binary without abusing `go generate`, removing the unnecessary extra steps it brings with it. The script also gives the user clearer feedback about what is happening during the build and, at the end, prints a helpful message about what to do next (e.g. run the new local Ollama).
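As a rough illustration of that approach (not the actual script from the commit), here is a minimal Go sketch of a build driver that runs the dependency and binary builds as explicit steps, streams their output, and finishes with a next-step hint; the paths, flags, and the choice of Go over a shell script are assumptions made for this example.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// run executes one build step, streaming its output so the user can see
// what is happening, and stops the build on the first failure.
func run(name string, args ...string) {
	fmt.Printf("==> %s %v\n", name, args)
	cmd := exec.Command(name, args...)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintf(os.Stderr, "build step failed: %v\n", err)
		os.Exit(1)
	}
}

func main() {
	// Build the native dependencies as explicit steps instead of hiding
	// them behind `go generate`. Paths and flags here are illustrative.
	run("cmake", "-S", "llm/llama.cpp", "-B", "llm/build")
	run("cmake", "--build", "llm/build", "--parallel")

	// Build the Ollama binary itself.
	run("go", "build", "-o", "ollama", ".")

	// Tell the user what to do next.
	fmt.Println("Build complete. Try it out with: ./ollama serve")
}
```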
- 04 Jan, 2024 (1 commit)

Daniel Hiltgen authored
- 19 Dec, 2023 (3 commits)

Daniel Hiltgen authored

Daniel Hiltgen authored
Run server.cpp directly inside the Go runtime via cgo while retaining the LLM Go abstractions.
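As a minimal sketch of the cgo pattern described here (not the actual server integration), the following Go file embeds a stand-in C function the way the C/C++ server sources are compiled into the binary; `server_start` and the model-path argument are hypothetical.

```go
package main

/*
#include <stdlib.h>

// In the real change the C/C++ sources (server.cpp and friends) are linked
// into the Go binary; this tiny function is a hypothetical stand-in.
static int server_start(const char *model) {
	return model != 0;
}
*/
import "C"

import (
	"fmt"
	"unsafe"
)

// startEmbeddedServer shows the cgo call pattern behind the Go abstractions:
// convert Go strings to C strings, call into the embedded native code, and
// free what we allocate.
func startEmbeddedServer(model string) error {
	cModel := C.CString(model)
	defer C.free(unsafe.Pointer(cModel))

	if C.server_start(cModel) == 0 {
		return fmt.Errorf("embedded server failed to start")
	}
	return nil
}

func main() {
	if err := startEmbeddedServer("model.gguf"); err != nil {
		fmt.Println(err)
	}
}
```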
Bruce MacDonald authored
- remove ggml runner
- automatically pull gguf models when ggml detected
- tell users to update to gguf in case the automatic pull fails

Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
- 24 Nov, 2023 (1 commit)

Jongwook Choi authored
When CUDA peer access is enabled, multi-GPU inference produces garbage output. This is a known bug in llama.cpp (or the NVIDIA driver). Until the upstream bug is fixed, we can disable CUDA peer access temporarily to ensure correct output. See #961.
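The actual workaround lives in the llama.cpp CUDA code, but as a rough Go/cgo illustration of the idea (an assumption-laden sketch, not the real patch), one could walk every GPU pair and call the CUDA runtime's `cudaDeviceDisablePeerAccess`:

```go
package main

// #cgo LDFLAGS: -lcudart
// #include <cuda_runtime.h>
import "C"

import "log"

// disablePeerAccess walks every GPU pair and disables peer access between
// them, sketching the temporary workaround for the multi-GPU garbage-output
// bug described above.
func disablePeerAccess() {
	var count C.int
	if C.cudaGetDeviceCount(&count) != C.cudaSuccess {
		log.Println("could not query CUDA device count")
		return
	}
	for i := C.int(0); i < count; i++ {
		C.cudaSetDevice(i)
		for j := C.int(0); j < count; j++ {
			if i == j {
				continue
			}
			// Returns cudaErrorPeerAccessNotEnabled if access was never
			// enabled for this pair; that is fine to ignore here.
			C.cudaDeviceDisablePeerAccess(j)
		}
	}
}

func main() {
	disablePeerAccess()
}
```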
- 27 Oct, 2023 (1 commit)

Jeffrey Morgan authored
- 23 Oct, 2023 (1 commit)

Michael Yang authored
- 06 Oct, 2023 (1 commit)

Bruce MacDonald authored
- this makes it easier to see that the subprocess is associated with ollama
- 21 Sep, 2023 (1 commit)

Michael Yang authored
- 20 Sep, 2023 (5 commits)

Michael Yang authored

Bruce MacDonald authored

Bruce MacDonald authored

Bruce MacDonald authored

Bruce MacDonald authored
- 14 Sep, 2023 (1 commit)

Bruce MacDonald authored
* enable packaging multiple cuda versions
* use nvcc cuda version if available

Co-authored-by: Michael Yang <mxyng@pm.me>
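As a hedged sketch of the "use nvcc cuda version if available" idea (the real build logic is not shown here, and the function name is hypothetical), one could shell out to `nvcc --version` and parse the toolkit release:

```go
package main

import (
	"fmt"
	"os/exec"
	"regexp"
)

// nvccCudaVersion runs `nvcc --version` and extracts the "release X.Y"
// toolkit version, so packaging can target the CUDA version that is
// actually installed. It returns an empty string if nvcc is unavailable.
func nvccCudaVersion() string {
	out, err := exec.Command("nvcc", "--version").Output()
	if err != nil {
		return ""
	}
	m := regexp.MustCompile(`release (\d+\.\d+)`).FindSubmatch(out)
	if m == nil {
		return ""
	}
	return string(m[1])
}

func main() {
	if v := nvccCudaVersion(); v != "" {
		fmt.Println("building against CUDA", v)
	} else {
		fmt.Println("nvcc not found; falling back to a default CUDA version")
	}
}
```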
- 12 Sep, 2023 (2 commits)

Bruce MacDonald authored

Bruce MacDonald authored
* linux gpu support
* handle multiple gpus
* add cuda docker image (#488)

Co-authored-by: Michael Yang <mxyng@pm.me>