Commits · f5ca7f8c8e77ae605c69da45f2be16f0f0e72ca3 · OpenDAS / ollama

26 Mar, 2024 3 commits
- add license in file header for vendored llama.cpp code (#3351) · f5ca7f8c
  Jeffrey Morgan authored Mar 26, 2024
  
  f5ca7f8c
- remove need for `$VSINSTALLDIR` since build will fail if `ninja` cannot be found (#3350) · 856b8ec1
  Jeffrey Morgan authored Mar 26, 2024
  
  856b8ec1
- change `github.com/jmorganca/ollama` to `github.com/ollama/ollama` (#3347) · 1b272d5b
  Patrick Devine authored Mar 26, 2024
  
  1b272d5b
25 Mar, 2024 2 commits
- Bump llama.cpp to b2527 · 8091ef2e
  Daniel Hiltgen authored Mar 25, 2024
  
  8091ef2e
- add support for libcudart.so for CUDA devices (adds Jetson support) · dfc6721b
  Jeremy authored Mar 25, 2024
  
  dfc6721b
24 Mar, 2024 1 commit
- llm: prevent race appending to slice (#3320) · acfa2b94
  Blake Mizerany authored Mar 24, 2024
  
  acfa2b94
23 Mar, 2024 2 commits
- Bump llama.cpp to b2510 · 3e30c75f
  Daniel Hiltgen authored Mar 23, 2024
  
  3e30c75f
- Bump llama.cpp to b2474 · 43799532
  Daniel Hiltgen authored Mar 23, 2024
```
The release just before ggml-cuda.cu refactoring
```
  43799532
20 Mar, 2024 1 commit

Daniel Hiltgen authored Mar 13, 2024

If expanding the runners fails, don't leave a corrupt or incomplete payloads dir.
We now write a pid file out to the tmpdir, which allows us to scan for stale tmpdirs
and remove them as long as no process is still running.

74788b48
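
The pid-file approach described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the file name `owner.pid`, the helper names, and the `runners-*` directory pattern are all hypothetical, and the liveness check uses `syscall.Kill` with signal 0, so it only builds on Unix-like systems.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
	"syscall"
)

// pidFileName is a hypothetical name chosen for this sketch.
const pidFileName = "owner.pid"

// writePidFile records the current process ID inside dir.
func writePidFile(dir string) error {
	pid := strconv.Itoa(os.Getpid())
	return os.WriteFile(filepath.Join(dir, pidFileName), []byte(pid), 0o644)
}

// processAlive reports whether a process with the given pid exists (Unix only).
// Signal 0 runs the existence and permission checks without delivering a signal.
func processAlive(pid int) bool {
	err := syscall.Kill(pid, 0)
	return err == nil || errors.Is(err, syscall.EPERM)
}

// cleanStale removes every directory matching pattern whose pid file names a
// process that is no longer running. Directories without a readable pid file
// are left alone, since their owner cannot be determined.
func cleanStale(pattern string) ([]string, error) {
	dirs, err := filepath.Glob(pattern)
	if err != nil {
		return nil, err
	}
	var removed []string
	for _, dir := range dirs {
		raw, err := os.ReadFile(filepath.Join(dir, pidFileName))
		if err != nil {
			continue
		}
		pid, err := strconv.Atoi(strings.TrimSpace(string(raw)))
		if err != nil || pid <= 0 || processAlive(pid) {
			continue
		}
		if err := os.RemoveAll(dir); err != nil {
			return removed, err
		}
		removed = append(removed, dir)
	}
	return removed, nil
}

// demo builds one live and one stale directory under a scratch root, runs the
// cleanup, and reports what survived.
func demo() (liveKept, staleRemoved bool, err error) {
	root, err := os.MkdirTemp("", "sketch-root-")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(root)

	live := filepath.Join(root, "runners-live")
	stale := filepath.Join(root, "runners-stale")
	for _, d := range []string{live, stale} {
		if err := os.Mkdir(d, 0o755); err != nil {
			return false, false, err
		}
	}
	if err := writePidFile(live); err != nil {
		return false, false, err
	}
	// A pid far above any real pid limit stands in for a dead process.
	deadPid := []byte("2147483647")
	if err := os.WriteFile(filepath.Join(stale, pidFileName), deadPid, 0o644); err != nil {
		return false, false, err
	}
	if _, err := cleanStale(filepath.Join(root, "runners-*")); err != nil {
		return false, false, err
	}
	_, liveErr := os.Stat(live)
	_, staleErr := os.Stat(stale)
	return liveErr == nil, errors.Is(staleErr, os.ErrNotExist), nil
}

func main() {
	liveKept, staleRemoved, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("live dir kept:", liveKept, "stale dir removed:", staleRemoved)
}
```

Skipping directories that have no readable pid file is a conservative choice made for this sketch: a directory whose owner is unknown is never deleted.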

18 Mar, 2024 1 commit
- dyn global · 3c4ad0ec
  Michael Yang authored Mar 15, 2024
  
  3c4ad0ec
16 Mar, 2024 1 commit
- llama: remove server static assets (#3174) · e95ffc74
  Jeffrey Morgan authored Mar 15, 2024
  
  e95ffc74
15 Mar, 2024 3 commits
- Add Radeon gfx940-942 GPU support · d4c10df2
  Daniel Hiltgen authored Mar 15, 2024
  
  d4c10df2
- Wire up more complete CI for releases · 540f4af4
  Daniel Hiltgen authored Mar 07, 2024
```
Flesh out our GitHub Actions CI so we can build official releases.
```
  540f4af4
- llm,readline: use errors.Is instead of simple == check (#3161) · 6ce37e4d
  Blake Mizerany authored Mar 15, 2024
```
This fixes some brittle, simple equality checks to use errors.Is. Since
go1.13, errors.Is is the idiomatic way to check for errors.
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
```
  6ce37e4d
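
The difference this commit message refers to can be shown with a small sketch. The helper names below (`readChunk`, `isEOFByEquality`, `isEOF`) are hypothetical and are not taken from the ollama source; the example only assumes the standard `errors`, `fmt`, and `io` packages.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// readChunk is a hypothetical helper that wraps io.EOF with extra context,
// as callers commonly do with fmt.Errorf and the %w verb.
func readChunk() error {
	return fmt.Errorf("read chunk: %w", io.EOF)
}

// isEOFByEquality is the brittle form: it only matches the exact sentinel value.
func isEOFByEquality(err error) bool {
	return err == io.EOF
}

// isEOF walks the wrap chain, so it still matches after wrapping.
func isEOF(err error) bool {
	return errors.Is(err, io.EOF)
}

func main() {
	err := readChunk()
	fmt.Println(isEOFByEquality(err)) // false: the wrapper is a different value
	fmt.Println(isEOF(err))           // true: errors.Is unwraps to io.EOF
}
```

A plain `==` comparison stops matching as soon as any layer adds context with `%w`, which is why the equality form is described as brittle.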
14 Mar, 2024 1 commit
- fix: clip memory leak · 291c6638
  Michael Yang authored Mar 14, 2024
  
  291c6638
13 Mar, 2024 2 commits
- restore locale patch (#3091) · e72c567c
  Jeffrey Morgan authored Mar 12, 2024
  
  e72c567c
- token repeat limit for prediction requests (#3080) · 3e226112
  Bruce MacDonald authored Mar 12, 2024
  
  3e226112
12 Mar, 2024 5 commits
- warn when json format is expected but not mentioned in prompt (#3081) · 2f804068
  Bruce MacDonald authored Mar 12, 2024
  
  2f804068
- Adapt our build for imported server.cpp · 85129d3a
  Daniel Hiltgen authored Mar 12, 2024
  
  85129d3a
- Import server.cpp as of b2356 · 9ac6440d
  Daniel Hiltgen authored Mar 12, 2024
  
  9ac6440d
- refactor readseeker · 00852979
  Michael Yang authored Mar 09, 2024
  
  00852979
- chore: fix typo (#3073) · 53c107e2
  racerole authored Mar 13, 2024
```
Signed-off-by: racerole <jiangyifeng@outlook.com>
```
  53c107e2
11 Mar, 2024 3 commits
- relay load model errors to the client (#3065) · b80661e8
  Bruce MacDonald authored Mar 11, 2024
  
  b80661e8
- update llama.cpp submodule to `ceca1ae` (#3064) · 369eda65
  Jeffrey Morgan authored Mar 11, 2024
  
  369eda65
- Avoid rocm runner and dependency clash · bc13da2b
  Daniel Hiltgen authored Mar 11, 2024
```
Putting the rocm symlink next to the runners is risky.  This moves
the payloads into a subdir to avoid potential clashes.
```
  bc13da2b
10 Mar, 2024 4 commits
- fix `03-locale.diff` · 41b00b98
  Jeffrey Morgan authored Mar 10, 2024
  
  41b00b98
- Harden for deps file being empty (or short) · 3dc1bb6a
  Daniel Hiltgen authored Mar 10, 2024
  
  3dc1bb6a
- patch: use default locale in wpm tokenizer (#3034) · 908005d9
  Jeffrey Morgan authored Mar 09, 2024
  
  908005d9
- add `bundle_metal` and `cleanup_metal` functions to `gen_darwin.sh` · e11668aa
  Jeffrey Morgan authored Mar 09, 2024
  
  e11668aa
09 Mar, 2024 4 commits
- update llama.cpp submodule to `77d1ac7` (#3030) · 1ffb1e28
  Jeffrey Morgan authored Mar 09, 2024
  
  1ffb1e28
- disable gpu for certain model architectures and fix divide-by-zero on memory estimation · f9cd55c7
  Jeffrey Morgan authored Mar 09, 2024
  
  f9cd55c7
- Finish unwinding idempotent payload logic · 4a5c9b80
  Daniel Hiltgen authored Mar 08, 2024
```
The recent ROCm change partially removed idempotent
payloads, but the ggml-metal.metal file for mac was still
idempotent.  This finishes switching to always extract
the payloads, and now that idempotency is gone, the
version directory is no longer useful.
```
  4a5c9b80
- update llama.cpp submodule to `c2101a2` (#3020) · efe5617b
  Jeffrey Morgan authored Mar 09, 2024
  
  efe5617b
08 Mar, 2024 2 commits
- decode ggla · 76bdebba
  Michael Yang authored Mar 08, 2024
  
  76bdebba
- update llama.cpp submodule to `6cdabe6` (#2999) · 0e4669b0
  Jeffrey Morgan authored Mar 08, 2024
  
  0e4669b0
07 Mar, 2024 3 commits

- Revamp ROCm support · 6c5ccb11
  Daniel Hiltgen authored Feb 15, 2024

This refines where we extract the LLM libraries to by adding a new
OLLAMA_HOME env var that defaults to `~/.ollama`. The logic was already
idempotent, so this should speed up startups after the first time a
new release is deployed. It also cleans up after itself.

We now build only a single ROCm version (latest major) on both Windows
and Linux. Given the large size of ROCm's tensor files, we split the
dependency out. It's bundled into the installer on Windows, and is a
separate download on Linux. The Linux install script now detects the
presence of AMD GPUs and checks whether ROCm v6 is already present; if
not, it downloads our dependency tar file.

For Linux discovery, we now use sysfs and check each GPU against what
ROCm supports so we can degrade to CPU gracefully instead of having
llama.cpp+rocm assert/crash on us. For Windows, we now use Go's Windows
dynamic library loading logic to access the amdhip64.dll APIs to query
the GPU information.

6c5ccb11

- fix some typos (#2973) · 23ebe8fe
  John authored Mar 07, 2024
```
Signed-off-by: hishope <csqiye@126.com>
```
  23ebe8fe
- Convert Safetensors to an Ollama model (#2824) · 2c017ca4
  Patrick Devine authored Mar 06, 2024
  
  2c017ca4

01 Mar, 2024 1 commit
- update llama.cpp submodule to `c29af7e` (#2868) · 21347e1e
  Jeffrey Morgan authored Mar 01, 2024
  
  21347e1e
29 Feb, 2024 1 commit
- bump submodule to `87c91c07663b707e831c59ec373b5e665ff9d64a` (#2828) · cbf4970e
  Jeffrey Morgan authored Feb 29, 2024
  
  cbf4970e