"vscode:/vscode.git/clone" did not exist on "fd4792ec56965a9c8564c3d88212c29a0378583d"
- 17 Apr, 2024 2 commits

ManniX-ITA authored

ManniX-ITA authored

- 16 Apr, 2024 5 commits

Michael Yang authored

Jeffrey Morgan authored
* parse wide argv characters on windows
* cleanup
* move cleanup to end of `main`

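The commit title above does not include the code it changed, and the original change may well be in a different language. Purely as an illustration of the general technique, here is a minimal Go sketch of parsing wide (UTF-16) argv on Windows: re-read the raw command line with `GetCommandLineW` and split it with `CommandLineToArgvW` via the standard `syscall` package. The `wideArgs` helper is hypothetical, not Ollama's actual code.

```go
//go:build windows

package main

import (
	"fmt"
	"syscall"
	"unsafe"
)

// wideArgs re-reads the process command line as UTF-16 and converts each
// argument to a Go string, so non-ASCII paths and flags survive intact.
func wideArgs() ([]string, error) {
	cmdline := syscall.GetCommandLine() // *uint16, the raw UTF-16 command line
	var argc int32
	argv, err := syscall.CommandLineToArgv(cmdline, &argc)
	if err != nil {
		return nil, err
	}
	// CommandLineToArgvW allocates one block that the caller must free.
	defer syscall.LocalFree(syscall.Handle(uintptr(unsafe.Pointer(argv))))

	args := make([]string, 0, argc)
	for _, arg := range (*argv)[:argc] {
		args = append(args, syscall.UTF16ToString((*arg)[:]))
	}
	return args, nil
}

func main() {
	args, err := wideArgs()
	if err != nil {
		panic(err)
	}
	fmt.Println(args)
}
```
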
Michael Yang authored

Jeffrey Morgan authored

Michael Yang authored
TODO: update padding() to _only_ return the padding

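The TODO above gives no surrounding code, so the following is only a guess at what is meant, assuming `padding()` is the usual alignment helper that currently returns the already-padded offset. A hypothetical version that returns only the padding itself could look like:

```go
package llm

// paddingOnly is a hypothetical replacement: it returns just the number of
// bytes needed to round offset up to the next multiple of align, rather than
// the padded offset itself.
func paddingOnly(offset, align int64) int64 {
	return (align - offset%align) % align
}
```
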
- 15 Apr, 2024 2 commits

Patrick Devine authored

Jeffrey Morgan authored
* terminate subprocess if receiving `SIGINT` or `SIGTERM` signals while model is loading
* use `unload` in signal handler

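The sketch below illustrates the general pattern only, not Ollama's actual handler: cancel a context on `SIGINT`/`SIGTERM` and let `exec.CommandContext` kill the runner subprocess even while a model is still loading. The `./runner` binary and its flags are invented, and the `unload` call mentioned in the commit is not shown.

```go
package main

import (
	"context"
	"os"
	"os/exec"
	"os/signal"
	"syscall"
)

func main() {
	// Cancel this context when SIGINT or SIGTERM arrives.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	// Hypothetical runner subprocess standing in for the model loader;
	// CommandContext kills it once the context is cancelled, even mid-load.
	cmd := exec.CommandContext(ctx, "./runner", "--model", "model.gguf")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr

	if err := cmd.Run(); err != nil {
		// Covers both real failures and the "killed by signal" case.
		os.Exit(1)
	}
}
```
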
- 13 Apr, 2024 1 commit

Jeffrey Morgan authored

- 11 Apr, 2024 1 commit

Michael Yang authored

- 10 Apr, 2024 2 commits

Michael Yang authored

Michael Yang authored

- 09 Apr, 2024 4 commits

Daniel Hiltgen authored
During testing, we're seeing some models take over 3 minutes.

Blake Mizerany authored

Blake Mizerany authored
This commit introduces a friendlier way to build Ollama's dependencies and binary without abusing `go generate`, removing the unnecessary extra steps that approach brings with it. The script also gives the user clearer feedback about what is happening during the build and, at the end, prints a helpful message about what to do next (e.g. run the newly built local Ollama).

Jeffrey Morgan authored

- 08 Apr, 2024 1 commit

Michael Yang authored

- 07 Apr, 2024 1 commit

Jeffrey Morgan authored
update generate scripts with new `LLAMA_CUDA` variable, set `HIP_PLATFORM` to avoid compiler errors (#3528)

- 06 Apr, 2024 1 commit

Michael Yang authored

- 04 Apr, 2024 3 commits

Michael Yang authored

Daniel Hiltgen authored

mofanke authored

- 03 Apr, 2024 3 commits

Daniel Hiltgen authored
The subprocess change moved the build directory; arm64 builds weren't setting cross-compilation flags when building on x86.

Michael Yang authored

Jeffrey Morgan authored

- 02 Apr, 2024 4 commits

Daniel Hiltgen authored

Michael Yang authored

Michael Yang authored

Daniel Hiltgen authored

- 01 Apr, 2024 4 commits

Daniel Hiltgen authored

Daniel Hiltgen authored
This should resolve a number of memory leak and stability defects by allowing us to isolate llama.cpp in a separate process, shut it down when idle, and gracefully restart it if it has problems. This also serves as a first step toward running multiple copies to support multiple models concurrently.

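To make the idea above concrete, here is a rough Go sketch of the shape such a supervisor could take: start the llama.cpp server as a child process on demand, stop it after an idle window, and forget a crashed child so the next request restarts it. The `runner` type, its method names, and the timings are illustrative assumptions, not Ollama's implementation.

```go
package llm

import (
	"os/exec"
	"sync"
	"time"
)

// runner supervises a llama.cpp server child process: started on demand,
// stopped after sitting idle, and forgotten (so it can be restarted) if it
// exits unexpectedly. All names and timings here are illustrative.
type runner struct {
	mu       sync.Mutex
	cmd      *exec.Cmd
	lastUsed time.Time
}

// ensureStarted launches the child process if it is not already running.
func (r *runner) ensureStarted(bin string, args ...string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.lastUsed = time.Now()
	if r.cmd != nil {
		return nil
	}
	cmd := exec.Command(bin, args...)
	if err := cmd.Start(); err != nil {
		return err
	}
	r.cmd = cmd
	go func() {
		cmd.Wait() // when the child exits or crashes, forget it so the next call restarts it
		r.mu.Lock()
		if r.cmd == cmd {
			r.cmd = nil
		}
		r.mu.Unlock()
	}()
	return nil
}

// reapIdle periodically stops the child once nothing has used it for `idle`.
func (r *runner) reapIdle(idle time.Duration) {
	for range time.Tick(time.Minute) {
		r.mu.Lock()
		if r.cmd != nil && time.Since(r.lastUsed) > idle {
			r.cmd.Process.Kill()
			r.cmd = nil
		}
		r.mu.Unlock()
	}
}
```
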
Michael Yang authored
count each layer independently when deciding gpu offloading

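A simplified sketch of the idea in the commit message above, with all names invented: walk the per-layer memory requirements one layer at a time against the free GPU memory, instead of assuming every layer costs the same.

```go
package llm

// layersToOffload counts how many layers fit in the available GPU memory by
// charging each layer's size individually against the remaining budget.
// The parameter names and the overhead term are illustrative only.
func layersToOffload(layerSizes []uint64, freeVRAM, overhead uint64) int {
	if freeVRAM <= overhead {
		return 0
	}
	budget := freeVRAM - overhead
	count := 0
	for _, size := range layerSizes {
		if size > budget {
			break
		}
		budget -= size
		count++
	}
	return count
}
```
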
Michael Yang authored

- 29 Mar, 2024 1 commit

Patrick Devine authored
Co-authored-by: Michael Yang <mxyng@pm.me>

- 26 Mar, 2024 3 commits

Jeffrey Morgan authored

Jeffrey Morgan authored

Patrick Devine authored

- 25 Mar, 2024 2 commits

Daniel Hiltgen authored

Jeremy authored