Commits · 6920964b87971c8201097130bfdedbf56aaa13a7 · OpenDAS / ollama

14 Feb, 2024 2 commits
- Revert "bump submodule to `6c00a06` (#2479)" · 6920964b
  Jeffrey Morgan authored Feb 13, 2024
```
This reverts commit 2f9ed52b.
```
  6920964b
- bump submodule to `6c00a06` (#2479) · 2f9ed52b
  Jeffrey Morgan authored Feb 13, 2024
  
  2f9ed52b
12 Feb, 2024 3 commits
- update submodule to `099afc6` (#2468) · f76ca04f
  Jeffrey Morgan authored Feb 12, 2024
  
  f76ca04f
- Detect AMD GPU info via sysfs and block old cards · 6d84f075
  Daniel Hiltgen authored Feb 11, 2024
```
This wires up some new logic to start using sysfs to discover AMD GPU
information and detects old cards we can't yet support so we can fall back to CPU mode.
```
  6d84f075
- patch: always add token to cache_tokens (#2459) · 26b13fc3
  Jeffrey Morgan authored Feb 12, 2024
  
  26b13fc3
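
The sysfs-based AMD GPU check described in 6d84f075 above could look roughly like the following. This is only a sketch, not the code from that commit: it assumes the GPU's properties are exposed as `key value` text lines, that a `gfx_target_version` key encodes the major version as `value / 10000`, and that a hypothetical `minMajor` threshold separates supported from unsupported cards. The function names are made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseGfxMajor extracts the major GFX version from the text of a
// properties file. The key name and the major*10000 encoding are
// assumptions made for this sketch.
func parseGfxMajor(properties string) (int, bool) {
	sc := bufio.NewScanner(strings.NewReader(properties))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 2 && fields[0] == "gfx_target_version" {
			v, err := strconv.Atoi(fields[1])
			if err != nil || v <= 0 {
				return 0, false // unparsable, or not a GPU node
			}
			return v / 10000, true
		}
	}
	return 0, false
}

// useGPU reports whether the card looks new enough to use. A false
// result means the caller should fall back to CPU mode. minMajor is
// a hypothetical threshold.
func useGPU(properties string, minMajor int) bool {
	major, ok := parseGfxMajor(properties)
	return ok && major >= minMajor
}

func main() {
	newer := "simd_count 256\ngfx_target_version 90012\n"
	older := "gfx_target_version 80003\n"
	fmt.Println("newer card usable:", useGPU(newer, 9))
	fmt.Println("older card usable:", useGPU(older, 9))
}
```

Taking the file contents as a string keeps the decision logic testable without real hardware; reading the actual sysfs file would be a thin wrapper around it.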
09 Feb, 2024 1 commit

Daniel Hiltgen authored Feb 08, 2024

Make sure that when a shutdown signal comes, we shut down quickly instead
of waiting for a potentially long exchange to wrap up.

66807615

08 Feb, 2024 1 commit

Ensure the libraries are present · a1dfab43

Daniel Hiltgen authored Feb 07, 2024

When we store our libraries in a temp dir, a reaper might clean
them when we are idle, so make sure to check for them before
we reload.

a1dfab43

06 Feb, 2024 1 commit
- Bump llama.cpp to b2081 · de76b95d
  Daniel Hiltgen authored Feb 06, 2024
  
  de76b95d
02 Feb, 2024 1 commit

Harden generate patching model · e1f50377

Daniel Hiltgen authored Feb 01, 2024

Only apply patches if we have any, and make sure to clean up
every file we patched at the end to leave the tree clean.

e1f50377

01 Feb, 2024 2 commits
- use `llm.ImageData` · f11bf074
  Jeffrey Morgan authored Jan 31, 2024
  
  f11bf074
- trim images · 8450bf66
  Michael Yang authored Jan 31, 2024
  
  8450bf66
31 Jan, 2024 1 commit

Bump llama.cpp to b1999 · 72b12c3b

Daniel Hiltgen authored Jan 29, 2024

This requires an upstream change to support graceful termination,
carried as a patch.

72b12c3b

29 Jan, 2024 1 commit
- remove unknown `CPPFLAGS` option · 2e06ed01
  Jeffrey Morgan authored Jan 28, 2024
  
  2e06ed01
25 Jan, 2024 3 commits
- update submodule to `cd4fddb29f81d6a1f6d51a0c016bc6b486d68def` · 3ebd6a83
  Jeffrey Morgan authored Jan 25, 2024
  
  3ebd6a83
- Fix clearing kv cache between requests with the same prompt (#2186) · a64570dc
  Jeffrey Morgan authored Jan 25, 2024
```
* Fix clearing kv cache between requests with the same prompt

* fix powershell script
```
  a64570dc
- Update gen_linux.sh to find libcudart in separate directory · a4564232
  mraiser authored Jan 25, 2024
  
  a4564232
24 Jan, 2024 1 commit
- refactor tensor read · cd22855e
  Michael Yang authored Jan 24, 2024
  
  cd22855e
23 Jan, 2024 2 commits
- Load all layers on `arm64` macOS if model is small enough (#2149) · 4458efb7
  Jeffrey Morgan authored Jan 22, 2024
  
  4458efb7
- Refine Accelerate usage on mac · 0f5b8433
  Daniel Hiltgen authored Jan 22, 2024
```
For old macs, accelerate seems to cause crashes, but for
AVX2 capable macs, it does not.
```
  0f5b8433
22 Jan, 2024 4 commits
- update submodule to `011e8ec577fd135cbc02993d3ea9840c516d6a1c` · ffaf52e1
  Jeffrey Morgan authored Jan 22, 2024
  
  ffaf52e1
- Refine debug logging for llm · 730dcfcc
  Daniel Hiltgen authored Jan 22, 2024
```
This wires up logging in llama.cpp to always go to stderr, and also
turns up logging if OLLAMA_DEBUG is set.
```
  730dcfcc
- Debug logging on init failure · 27a2d5af
  Daniel Hiltgen authored Jan 22, 2024
  
  27a2d5af
- update submodule to `6f9939d` (#2115) · 5f81a33f
  Jeffrey Morgan authored Jan 22, 2024
  
  5f81a33f
21 Jan, 2024 3 commits
- Probe GPUs before backend init · ec376453
  Daniel Hiltgen authored Jan 21, 2024
```
Detect potential error scenarios so we can fall back to CPU mode without
hitting asserts.
```
  ec376453
- Make CPU builds parallel and customizable AMD GPUs · df54c723
  Daniel Hiltgen authored Jan 21, 2024
```
The Linux build now supports parallel CPU builds to speed things up.
This also exposes AMD GPU targets as an optional setting for advanced
users who want to alter our default set.
```
  df54c723
- Unlock mutex when failing to load model (#2117) · 89c4aee2
  Jeffrey Morgan authored Jan 20, 2024
  
  89c4aee2
20 Jan, 2024 3 commits
- Add compute capability 5.0, 7.5, and 8.0 · a447a083
  Daniel Hiltgen authored Jan 20, 2024
  
  a447a083
- Add support for CUDA 5.2 cards · 681a9149
  Daniel Hiltgen authored Jan 20, 2024
  
  681a9149
- sign dylibs on macOS (#2101) · 4c54f0dd
  Jeffrey Morgan authored Jan 19, 2024
  
  4c54f0dd
19 Jan, 2024 4 commits
- Switch to local dlopen symbols · 6a042438
  Daniel Hiltgen authored Jan 19, 2024
  
  6a042438
- use `gzip` for runner embedding (#2067) · dc88cc39
  Jeffrey Morgan authored Jan 19, 2024
  
  dc88cc39
- Restore dyn_ext_server.c since RTLD_DEEPBIND has been removed · 344342ab
  Self Denial authored Jan 18, 2024
  
  344342ab
- Fix CPU-only build under Android Termux environment. · eb76f3e3
  Self Denial authored Jan 15, 2024
```
Update gpu.go initGPUHandles() to declare gpuHandles variable before
reading it. This resolves an "invalid memory address or nil pointer
dereference" error.

Update dyn_ext_server.c to avoid setting the RTLD_DEEPBIND flag under
__TERMUX__ (Android).
```
  eb76f3e3
18 Jan, 2024 1 commit
- Mechanical switch from log to slog · fedd705a
  Daniel Hiltgen authored Jan 18, 2024
```
A few obvious levels were adjusted, but generally everything mapped to "info" level.
```
  fedd705a
17 Jan, 2024 1 commit
- Add multiple CPU variants for Intel Mac · 1b249748
  Daniel Hiltgen authored Jan 12, 2024
```
This also refines the build process for the ext_server build.
```
  1b249748
16 Jan, 2024 2 commits

Bump llama.cpp to b1842 and add new cuda lib dep · 795674dd

Daniel Hiltgen authored Jan 10, 2024

Upstream llama.cpp has added a new dependency on the
NVIDIA CUDA Driver Libraries (libcuda.so), which are part of the
driver distribution, not the general CUDA libraries, and are not
available as an archive, so we cannot statically link them. This may
introduce some additional compatibility challenges which we'll
need to keep an eye on.

795674dd

do not cache prompt (#2018) · a897e833
Bruce MacDonald authored Jan 16, 2024
```
- prompt cache causes inference to hang after some time
```
a897e833

14 Jan, 2024 3 commits
- Fix typo in arm mac arch script · 3ca5f69c
  Daniel Hiltgen authored Jan 14, 2024
  
  3ca5f69c
- Let gpu.go and gen_linux.sh also find CUDA on Arch Linux · f4bf1d51
  Alexander F. Rødseth authored Jan 14, 2024
  
  f4bf1d51
- Disable `mmap` with lora layers (#1985) · 557110d0
  Jeffrey Morgan authored Jan 13, 2024
  
  557110d0