Commits · 3ebd6a83fcfbdecf3ccbae13ebaf4435c853465d · OpenDAS / ollama

25 Jan, 2024 7 commits
- update submodule to `cd4fddb29f81d6a1f6d51a0c016bc6b486d68def` · 3ebd6a83
  Jeffrey Morgan authored Jan 25, 2024
  
  3ebd6a83
- Fix clearing kv cache between requests with the same prompt (#2186) · a64570dc
  Jeffrey Morgan authored Jan 25, 2024
```
* Fix clearing kv cache between requests with the same prompt

* fix powershell script
```
  a64570dc
- Save and load sessions (#2063) · 7c40a678
  Patrick Devine authored Jan 25, 2024
  
  7c40a678
- Merge pull request #2181 from ollama/mxyng/stub-lint · e64b5b07
  Michael Yang authored Jan 25, 2024
```
stub generate outputs for lint
```
  e64b5b07
- Merge pull request #2175 from ollama/mxyng/refactor-tensor-read · 9e1e295c
  Michael Yang authored Jan 25, 2024
```
refactor tensor read
```
  9e1e295c
- Update README.md · a643823f
  Jeffrey Morgan authored Jan 24, 2024
  
  a643823f
- stub generate outputs for lint · 8e5d359a
  Michael Yang authored Jan 24, 2024
  
  8e5d359a
24 Jan, 2024 5 commits
- Merge pull request #2174 from dhiltgen/rocm_real_gpus · a170888d
  Daniel Hiltgen authored Jan 24, 2024
```
More logging for gpu management
```
  a170888d
- refactor tensor read · cd22855e
  Michael Yang authored Jan 24, 2024
  
  cd22855e
- More logging for gpu management · 013fd071
  Daniel Hiltgen authored Jan 24, 2024
```
Fix an ordering glitch of dlerr/dlclose and add more logging to help
root cause some crashes users are hitting. This also refines the
function pointer names to use the underlying function names instead
of simplified names for readability.
```
  013fd071
- Merge pull request #2162 from dhiltgen/rocm_real_gpus · f63dc2db
  Daniel Hiltgen authored Jan 23, 2024
```
Report more information about GPUs in verbose mode
```
  f63dc2db
- Update README.md · eaa5a396
  Jeffrey Morgan authored Jan 23, 2024
  
  eaa5a396
23 Jan, 2024 8 commits
- Update README.md · 8ed22f5d
  Jeffrey Morgan authored Jan 23, 2024
  
  8ed22f5d
- Report more information about GPUs in verbose mode · 987c16b2
  Daniel Hiltgen authored Jan 22, 2024
```
This adds additional calls to both CUDA and ROCm management libraries to
discover additional attributes about the GPU(s) detected in the system, and
wires up runtime verbosity selection.  When users hit problems with GPUs we can
ask them to run with `OLLAMA_DEBUG=1 ollama serve` and share the results.
```
  987c16b2
- Update README.md · 950f636d
  Jeffrey Morgan authored Jan 23, 2024
  
  950f636d
- Load all layers on `arm64` macOS if model is small enough (#2149) · 4458efb7
  Jeffrey Morgan authored Jan 22, 2024
  
  4458efb7
- Merge pull request #2150 from dhiltgen/default_version · ceea5994
  Daniel Hiltgen authored Jan 22, 2024
```
Set a default version using git describe
```
  ceea5994
- Set a default version using git describe · 3005ec74
  Daniel Hiltgen authored Jan 22, 2024
```
If a VERSION is not specified, this will generate a version string that
represents the state of the repo.  For example `0.1.21-12-gffaf52e1-dirty`
representing 12 commits away from 0.1.21 tag, on commit gffaf52e1
and the tree is dirty.
```
  3005ec74
- Merge pull request #2148 from dhiltgen/intel_mac · 0759d899
  Daniel Hiltgen authored Jan 22, 2024
```
Refine Accelerate usage on mac
```
  0759d899
- Refine Accelerate usage on mac · 0f5b8433
  Daniel Hiltgen authored Jan 22, 2024
```
For old macs, accelerate seems to cause crashes, but for
AVX2 capable macs, it does not.
```
  0f5b8433
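
The "Set a default version using git describe" commit above gives `0.1.21-12-gffaf52e1-dirty` as an example version string. The following is a minimal sketch of how such a string could be split into its parts. The type `DescribeVersion` and the function `ParseDescribe` are hypothetical names for illustration and are not taken from the ollama source. Note that the leading `g` before the hash is a marker added by `git describe`, not part of the abbreviated commit hash itself.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// DescribeVersion holds the parts of a git-describe-style version string.
// Hypothetical type, for illustration only.
type DescribeVersion struct {
	Tag     string // nearest tag, e.g. "0.1.21"
	Commits int    // number of commits since that tag
	Commit  string // abbreviated commit hash, without the "g" prefix
	Dirty   bool   // true when the string ends in "-dirty"
}

// Matches "<tag>-<count>-g<hash>" with an optional "-dirty" suffix.
var describeRE = regexp.MustCompile(`^(.+)-(\d+)-g([0-9a-f]+)(-dirty)?$`)

// ParseDescribe splits a version string such as "0.1.21-12-gffaf52e1-dirty".
// It rejects strings that lack the count and hash, for example a bare tag.
func ParseDescribe(s string) (DescribeVersion, error) {
	m := describeRE.FindStringSubmatch(s)
	if m == nil {
		return DescribeVersion{}, fmt.Errorf("unrecognized version string %q", s)
	}
	n, err := strconv.Atoi(m[2])
	if err != nil {
		return DescribeVersion{}, err
	}
	return DescribeVersion{Tag: m[1], Commits: n, Commit: m[3], Dirty: m[4] != ""}, nil
}

func main() {
	v, err := ParseDescribe("0.1.21-12-gffaf52e1-dirty")
	if err != nil {
		panic(err)
	}
	fmt.Printf("tag=%s commits=%d commit=%s dirty=%t\n", v.Tag, v.Commits, v.Commit, v.Dirty)
}
```

A build exactly on a tag would produce only the tag name, which this sketch deliberately reports as an error rather than guessing at the missing fields.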
22 Jan, 2024 12 commits
- update submodule to `011e8ec577fd135cbc02993d3ea9840c516d6a1c` · ffaf52e1
  Jeffrey Morgan authored Jan 22, 2024
  
  ffaf52e1
- Merge pull request #2144 from jmorganca/mxyng/update-faq · 940b10b0
  Michael Yang authored Jan 22, 2024
```
faq: update to use launchctl setenv
```
  940b10b0
- Merge pull request #2143 from dhiltgen/llm_verbosity · 3bc28736
  Daniel Hiltgen authored Jan 22, 2024
```
Refine debug logging for llm
```
  3bc28736
- faq: update to use launchctl setenv · 93a75626
  Michael Yang authored Jan 22, 2024
  
  93a75626
- Merge pull request #2142 from dhiltgen/debug_on_fail · a0a829bf
  Daniel Hiltgen authored Jan 22, 2024
```
Debug logging on init failure
```
  a0a829bf
- Refine debug logging for llm · 730dcfcc
  Daniel Hiltgen authored Jan 22, 2024
```
This wires up logging in llama.cpp to always go to stderr, and also
turns up logging if OLLAMA_DEBUG is set.
```
  730dcfcc
- Debug logging on init failure · 27a2d5af
  Daniel Hiltgen authored Jan 22, 2024
  
  27a2d5af
- update submodule to `6f9939d` (#2115) · 5f81a33f
  Jeffrey Morgan authored Jan 22, 2024
  
  5f81a33f
- Merge pull request #2102 from jmorganca/mxyng/fix-create-override · 6225fde0
  Michael Yang authored Jan 22, 2024
```
fix: remove overwritten model layers
```
  6225fde0
- readline: drop not use min function (#2134) · 06918456
  Meng Zhuo authored Jan 23, 2024
  
  06918456
- Merge pull request #2130 from dhiltgen/more_faster · 5576bb23
  Daniel Hiltgen authored Jan 21, 2024
```
Make CPU builds parallel and customizable AMD GPUs
```
  5576bb23
- Merge pull request #2131 from dhiltgen/probe_cards_at_init · 27388377
  Daniel Hiltgen authored Jan 21, 2024
```
Probe GPUs before backend init
```
  27388377
21 Jan, 2024 5 commits

- Probe GPUs before backend init · ec376453
  Daniel Hiltgen authored Jan 21, 2024
  
  Detect potential error scenarios so we can fall back to CPU mode without
  hitting asserts.
  
  ec376453
- Make CPU builds parallel and customizable AMD GPUs · df54c723
  Daniel Hiltgen authored Jan 21, 2024
  
  The linux build now supports parallel CPU builds to speed things up.
  This also exposes AMD GPU targets as an optional setting for advanced
  users who want to alter our default set.
  
  df54c723
- Merge pull request #2127 from dhiltgen/rocm_container · fa8c990e
  Daniel Hiltgen authored Jan 21, 2024
```
Combine the 2 Dockerfiles and add ROCm
```
  fa8c990e
- Combine the 2 Dockerfiles and add ROCm · da72235e
  Daniel Hiltgen authored Jan 21, 2024
  
  This renames Dockerfile.build to Dockerfile, and adds some new stages
  to support 2 modes of building - the build_linux.sh script uses
  intermediate stages to extract the artifacts for ./dist, and the default
  build generates a container image usable by both cuda and rocm cards.
  This required transitioning the x86 base to the rocm image to avoid
  layer bloat.
  
  da72235e
- Unlock mutex when failing to load model (#2117) · 89c4aee2
  Jeffrey Morgan authored Jan 20, 2024
  
  89c4aee2

20 Jan, 2024 2 commits
- increase minimum overhead to 1024MiB (#2114) · f32ea81b
  Jeffrey Morgan authored Jan 20, 2024
  
  f32ea81b
- sign dylibs on macOS (#2101) · 4c54f0dd
  Jeffrey Morgan authored Jan 19, 2024
  
  4c54f0dd
19 Jan, 2024 1 commit

- fix: remove overwritten model layers · c08dfaa2
  Michael Yang authored Jan 19, 2024
  
  if create overrides a manifest, first add the older manifest's layers to
  the delete map so they can be cleaned up
  
  c08dfaa2
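
The "fix: remove overwritten model layers" commit describes adding the older manifest's layers to a delete map when a create overrides an existing manifest. A minimal sketch of that idea follows. The `Manifest` type and `layersToDelete` function are hypothetical simplifications, and sparing layers that the new manifest still references is an assumption added here for safety, not something the commit message states.

```go
package main

import "fmt"

// Manifest is a simplified, hypothetical stand-in for a model manifest:
// just a list of layer digests.
type Manifest struct {
	Layers []string
}

// layersToDelete returns the digests referenced by the previous manifest
// that the replacement manifest no longer uses.
func layersToDelete(prev, next Manifest) map[string]struct{} {
	deleteMap := make(map[string]struct{})

	// First, mark every layer of the older manifest for deletion.
	for _, digest := range prev.Layers {
		deleteMap[digest] = struct{}{}
	}

	// Then unmark any layer the new manifest still references (assumption).
	for _, digest := range next.Layers {
		delete(deleteMap, digest)
	}
	return deleteMap
}

func main() {
	prev := Manifest{Layers: []string{"sha256:aaa", "sha256:bbb"}}
	next := Manifest{Layers: []string{"sha256:bbb", "sha256:ccc"}}

	for digest := range layersToDelete(prev, next) {
		fmt.Println("would delete", digest)
	}
}
```

Using a map keyed by digest keeps both marking and unmarking a layer constant-time, which is why a set-like map fits this cleanup step.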