Commits · 155734e09ae066efe26bca19d015ead10ea9d99b · OpenDAS / ollama

21 Nov, 2024 17 commits
- readme: add community integration py-gpt (#6503) · 155734e0
  Marcin Szczygliński authored Nov 21, 2024
  
  155734e0
- readme: add Promptery to community integrations (#7093) · 883d80e0
  Michael authored Nov 21, 2024
  
  883d80e0
- readme: add node-red-contrib-ollama to community integrations (#4648) · e4c9f75b
  Jakub Burkiewicz authored Nov 21, 2024
  
  e4c9f75b
- readme: add ollama grid search, a community project (#4301) · f5ec7cc8
  Dezoito authored Nov 21, 2024
  
  f5ec7cc8
- readme: Add LLPhant to community integrations (#5679) · 811bafba
  Franco Lombardo authored Nov 21, 2024
  
  811bafba
- readme: add autogpt integration to list of community integrations (#6459) · 431075fc
  Aarushi authored Nov 21, 2024
  
  431075fc
- readme: add community contribution to readme ollama-kis (#5575) · c4f27225
  Kevin Brake authored Nov 21, 2024
  
  c4f27225
- readme: Add tkinter-based client to community based integrations (#5412) · b7aa5ee0
  chyok authored Nov 21, 2024
  
  b7aa5ee0
- readme: add Shinkai Desktop to community integrations (#4877) · 3f87f717
  Nico authored Nov 21, 2024
  
  3f87f717
- readme: add OpenGPA to community integrations (#5497) · 20623cec
  Laurent Eschenauer authored Nov 21, 2024
  
  20623cec
- readme: add Haverscript to community integrations (#6945) · 0e5f31a8
  Andy Gill authored Nov 21, 2024
```
Haverscript uses classical functional programming techniques to provide a composable interface for interacting with ollama-hosted LLMs.
```
  0e5f31a8
- readme: Terminal app bb7 to community integrations (#7064) · 7e920917
  drunkwcodes authored Nov 21, 2024
  
  7e920917
- readme: update AMD ROCm links (#7213) · 1a742f54
  boessu authored Nov 21, 2024
  
  1a742f54
- readme: flutter-based chat app to community integrations (#7221) · 6a89dcf8
  奶茶叔叔 authored Nov 21, 2024
  
  6a89dcf8
- readme: orbiton to community integrations (#7770) · c5e238e8
  Alexander F. Rødseth authored Nov 21, 2024
  
  c5e238e8
- app: typo in wintray messages const (#7705) · fce30f40
  Nikita Ganzikov authored Nov 21, 2024
  
  fce30f40
- docs: Link to AMD guide on multi-GPU guidance (#7744) · d8632982
  Daniel Hiltgen authored Nov 20, 2024
  
  d8632982
20 Nov, 2024 14 commits

runner.go: Truncate inputs that exceed context rather than shifting · c4b34f2a

Jesse Gross authored Nov 20, 2024

Previous versions of the runner would truncate inputs to the context
window before beginning processing. The main processing loop relied
on this behavior if the context needed to be shifted later (due to
token generation). If truncation did not occur then invariants
would be broken, causing crashes or infinite loops.

Later versions attempted to fix these bugs and make the logic less
subtle so that all inputs could be handled. Truncation was removed
to make things consistent.

However, truncation is much faster than processing and shifting, so
removing it caused performance problems when the input vastly exceeded
the context size. This restores the input truncation as a performance
optimization while keeping the more robust processing logic.

Fixes #7762

c4b34f2a
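
The truncation described in c4b34f2a can be sketched roughly as follows. This is a minimal illustration, not the runner's actual code: `truncateInputs`, `numCtx`, and `numKeep` are hypothetical names, and the policy of keeping a fixed prefix plus the most recent tokens is an assumption, since the commit message only says that inputs are truncated to the context window.

```go
package main

import "fmt"

// truncateInputs is a hypothetical helper, not the runner's actual code.
// It shrinks inputs to at most numCtx tokens by keeping the first numKeep
// tokens and the most recent tokens, discarding the middle.
// numCtx is assumed to be non-negative.
func truncateInputs(inputs []int, numCtx, numKeep int) []int {
	if len(inputs) <= numCtx {
		return inputs // already fits, nothing to do
	}
	// Clamp numKeep so the kept prefix never exceeds the window.
	if numKeep < 0 {
		numKeep = 0
	}
	if numKeep > numCtx {
		numKeep = numCtx
	}
	discard := len(inputs) - numCtx
	out := make([]int, 0, numCtx)
	out = append(out, inputs[:numKeep]...)
	out = append(out, inputs[numKeep+discard:]...)
	return out
}

func main() {
	inputs := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
	fmt.Println(truncateInputs(inputs, 6, 2)) // prints [1 2 7 8 9 10]
}
```

Dropping the excess tokens up front is a slice operation, whereas processing them and then shifting the context would require evaluating every token first, which is presumably why the commit treats truncation as a performance optimization.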

runner.go: Don't add inputs to cache view until actually processed · c3ff9164

Jesse Gross authored Nov 19, 2024

We need to track which tokens are in the cache ourselves. We currently
add tokens to the cache tracker when we add them to batch but they are
not actually in the cache until we call Decode. This can cause
confusion when we are shifting the cache.

Avoids "could not find a KV slot for the batch" issues.

Bug #7545

c3ff9164

runner.go: Hard fail on errors rather than potentially infinite looping · 3fc1dc0e

Jesse Gross authored Nov 19, 2024

We try to recover from errors by dropping the tokens that caused the
problem and re-trying. However, dropping the tokens is not correct
and continuing often leads to infinite loops. To avoid this, we
end the sequence if such a condition is detected, which is also
surprising.

At this point, it is better to just report the error. This will make
it easier to find problems and the alternatives are perhaps even more
surprising to users.

This is not a very satisfactory solution either - we should isolate
the error and return it to the user without killing the whole process.
However, this is an incremental step and consistent with most other
failures (which either manifest as abort() or panic).

3fc1dc0e

runner.go: Retry decoding after defragmentation if needed · 7121dfa3

Jesse Gross authored Nov 19, 2024

Fragmentation of the KV cache can occur due to cache shifting or
different sequences getting processed. Decode uses a heuristic to
decide if it should defrag. However, this heuristic isn't 100%
accurate, so decoding can sometimes fail by surprise.

For these cases, if decode indicates that there is no KV cache space,
we should defrag and then try again.

7121dfa3

runner.go: Use correct index when retrieving embedding results · 5f68fcab

Jesse Gross authored Nov 19, 2024

This doesn't have any impact currently because NUM_PARALLEL is forced
to 1 for embeddings, so both indices will always be 0.

5f68fcab

readme: add llm-axe to community integrations (#5931) · ecf41eed

Emir Sahin authored Nov 20, 2024

ecf41eed

readme: add a swift community integration (#7383) · b8c66d33

Marcus Ziadé authored Nov 20, 2024

b8c66d33

readme: add vibe app to community integrations (#7607) · 303f4bc7

thewh1teagle authored Nov 20, 2024

303f4bc7

readme: add opentalkgpt to community integrations (#7707) · d2a25206

Adarsh Mishra authored Nov 21, 2024

d2a25206

docs: fix minor typo in import.md (#7764) · 2f0a8c87

rohitanshu authored Nov 20, 2024

change 'containg' to 'containing'

2f0a8c87

readme: add Abbey to community integrations (#7746) · bfd30f42

Gordon Kamer authored Nov 19, 2024

bfd30f42

readme: add Gollama to community integrations (#7756) · 0ef17ede

Jonathan Hecl authored Nov 20, 2024

0ef17ede

Improve crash reporting (#7728) · 909a88c5

Daniel Hiltgen authored Nov 19, 2024

Many model crashes are masked behind "An existing connection was forcibly closed by the remote host".
This captures that common error message and wires in any detected errors from the log.

This also adds the deepseek context shift error to the known errors we capture.

909a88c5
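
The idea in 909a88c5 of swapping a generic connection error for a more specific one found in the log might look something like the sketch below. Everything except the quoted connection message is an assumption: `explainError` is a hypothetical helper, and the sample log error in `main` is a placeholder, not a string the server is known to emit.

```go
package main

import (
	"fmt"
	"strings"
)

// connClosedMsg is the generic message quoted in the commit description.
const connClosedMsg = "An existing connection was forcibly closed by the remote host"

// explainError is a hypothetical helper. If err is the generic
// connection-closed message and the log scanner captured a more specific
// error, it returns the captured one; otherwise it returns err unchanged.
func explainError(err, lastLogErr string) string {
	if strings.Contains(err, connClosedMsg) && lastLogErr != "" {
		return lastLogErr
	}
	return err
}

func main() {
	generic := "read tcp: " + connClosedMsg
	// Placeholder text standing in for an error detected in the runner log.
	fromLog := "example: runner reported a fatal error"

	fmt.Println(explainError(generic, fromLog)) // the log error replaces the generic one
	fmt.Println(explainError(generic, ""))      // nothing captured, generic error is kept
	fmt.Println(explainError("timeout", fromLog)) // unrelated errors pass through
}
```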

expose underlying error on embedding failure (#7743) · f602ab4d

Daniel Hiltgen authored Nov 19, 2024

Avoid a round-trip asking users for logs to see what went wrong.

f602ab4d

19 Nov, 2024 5 commits

fix(runner): Set logits to 0 if false on Batch.Add · 807ace5b

Gabe Goodhart authored Nov 19, 2024

https://github.com/ollama/ollama/issues/7656


Branch: Granite3StoppingBug-7656
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

807ace5b

server: allow mixed-case model names on push, pull, cp, and create (#7676) · 4b8a2e34

Blake Mizerany authored Nov 19, 2024

This change allows for mixed-case model names to be pushed, pulled,
copied, and created, which was previously disallowed because the Ollama
registry was backed by a Docker registry that enforced a naming
convention that disallowed mixed-case names, which is no longer the
case.

This does not break existing, intended, behaviors.

Also, make TestCase test a story of creating, updating, pulling, and
copying a model with case variations, ensuring the model's manifest is
updated correctly, and not duplicated across different files with
different case variations.

4b8a2e34
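
The requirement in 4b8a2e34 that case variations of a name resolve to a single manifest, rather than creating duplicates, can be illustrated with a case-insensitive lookup. This is only a sketch under assumptions: the `manifests` map is a hypothetical in-memory stand-in for the on-disk store, and keeping the spelling that was first created is an assumed policy.

```go
package main

import (
	"fmt"
	"strings"
)

// manifests is a hypothetical in-memory stand-in for the on-disk manifest
// store. Keys are lowercased names; values are the names as first created.
type manifests map[string]string

// resolve returns the stored spelling for name if a case variation already
// exists, and otherwise records name as a new entry.
func (m manifests) resolve(name string) string {
	key := strings.ToLower(name)
	if existing, ok := m[key]; ok {
		return existing
	}
	m[key] = name
	return name
}

func main() {
	m := manifests{}
	fmt.Println(m.resolve("Library/MyModel")) // Library/MyModel
	fmt.Println(m.resolve("library/mymodel")) // Library/MyModel
	fmt.Println(len(m))                       // 1
}
```

Routing every create, pull, push, and copy through one resolution step like this is what keeps a later operation with different casing from writing a second manifest.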

Better error suppression when getting terminal colours (#7739) · e66c2926

frob authored Nov 19, 2024

Co-authored-by: Richard Lyons <frob@cloudstaff.com>

e66c2926

update the docs (#7731) · 712d63c3

Patrick Devine authored Nov 18, 2024

712d63c3

readme: add Alfred Ollama to community integrations (#7724) · 6cdf27d1

Patrick Sy authored Nov 19, 2024

6cdf27d1

18 Nov, 2024 4 commits
- Notify the user if systemd is not running (#6693) · 5c18e663
  frob authored Nov 19, 2024
```
Co-authored-by: Richard Lyons <frob@cloudstaff.com>
```
  5c18e663
- win: add right click menu support (#7727) · 35096a7e
  Daniel Hiltgen authored Nov 18, 2024
```
Enable both left and right click on the pop-up menu
```
  35096a7e
- fix index out of range on zero layer metal load (#7696) · 81d55d3e
  Daniel Hiltgen authored Nov 18, 2024
```
If the model doesn't fit any layers on metal, and we load zero layers
we would panic trying to look up the GPU size during scheduling ops
```
  81d55d3e
- readme: improve Community Integrations section (#7718) · a14f7649
  Vinh Nguyen authored Nov 18, 2024
  
  a14f7649