Commits · 237dccba1edb41bb65ed1ffc6eafdd40dd6085e4 · OpenDAS / ollama

14 Aug, 2024 2 commits
- skip invalid manifest files · 237dccba
  Michael Yang authored Aug 14, 2024
  
  237dccba
- fix noprune · b3f75fc8
  Michael Yang authored Aug 14, 2024
  
  b3f75fc8
13 Aug, 2024 3 commits

server: reduce max connections used in download (#6347) · 8e1050f3

Blake Mizerany authored Aug 13, 2024

The previous value of 64 was WAY too high and unnecessary. It reached
diminishing returns and blew past it. This is a more reasonable number
for _most_ normal cases. For users on cloud servers with excellent
network quality, this will keep screaming for them, without hitting our
CDN limits. For users with relatively poor network quality, this will
keep them from saturating their network and causing other issues.
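
As a rough illustration of capping concurrent part downloads, here is a minimal sketch using a buffered channel as a counting semaphore. The names `maxParts` and `downloadAll` and the value 16 are assumptions for this example; the commit message does not state the new limit or how the client enforces it.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// maxParts is a hypothetical cap on concurrent part downloads; the value
// actually chosen by the commit is not stated in its message.
const maxParts = 16

// downloadAll runs fetch for every part, never more than limit at a time,
// and returns the highest number of fetches observed in flight.
func downloadAll(parts []int, limit int, fetch func(part int)) int64 {
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup
	var inFlight, peak atomic.Int64

	for _, p := range parts {
		sem <- struct{}{} // blocks while limit fetches are already running
		wg.Add(1)
		go func(p int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot after the fetch finishes
			n := inFlight.Add(1)
			for { // record a new peak if this fetch raised it
				old := peak.Load()
				if n <= old || peak.CompareAndSwap(old, n) {
					break
				}
			}
			fetch(p)
			inFlight.Add(-1)
		}(p)
	}
	wg.Wait()
	return peak.Load()
}

func main() {
	parts := make([]int, 64)
	for i := range parts {
		parts[i] = i
	}
	peak := downloadAll(parts, maxParts, func(int) {
		time.Sleep(5 * time.Millisecond) // stand-in for a ranged HTTP request
	})
	fmt.Printf("%d parts fetched, at most %d in flight (cap %d)\n", len(parts), peak, maxParts)
}
```

Lowering the cap trades a little peak throughput on very fast links for less pressure on the CDN and on slower home networks, which is the trade-off the message describes.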

8e1050f3

lint · 2697d7f5

Michael Yang authored Aug 13, 2024

- fixes printf: non-constant format string in call to fmt.Printf
- fixes SA1032: arguments have the wrong order
- disables testifylint

2697d7f5

Load Embedding Model on Empty Input (#6325) · 8b00a415
royjhan authored Aug 13, 2024
```
* load on empty input

* no load on invalid input
```
8b00a415

12 Aug, 2024 3 commits
- cmd: speed up gguf creates (#6324) · 980dd15f
  Josh authored Aug 12, 2024
  
  980dd15f
- Revert "server: speed up single gguf creates (#5898)" (#6323) · 1dc3ef3a
  Josh authored Aug 12, 2024
```
This reverts commit 8aac2243.
```
  1dc3ef3a
- server: speed up single gguf creates (#5898) · 8aac2243
  Josh authored Aug 12, 2024
  
  8aac2243
11 Aug, 2024 1 commit

server: parallelize embeddings in API web handler instead of in subprocess runner (#6220) · 15c2d8fe

Jeffrey Morgan authored Aug 11, 2024

For simplicity, perform parallelization of embedding requests in the API handler instead of offloading this to the subprocess runner. This keeps the scheduling story simpler as it builds on existing parallel requests, similar to existing text completion functionality.
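
A minimal sketch of the idea, assuming a hypothetical `embed` callback that stands in for one request to the runner: the handler fans out one goroutine per input and collects results in input order. `embedAll` and `fakeEmbed` are illustrative names, not the server's actual functions.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// embedAll calls embed once per input concurrently and returns the
// embeddings in the same order as the inputs. The first error found wins.
func embedAll(inputs []string, embed func(string) ([]float32, error)) ([][]float32, error) {
	out := make([][]float32, len(inputs))
	errs := make([]error, len(inputs))

	var wg sync.WaitGroup
	for i, in := range inputs {
		wg.Add(1)
		go func(i int, in string) {
			defer wg.Done()
			// Each goroutine writes only to its own index, so no lock is needed.
			out[i], errs[i] = embed(in)
		}(i, in)
	}
	wg.Wait()

	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return out, nil
}

// fakeEmbed is a stand-in for a real model call: it returns a one-element
// vector holding the input length.
func fakeEmbed(s string) ([]float32, error) {
	if s == "" {
		return nil, errors.New("empty input")
	}
	return []float32{float32(len(s))}, nil
}

func main() {
	out, err := embedAll([]string{"a", "bb", "ccc"}, fakeEmbed)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```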

15c2d8fe

09 Aug, 2024 1 commit
- Don't hard fail on sparse setup error · 2fa1db43
  Daniel Hiltgen authored Aug 09, 2024
```
It seems this can fail in some cases, but proceed
with the download anyway.
```
  2fa1db43
08 Aug, 2024 2 commits

server/download.go: Fix a typo in log · 7b61eba4
Jitang Lei authored Aug 08, 2024
```
Signed-off-by: Jitang Lei <leijitang@outlook.com>
```
7b61eba4

manifest: Store layers inside manifests consistently as values. · 7edaf6e7

Jesse Gross authored Aug 07, 2024

Commit 1829fb61 ("manifest: Fix crash on startup when trying to clean up
unused files (#5840)") changed the config layer stored in manifests
from a pointer to a value. This was done in order to avoid potential
nil pointer dereferences after it is deserialized from JSON in the
event that the field is missing.

This changes the Layers slice to also be stored by value. This enables
consistency in handling across the two objects.
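
The difference between the two representations can be sketched with `encoding/json`. The `Layer`, `ptrManifest`, and `valManifest` types below are simplified, hypothetical stand-ins, not the project's actual definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Layer is a simplified stand-in for a manifest layer entry.
type Layer struct {
	Digest string `json:"digest"`
	Size   int64  `json:"size"`
}

// ptrManifest stores the config and layers as pointers.
type ptrManifest struct {
	Config *Layer   `json:"config"`
	Layers []*Layer `json:"layers"`
}

// valManifest stores the config and layers by value.
type valManifest struct {
	Config Layer   `json:"config"`
	Layers []Layer `json:"layers"`
}

func decodePtr(data []byte) (ptrManifest, error) {
	var m ptrManifest
	err := json.Unmarshal(data, &m)
	return m, err
}

func decodeVal(data []byte) (valManifest, error) {
	var m valManifest
	err := json.Unmarshal(data, &m)
	return m, err
}

func main() {
	// The config field is missing and one layer entry is null.
	data := []byte(`{"layers":[{"digest":"sha256:aaa","size":1},null]}`)

	p, err := decodePtr(data)
	if err != nil {
		panic(err)
	}
	v, err := decodeVal(data)
	if err != nil {
		panic(err)
	}

	// Pointer form: both are nil, so p.Config.Digest would panic.
	fmt.Println(p.Config == nil, p.Layers[1] == nil)
	// Value form: both are zero values and safe to read.
	fmt.Println(v.Config.Digest == "", v.Layers[1].Size)
}
```

With the value form, callers can read fields without a nil check and treat an empty digest as "not present".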

7edaf6e7

07 Aug, 2024 3 commits

image: Clarify argument to WriteManifest is config · 97ec8cfd

Jesse Gross authored Aug 07, 2024

When creating a model the config layer is appended to the list of
layers and then the last layer is used as the config when writing the
manifest. This change directly uses the config layer to write the
manifest. There is no behavior change but it is less error prone.

97ec8cfd

manifest: Fix crash on startup when trying to clean up unused files (#5840) · 1829fb61

Jesse Gross authored Aug 05, 2024

Currently if the config field is missing in the manifest file (or
corrupted), Ollama will crash when it tries to read it. This can
happen at startup or when pulling new models.

This data is mostly just used for showing model information so we
can be tolerant of it not being present - it is not required to
run the models. Besides avoiding crashing, this also gives us the
ability to restructure the config in the future by pulling it
into the main manifest file.

1829fb61

manifest: Don't prune layers if we can't open a manifest file · 685a5353

Jesse Gross authored Aug 01, 2024

If there is an error when opening a manifest file (corrupted, permission denied, etc.)
then the referenced layers will not be included in the list of active
layers. This causes them to be deleted when pruning happens at startup
or a model is pulled.

In such a situation, we should prefer to preserve data in the hopes that
it can be recovered rather than being aggressive about deletion.
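
One conservative way to express this policy, sketched under assumptions: `open` is a hypothetical loader that returns the layer digests a manifest references, and pruning is skipped entirely if any manifest cannot be read. The function names are illustrative and the project's actual pruning code may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// activeLayers collects every digest referenced by the given manifests.
// It fails if any manifest cannot be opened, so callers never see a
// partial (and therefore misleading) set of active layers.
func activeLayers(paths []string, open func(string) ([]string, error)) (map[string]struct{}, error) {
	active := make(map[string]struct{})
	for _, p := range paths {
		digests, err := open(p)
		if err != nil {
			return nil, fmt.Errorf("manifest %s: %w", p, err)
		}
		for _, d := range digests {
			active[d] = struct{}{}
		}
	}
	return active, nil
}

// prunable returns the blobs that are safe to delete. If the active set
// is unknown because a manifest was unreadable, nothing is prunable.
func prunable(blobs, paths []string, open func(string) ([]string, error)) []string {
	active, err := activeLayers(paths, open)
	if err != nil {
		return nil // prefer keeping data over deleting it
	}
	var out []string
	for _, b := range blobs {
		if _, ok := active[b]; !ok {
			out = append(out, b)
		}
	}
	return out
}

// fakeOpen is a stand-in manifest loader for the example.
func fakeOpen(path string) ([]string, error) {
	switch path {
	case "good":
		return []string{"sha256:aaa"}, nil
	default:
		return nil, errors.New("permission denied")
	}
}

func main() {
	blobs := []string{"sha256:aaa", "sha256:bbb"}
	fmt.Println(prunable(blobs, []string{"good"}, fakeOpen))
	fmt.Println(len(prunable(blobs, []string{"good", "bad"}, fakeOpen)))
}
```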

685a5353

06 Aug, 2024 1 commit

Ensure sparse files on Windows during download · fc85f50a

Daniel Hiltgen authored Aug 06, 2024

The file.Truncate call on Windows will write the whole file
unless you set the sparse flag, leading to heavy I/O at the
beginning of the download. This should improve our
I/O behavior on Windows and put less stress on the user's disk.

fc85f50a

02 Aug, 2024 2 commits
- use testing tempdirs · a091fadf
  Michael Yang authored Aug 02, 2024
  
  a091fadf
- lint · b732beba
  Michael Yang authored Aug 01, 2024
  
  b732beba
01 Aug, 2024 5 commits
- Refactor and format code. · 8a9f946c
  Vyacheslav Moskalev authored Aug 02, 2024
  
  8a9f946c
- Refactor code. Remove extra variable. · 3b521054
  Vyacheslav Moskalev authored Aug 01, 2024
  
  3b521054
- Better types and naming closer to style. · b0c21658
  Vyacheslav Moskalev authored Aug 01, 2024
  
  b0c21658
- Change the order of context and prompt. · 49a54831
  Vyacheslav Moskalev authored Aug 01, 2024
  
  49a54831
- Fix extra context concatenation in generate handler (#5980). · 6bc5c137
  Vyacheslav Moskalev authored Aug 01, 2024
  
  6bc5c137
31 Jul, 2024 5 commits
- fix modelfile message quotes · d87b4a48
  Michael Yang authored Jul 31, 2024
  
  d87b4a48
- server: fix json marshalling of downloadBlobPart (#6108) · dc77bbcf
  Blake Mizerany authored Jul 31, 2024
  
  dc77bbcf
- convert: only extract large files · eafc607a
  Michael Yang authored Jun 29, 2024
  
  eafc607a
- comments · df993fa3
  Michael Yang authored Jul 08, 2024
  
  df993fa3
- refactor convert · 5e9db9fb
  Michael Yang authored May 31, 2024
  
  5e9db9fb
30 Jul, 2024 2 commits

Add Metrics to `api/embed` response (#5709) · 1b44d873

royjhan authored Jul 30, 2024

* add prompt tokens to embed response

* rm slog

* metrics

* types

* prompt n

* clean up

* reset submodule

* update tests

* test name

* list metrics

1b44d873

Prevent partial loading on mixed GPU brands · 34542099

Daniel Hiltgen authored Jul 22, 2024

In multi-brand GPU setups, if we couldn't fully load the model we
would fall through the scheduler and mistakenly try to load across
a mix of brands. This makes sure we find the set of GPU(s) that
best fits the partial load.

34542099

26 Jul, 2024 3 commits

server: fix race conditions during download (#5994) · 750c1c55

Blake Mizerany authored Jul 26, 2024

This fixes various data races scattered throughout the download/pull
client where the client was accessing the download state concurrently.

This commit is mostly a hot-fix and will be replaced by a new client one
day soon.

Also, remove the unnecessary opts argument from downloadChunk.
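
For readers unfamiliar with this class of bug, here is a generic sketch of making shared download progress safe to read while workers update it, using `sync/atomic`. The `part` and `download` types are hypothetical, and the commit message does not say which synchronization primitives the fix actually uses.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// part tracks the progress of one chunk of a download.
type part struct {
	Size      int64
	completed atomic.Int64 // updated by a worker, read by progress reporters
}

type download struct {
	parts []*part
}

// Completed sums progress across parts. It is safe to call while
// workers are still writing because every access goes through atomics.
func (d *download) Completed() int64 {
	var total int64
	for _, p := range d.parts {
		total += p.completed.Load()
	}
	return total
}

// simulate runs one worker per part and returns the final byte count.
func simulate(numParts int, partSize int64) int64 {
	d := &download{}
	for i := 0; i < numParts; i++ {
		d.parts = append(d.parts, &part{Size: partSize})
	}

	var wg sync.WaitGroup
	for _, p := range d.parts {
		wg.Add(1)
		go func(p *part) {
			defer wg.Done()
			for p.completed.Load() < p.Size {
				p.completed.Add(1)
				_ = d.Completed() // concurrent read, as a progress reporter would do
			}
		}(p)
	}
	wg.Wait()
	return d.Completed()
}

func main() {
	fmt.Println(simulate(8, 1000)) // prints 8000
}
```

With plain `int64` fields instead of atomics, the same program would be flagged by `go run -race`.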

750c1c55

fix nil deref in auth.go · a622c47b

Michael Yang authored Jul 26, 2024

a622c47b

include modelfile messages · 15af5584

Michael Yang authored Jun 19, 2024

15af5584

25 Jul, 2024 1 commit

server: reuse original download URL for images (#5962) · c8af3c2d

Blake Mizerany authored Jul 25, 2024

This changes the registry client to reuse the original download URL
it gets on the first redirect response for all subsequent requests,
preventing thundering herd issues when hot new LLMs are released.

c8af3c2d

22 Jul, 2024 6 commits
- fix dupe err message (#5857) · db0968f3
  Josh authored Jul 22, 2024
  
  db0968f3
- comments · 85d9d73a
  Michael Yang authored Jul 08, 2024
  
  85d9d73a
- uint64 · 1954ec59
  Michael Yang authored Jul 03, 2024
  
  1954ec59
- int · 0f191012
  Michael Yang authored Jul 03, 2024
  
  0f191012
- keepalive · 8570c1c0
  Michael Yang authored Jul 03, 2024
  
  8570c1c0
- bool · 55cd3ddc
  Michael Yang authored Jul 03, 2024
  
  55cd3ddc