Commits · 77ccbf04dc8d3854dc2c0aafe7d1d03a50fe81a0 · OpenDAS / ollama

02 Aug, 2024 1 commit
- lint · b732beba
  Michael Yang authored Aug 01, 2024
31 Jul, 2024 1 commit
- fix modelfile message quotes · d87b4a48
  Michael Yang authored Jul 31, 2024
26 Jul, 2024 1 commit
- include modelfile messages · 15af5584
  Michael Yang authored Jun 19, 2024
25 Jul, 2024 1 commit

- server: reuse original download URL for images (#5962) · c8af3c2d
  Blake Mizerany authored Jul 25, 2024

  This changes the registry client to reuse the original download URL
  it gets on the first redirect response for all subsequent requests,
  preventing thundering herd issues when hot new LLMs are released.
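The idea in the c8af3c2d message (resolve the redirect once, then reuse the resulting URL) can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `resolveRedirect`, `directURLCache`, and `newDirectURLCache` are hypothetical names, and the resolver is injected so the caching behaviour can be exercised without a network.

```go
package main

import (
	"errors"
	"net/http"
	"sync"
)

// resolveRedirect (hypothetical helper) issues one request without following
// redirects and returns the Location the server points at, or the original
// URL when the response is not a redirect.
func resolveRedirect(rawURL string) (string, error) {
	client := &http.Client{
		CheckRedirect: func(*http.Request, []*http.Request) error {
			return http.ErrUseLastResponse // stop at the first redirect
		},
	}
	resp, err := client.Get(rawURL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	if resp.StatusCode >= 300 && resp.StatusCode < 400 {
		loc, err := resp.Location()
		if err != nil {
			return "", err
		}
		return loc.String(), nil
	}
	return rawURL, nil
}

// directURLCache (hypothetical type) remembers the resolved URL per registry
// URL so later requests skip the redirecting endpoint entirely.
type directURLCache struct {
	mu      sync.Mutex
	direct  map[string]string
	resolve func(string) (string, error)
}

func newDirectURLCache(resolve func(string) (string, error)) *directURLCache {
	return &directURLCache{direct: make(map[string]string), resolve: resolve}
}

// Get returns the cached direct URL, resolving it on first use only.
// The lock is held during resolution, so concurrent callers wait for the
// first lookup instead of each hitting the registry.
func (c *directURLCache) Get(registryURL string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if u, ok := c.direct[registryURL]; ok {
		return u, nil
	}
	if c.resolve == nil {
		return "", errors.New("no resolver configured")
	}
	u, err := c.resolve(registryURL)
	if err != nil {
		return "", err
	}
	c.direct[registryURL] = u
	return u, nil
}
```

In this sketch a downloader would call `newDirectURLCache(resolveRedirect)` once and use `Get` before each chunk request, so only the first chunk touches the redirecting endpoint.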

22 Jul, 2024 1 commit
- bool · 55cd3ddc
  Michael Yang authored Jul 03, 2024
19 Jul, 2024 1 commit
- server: validate template (#5734) · e8b954c6
  Josh authored Jul 19, 2024
```
add template validation to modelfile
```
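As a rough illustration of what "template validation" can mean for a Go `text/template` string, the sketch below parses the template and reports any syntax error. It is an assumption that validation amounts to a parse check; `validateTemplate` is a hypothetical name, not the function added in e8b954c6.

```go
package main

import "text/template"

// validateTemplate (hypothetical helper) reports whether s is syntactically
// valid Go text/template source. Parsing checks syntax only; it does not
// verify that referenced fields such as .Prompt exist at execution time.
func validateTemplate(s string) error {
	_, err := template.New("modelfile").Parse(s)
	return err
}
```

Rejecting a malformed template at create time surfaces the error to the person writing the Modelfile rather than at the first generate request.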
16 Jul, 2024 1 commit

- add suffix support to generate endpoint · d290e875
  Michael Yang authored Jun 20, 2024

  this change is triggered by the presence of "suffix", particularly
  useful for code completion tasks
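A request that triggers this path might be built as in the sketch below. Only the `suffix` key comes from the commit message; the struct, the helper name `buildFillInMiddle`, and the choice of fields shown are illustrative assumptions rather than the project's own request type.

```go
package main

import "encoding/json"

// generateRequest is a hypothetical, trimmed-down request body. The text
// before the gap goes in Prompt and the text after the gap goes in Suffix.
type generateRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
	Suffix string `json:"suffix,omitempty"`
}

// buildFillInMiddle encodes a body asking the model to fill the gap
// between the two pieces of text.
func buildFillInMiddle(model, before, after string) ([]byte, error) {
	return json.Marshal(generateRequest{
		Model:  model,
		Prompt: before,
		Suffix: after,
	})
}
```

With `omitempty`, an empty suffix leaves the key out of the body altogether, which matches the "triggered by the presence of" wording: ordinary generate requests are unaffected.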

15 Jul, 2024 1 commit
- tools · d02bbebb
  Michael Yang authored Jun 20, 2024
05 Jul, 2024 1 commit
- update message processing · 269ed6e6
  Michael Yang authored Jun 17, 2024
01 Jul, 2024 4 commits
- use kvs to detect embedding models · da8e2a04
  Michael Yang authored Jun 14, 2024
- add capabilities · a30915bd
  Michael Yang authored Jun 11, 2024
- rename templates to template · 58e3fff3
  Michael Yang authored Jun 10, 2024
- remove ManifestV2 · 3f0b309a
  Michael Yang authored Jun 10, 2024
25 Jun, 2024 1 commit

- llm: speed up gguf decoding by a lot (#5246) · cb42e607
  Blake Mizerany authored Jun 24, 2024

  Previously, some costly things were causing the loading of GGUF files
  and their metadata and tensor information to be VERY slow:

    * Too many allocations when decoding strings
    * Hitting disk for each read of each key and value, resulting in a
      not-okay amount of syscalls/disk I/O.

  The show API is now down to 33ms from 800ms+ for llama3 on a macbook pro
  m3.

  This commit also prevents collecting large arrays of values when
  decoding GGUFs (if desired). When such keys are encountered, their
  values are null, and are encoded as such in JSON.

  Also, this fixes a broken test that was not encoding valid GGUF.
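The two bottlenecks named in cb42e607 (per-read disk access and per-string allocations) and the skipping of large arrays can be illustrated with a small buffered decoder. This is a simplified sketch under stated assumptions, not the commit's implementation: the `decoder` type, its methods, and `maxStringLen` are hypothetical, and the array layout is reduced to a little-endian `uint64` count followed by `uint32` values.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"errors"
	"io"
)

// maxStringLen is an illustrative safety bound, not a GGUF constant.
const maxStringLen = 1 << 20

// decoder reads through a bufio.Reader so that many small reads are served
// from memory instead of each one reaching the underlying file.
type decoder struct {
	r       *bufio.Reader
	scratch []byte // reused across readString calls
}

func newDecoder(r io.Reader) *decoder {
	return &decoder{r: bufio.NewReaderSize(r, 1<<16)}
}

// newDecoderFromBytes is a convenience wrapper for in-memory data.
func newDecoderFromBytes(b []byte) *decoder {
	return newDecoder(bytes.NewReader(b))
}

func (d *decoder) readUint64() (uint64, error) {
	var b [8]byte
	if _, err := io.ReadFull(d.r, b[:]); err != nil {
		return 0, err
	}
	return binary.LittleEndian.Uint64(b[:]), nil
}

// readString reads a length-prefixed string, reusing the scratch buffer so
// the only per-call allocation is the returned string itself.
func (d *decoder) readString() (string, error) {
	n, err := d.readUint64()
	if err != nil {
		return "", err
	}
	if n > maxStringLen {
		return "", errors.New("string too long")
	}
	if uint64(cap(d.scratch)) < n {
		d.scratch = make([]byte, n)
	}
	buf := d.scratch[:n]
	if _, err := io.ReadFull(d.r, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

// readUint32Array returns nil (which encodes as JSON null) when the array
// holds more than maxLen elements, discarding the bytes instead of
// collecting them. A real decoder would also bound n before multiplying.
func (d *decoder) readUint32Array(maxLen uint64) ([]uint32, error) {
	n, err := d.readUint64()
	if err != nil {
		return nil, err
	}
	if n > maxLen {
		_, err := d.r.Discard(int(n) * 4)
		return nil, err
	}
	out := make([]uint32, n)
	if err := binary.Read(d.r, binary.LittleEndian, out); err != nil {
		return nil, err
	}
	return out, nil
}
```

Because a skipped array is still consumed from the stream, the decoder stays aligned on the next key even when the values themselves are never materialised.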

21 Jun, 2024 1 commit
- fix: quantization with template · e835ef18
  Michael Yang authored Jun 21, 2024
13 Jun, 2024 1 commit
- server: remove jwt decoding error (#5027) · 1fd236d1
  Jeffrey Morgan authored Jun 13, 2024
12 Jun, 2024 1 commit

- fix: multiple templates when creating from model · c16f8af9
  Michael Yang authored Jun 12, 2024

  multiple templates may appear in a model if a model is created from
  another model that 1) has an autodetected template and 2) defines a
  custom template
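One way to picture the situation described in c16f8af9: the parent model already carries a template layer, and the new Modelfile supplies another. The sketch below keeps only the custom one. It is a guess at the shape of such a fix, not the commit's code; the `layer` type, the `withSingleTemplate` helper, and the media type string are assumptions made for illustration.

```go
package main

import "slices"

// templateMediaType is assumed here for illustration.
const templateMediaType = "application/vnd.ollama.image.template"

// layer is a hypothetical, minimal stand-in for a manifest layer.
type layer struct {
	MediaType string
	Digest    string
}

// withSingleTemplate returns the parent's layers with any inherited template
// layers replaced by the custom one. When no custom template is given, the
// parent's layers are returned unchanged. The parent slice is never modified.
func withSingleTemplate(parent []layer, custom *layer) []layer {
	layers := slices.Clone(parent)
	if custom == nil {
		return layers
	}
	layers = slices.DeleteFunc(layers, func(l layer) bool {
		return l.MediaType == templateMediaType
	})
	return append(layers, *custom)
}
```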

07 Jun, 2024 1 commit
- fix create model when template detection errors · 030e765e
  Michael Yang authored Jun 07, 2024
06 Jun, 2024 1 commit
- detect chat template from KV · 9b6c2e6e
  Michael Yang authored Jun 03, 2024
05 Jun, 2024 1 commit
- server: skip blob verification for already verified blobs · de5beb06
  Blake Mizerany authored May 24, 2024
04 Jun, 2024 5 commits
- update create handler to use model.Name · d61ef8b9
  Michael Yang authored May 08, 2024
- gofmt, goimports · 6297f856
  Michael Yang authored Jun 04, 2024
- lint · e40145a3
  Michael Yang authored May 21, 2024
- nolintlint · 8ffb5174
  Michael Yang authored May 21, 2024
- replace x/exp/slices with slices · 04f3c12b
  Michael Yang authored May 21, 2024
24 May, 2024 1 commit
- Move envconfig and consolidate env vars (#4608) · 4cc3be30
  Patrick Devine authored May 24, 2024
20 May, 2024 4 commits
- fix quantize file types · 807d0927
  Michael Yang authored May 17, 2024
- tidy intermediate blobs · f36f1d6b
  Michael Yang authored May 20, 2024
- cache and reuse intermediate blobs · 3520c0e4
  Michael Yang authored May 10, 2024
```
particularly useful for zipfiles and f16s
```
- Move the parser back + handle utf16 files (#4533) · ccdf0b2a
  Patrick Devine authored May 20, 2024
14 May, 2024 1 commit
- remove DeleteModel · b8772a35
  Michael Yang authored May 08, 2024
09 May, 2024 1 commit
- prune partial downloads (#4272) · 302d7fdb
  Jeffrey Morgan authored May 09, 2024
08 May, 2024 2 commits
- routes: fix show llava models · b25976ae
  Michael Yang authored May 08, 2024
- skip if same quantization · eeb69526
  Michael Yang authored May 07, 2024
07 May, 2024 1 commit
- update list handler to use model.Name · 548a7df0
  Michael Yang authored Apr 17, 2024
06 May, 2024 5 commits
- only quantize language models · d2454603
  Michael Yang authored Apr 25, 2024
- no iterator · 4d0d0fa3
  Michael Yang authored Apr 25, 2024
- rebase · 7ffe4573
  Michael Yang authored Apr 24, 2024
- comments · 01811c17
  Michael Yang authored Apr 23, 2024
- quantize any fp16/fp32 model · 9685c345
  Michael Yang authored Apr 12, 2024
```
- FROM /path/to/{safetensors,pytorch}
- FROM /path/to/fp{16,32}.bin
- FROM model:fp{16,32}
```