Commits · b3e5491e41811294de9d81649a96581af6522d08 · OpenDAS / ollama

22 Jul, 2024 1 commit
- server: collect nested tool call objects when parsing (#5824) · b3e5491e
  Jeffrey Morgan authored Jul 22, 2024
  
18 Jul, 2024 1 commit
- fix parsing tool calls · 43606d6d
  Michael Yang authored Jul 18, 2024
  
17 Jul, 2024 2 commits
- marshal json automatically for some template values (#5758) · b2554455
  Michael Yang authored Jul 17, 2024
  
- parse tool call as individual objects · 5fd69881
  Michael Yang authored Jul 17, 2024
  
16 Jul, 2024 2 commits
- remove unneeded tool calls · 5a83f79a
  Michael Yang authored Jul 16, 2024
  
- fix unmarshal type errors · 5afbb60f
  Michael Yang authored Jul 16, 2024
  
15 Jul, 2024 1 commit
- tools · d02bbebb
  Michael Yang authored Jun 20, 2024
  
01 Jul, 2024 2 commits
- err on insecure path · 88bcd79b
  Michael Yang authored Jun 30, 2024
  
- rename templates to template · 58e3fff3
  Michael Yang authored Jun 10, 2024
  
27 Jun, 2024 1 commit
- zip: prevent extracting files into parent dirs (#5314) · 123a722a
  Michael Yang authored Jun 26, 2024
  
25 Jun, 2024 1 commit

- llm: speed up gguf decoding by a lot (#5246) · cb42e607
  Blake Mizerany authored Jun 24, 2024

Previously, two costly behaviors made loading GGUF files, along with
their metadata and tensor information, very slow:

  * Too many allocations when decoding strings
  * Hitting disk for each read of each key and value, resulting in an
    excessive number of syscalls and disk I/O operations.

The show API now takes 33 ms, down from 800+ ms, for llama3 on a
MacBook Pro M3.

This commit also makes it possible to skip collecting large arrays of
values when decoding GGUFs. When such keys are encountered, their
values are null and are encoded as null in JSON.

Also, this fixes a broken test that was not encoding valid GGUF.


12 Jun, 2024 1 commit

- fix: multiple templates when creating from model · c16f8af9
  Michael Yang authored Jun 12, 2024

Multiple templates may appear in a model if it is created from
another model that 1) has an autodetected template and 2) defines a
custom template.


04 Jun, 2024 2 commits
- update create handler to use model.Name · d61ef8b9
  Michael Yang authored May 08, 2024
  
- lint · e40145a3
  Michael Yang authored May 21, 2024
  
20 May, 2024 2 commits
- tidy intermediate blobs · f36f1d6b
  Michael Yang authored May 20, 2024
  
- cache and reuse intermediate blobs · 3520c0e4
  Michael Yang authored May 10, 2024
```
particularly useful for zipfiles and f16s
```
06 May, 2024 5 commits
- close zip files · b2f00aa9
  Michael Yang authored May 06, 2024
  
- s/DisplayLongest/String/ · f5e8b207
  Michael Yang authored May 01, 2024
  
- no iterator · 4d0d0fa3
  Michael Yang authored Apr 25, 2024
  
- comments · 01811c17
  Michael Yang authored Apr 23, 2024
  
- quantize any fp16/fp32 model · 9685c345
  Michael Yang authored Apr 12, 2024
```
- FROM /path/to/{safetensors,pytorch}
- FROM /path/to/fp{16,32}.bin
- FROM model:fp{16,32}
```