- 16 Jul, 2024 1 commit
-
-
Michael Yang authored
this change is triggered by the presence of "suffix", which is particularly useful for code completion tasks
-
- 15 Jul, 2024 1 commit
-
-
Michael Yang authored
-
- 05 Jul, 2024 1 commit
-
-
Michael Yang authored
-
- 01 Jul, 2024 4 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
- 25 Jun, 2024 1 commit
-
-
Blake Mizerany authored
Previously, some costly things were causing the loading of GGUF files and their metadata and tensor information to be VERY slow:

* Too many allocations when decoding strings
* Hitting disk for each read of each key and value, resulting in an excessive number of syscalls and disk I/O

The show API is now down to 33ms from 800ms+ for llama3 on a MacBook Pro M3.

This commit also prevents collecting large arrays of values when decoding GGUFs (if desired). When such keys are encountered, their values are null, and are encoded as such in JSON.

Also, this fixes a broken test that was not encoding valid GGUF.
-
- 21 Jun, 2024 1 commit
-
-
Michael Yang authored
-
- 13 Jun, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 12 Jun, 2024 1 commit
-
-
Michael Yang authored
multiple templates may appear in a model if a model is created from another model that 1) has an autodetected template and 2) defines a custom template
-
- 07 Jun, 2024 1 commit
-
-
Michael Yang authored
-
- 06 Jun, 2024 1 commit
-
-
Michael Yang authored
-
- 05 Jun, 2024 1 commit
-
-
Blake Mizerany authored
-
- 04 Jun, 2024 5 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
- 24 May, 2024 1 commit
-
-
Patrick Devine authored
-
- 20 May, 2024 4 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
particularly useful for zipfiles and f16s
-
Patrick Devine authored
-
- 14 May, 2024 1 commit
-
-
Michael Yang authored
-
- 09 May, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 08 May, 2024 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
- 07 May, 2024 1 commit
-
-
Michael Yang authored
-
- 06 May, 2024 5 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
- FROM /path/to/{safetensors,pytorch}
- FROM /path/to/fp{16,32}.bin
- FROM model:fp{16,32}
-
- 05 May, 2024 1 commit
-
-
Daniel Hiltgen authored
This moves all the env var reading into one central module and logs the loaded config once at startup, which should help in troubleshooting user server logs.
-
- 01 May, 2024 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
- 30 Apr, 2024 1 commit
-
-
Bruce MacDonald authored
- return descriptive error messages when unauthorized to create blob or push a model
- display the local public key associated with the request that was denied
-
- 29 Apr, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 26 Apr, 2024 1 commit
-
-
Blake Mizerany authored
-