- 31 Jul, 2024 1 commit
-
-
Michael Yang authored
-
- 30 Jul, 2024 2 commits
-
-
royjhan authored
* add prompt tokens to embed response
* rm slog
* metrics
* types
* prompt n
* clean up
* reset submodule
* update tests
* test name
* list metrics
-
Daniel Hiltgen authored
In multi-brand GPU setups, if we couldn't fully load the model we would fall through the scheduler and mistakenly try to load across a mix of brands. This makes sure we find the set of GPU(s) that best fits the partial load.
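A minimal sketch of the idea, not the scheduler's actual code: group devices by brand and pick one single-brand group for the partial load. The `gpu` type, the `bestPartialFit` helper, and the "most total free memory" heuristic are all assumptions made for illustration.

```go
package main

import "fmt"

// gpu is a hypothetical, simplified description of one device.
type gpu struct {
	Library    string // e.g. "cuda" or "rocm"; stands in for the GPU brand
	FreeMemory uint64 // free memory in bytes
}

// bestPartialFit groups GPUs by library and returns the single-library group
// with the most total free memory, so a partial load never spans brands.
// Ties are broken by library name to keep the result deterministic.
func bestPartialFit(gpus []gpu) []gpu {
	groups := map[string][]gpu{}
	totals := map[string]uint64{}
	for _, g := range gpus {
		groups[g.Library] = append(groups[g.Library], g)
		totals[g.Library] += g.FreeMemory
	}
	var best string
	var bestTotal uint64
	for lib, total := range totals {
		if best == "" || total > bestTotal || (total == bestTotal && lib < best) {
			best, bestTotal = lib, total
		}
	}
	return groups[best] // nil when no GPUs were passed in
}

func main() {
	picked := bestPartialFit([]gpu{
		{Library: "cuda", FreeMemory: 8 << 30},
		{Library: "rocm", FreeMemory: 16 << 30},
		{Library: "cuda", FreeMemory: 4 << 30},
	})
	for _, g := range picked {
		fmt.Println(g.Library, g.FreeMemory)
	}
}
```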
-
- 26 Jul, 2024 3 commits
-
-
Blake Mizerany authored
This fixes various data races scattered throughout the download/pull client where the client was accessing the download state concurrently. This commit is mostly a hot-fix and will be replaced by a new client one day soon. Also, remove the unnecessary opts argument from downloadChunk.
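One common way to remove this kind of race is to guard the shared state with a mutex. The sketch below assumes a hypothetical `downloadState` type and is not the client's real structure.

```go
package main

import (
	"fmt"
	"sync"
)

// downloadState is a hypothetical progress tracker shared by chunk workers.
type downloadState struct {
	mu        sync.Mutex
	completed int64 // bytes downloaded so far; guarded by mu
}

// add records n more downloaded bytes.
func (s *downloadState) add(n int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.completed += n
}

// snapshot returns the current byte count.
func (s *downloadState) snapshot() int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.completed
}

// downloadAll simulates chunk workers reporting progress concurrently.
func downloadAll(chunks []int64) int64 {
	var s downloadState
	var wg sync.WaitGroup
	for _, size := range chunks {
		wg.Add(1)
		go func(n int64) {
			defer wg.Done()
			s.add(n) // every access to the shared state goes through the mutex
		}(size)
	}
	wg.Wait()
	return s.snapshot()
}

func main() {
	fmt.Println(downloadAll([]int64{10, 20, 30}))
}
```

Running a sketch like this with `go run -race` is a quick way to confirm that no unsynchronized access remains.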
-
Michael Yang authored
-
Michael Yang authored
-
- 25 Jul, 2024 1 commit
-
-
Blake Mizerany authored
This changes the registry client to reuse the original download URL it gets on the first redirect response for all subsequent requests, preventing thundering herd issues when hot new LLMs are released.
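The caching idea can be sketched as follows. `redirectCache` and its `resolve` field are hypothetical names, and the real client may store the URL differently. Note that this sketch also caches a failed lookup, which a production client would probably want to retry.

```go
package main

import (
	"fmt"
	"sync"
)

// redirectCache resolves the final download URL once and reuses it.
type redirectCache struct {
	once    sync.Once
	url     string
	err     error
	resolve func() (string, error) // hypothetical: follows the registry redirect
}

// URL returns the cached download URL, resolving it on the first call only.
func (c *redirectCache) URL() (string, error) {
	c.once.Do(func() { c.url, c.err = c.resolve() })
	return c.url, c.err
}

func main() {
	calls := 0
	cache := &redirectCache{resolve: func() (string, error) {
		calls++ // counts how often the registry would be contacted
		return "https://cdn.example/blob", nil
	}}
	for i := 0; i < 3; i++ {
		u, _ := cache.URL()
		fmt.Println(u)
	}
	fmt.Println("registry lookups:", calls)
}
```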
-
- 22 Jul, 2024 10 commits
-
-
Josh authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Jeffrey Morgan authored
-
- 21 Jul, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 20 Jul, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 19 Jul, 2024 1 commit
-
-
Josh authored
add template validation to modelfile
-
- 18 Jul, 2024 3 commits
-
-
Michael Yang authored
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
* server: only parse tool calls if tools are provided
* still set `resp.Message.Content`
-
- 17 Jul, 2024 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
- 16 Jul, 2024 9 commits
-
-
Michael Yang authored
-
Michael Yang authored
this change is triggered by the presence of "suffix", particularly useful for code completion tasks
-
Michael Yang authored
-
royjhan authored
* OpenAI v1 models
* Empty List Testing
* Add back envconfig
* v1/models docs
* Remove Docs
* OpenAI batch embed compatibility
* merge conflicts
* integrate with api/embed
* ep
* merge conflicts
* request tests
* rm resp test
* merge conflict
* merge conflict
* test fixes
* test fn renaming
* input validation for empty string

---------

Co-authored-by: jmorganca <jmorganca@gmail.com>
-
Michael Yang authored
-
Jeffrey Morgan authored
-
Michael Yang authored
-
Jeffrey Morgan authored
* server: return empty slice on empty `/api/embed` request
* fix tests
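The empty-slice change matters because Go's `encoding/json` encodes a nil slice as `null` and an empty slice as `[]`. The sketch below shows the difference; `embedResponse` is a hypothetical, trimmed-down stand-in for the server's real response type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embedResponse is a hypothetical, trimmed-down response type.
type embedResponse struct {
	Embeddings [][]float32 `json:"embeddings"`
}

// encode marshals a response holding the given embeddings.
func encode(embeddings [][]float32) string {
	b, err := json.Marshal(embedResponse{Embeddings: embeddings})
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(encode(nil))           // {"embeddings":null}
	fmt.Println(encode([][]float32{})) // {"embeddings":[]}
}
```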
-
Michael Yang authored
-
- 15 Jul, 2024 2 commits
-
-
Michael Yang authored
-
royjhan authored
* Initial Batch Embedding
* Revert "Initial Batch Embedding"
  This reverts commit c22d54895a280b54c727279d85a5fc94defb5a29.
* Initial Draft
* mock up notes
* api/embed draft
* add server function
* check normalization
* clean up
* normalization
* playing around with truncate stuff
* Truncation
* Truncation
* move normalization to go
* Integration Test Template
* Truncation Integration Tests
* Clean up
* use float32
* move normalize
* move normalize test
* refactoring
* integration float32
* input handling and handler testing
* Refactoring of legacy and new
* clear comments
* merge conflicts
* touches
* embedding type 64
* merge conflicts
* fix hanging on single string
* refactoring
* test values
* set context length
* clean up
* testing clean up
* testing clean up
* remove function closure
* Revert "remove function closure"
  This reverts commit 55d48c6ed17abe42e7a122e69d603ef0c1506787.
* remove function closure
* remove redundant error check
* clean up
* more clean up
* clean up
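The log mentions moving normalization to Go and using `float32`. A minimal sketch of L2 normalization under those assumptions follows; the `normalize` name and the zero-vector handling are illustrative and not taken from the repository.

```go
package main

import (
	"fmt"
	"math"
)

// normalize returns a copy of v scaled to unit L2 length.
// A zero vector is returned as all zeros to avoid dividing by zero.
func normalize(v []float32) []float32 {
	out := make([]float32, len(v))
	var sum float64
	for _, x := range v {
		sum += float64(x) * float64(x) // accumulate in float64 for accuracy
	}
	if sum == 0 {
		return out
	}
	norm := math.Sqrt(sum)
	for i, x := range v {
		out[i] = float32(float64(x) / norm)
	}
	return out
}

func main() {
	fmt.Println(normalize([]float32{3, 4}))
}
```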
-
- 14 Jul, 2024 1 commit
-
-
Patrick Devine authored
-
- 13 Jul, 2024 3 commits
-
-
jmorganca authored
-
Jeffrey Morgan authored
* server: fix `context`, `load_duration` and `total_duration` fields
* Update server/routes.go
-
Michael Yang authored
* fix system prompt
* execute template when hitting previous roles
* fix tests

---------

Co-authored-by: jmorganca <jmorganca@gmail.com>
-