- 22 Jul, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 19 Jul, 2024 1 commit
-
-
Josh authored
add template validation to modelfile
-
- 18 Jul, 2024 2 commits
-
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
* server: only parse tool calls if tools are provided
* still set `resp.Message.Content`
-
- 16 Jul, 2024 4 commits
-
-
Michael Yang authored
-
Michael Yang authored
this change is triggered by the presence of "suffix", particularly useful for code completion tasks
-
royjhan authored
* OpenAI v1 models
* Empty List Testing
* Add back envconfig
* v1/models docs
* Remove Docs
* OpenAI batch embed compatibility
* merge conflicts
* integrate with api/embed
* ep
* merge conflicts
* request tests
* rm resp test
* merge conflict
* merge conflict
* test fixes
* test fn renaming
* input validation for empty string
---------
Co-authored-by: jmorganca <jmorganca@gmail.com>
-
Jeffrey Morgan authored
-
- 15 Jul, 2024 2 commits
-
-
Michael Yang authored
-
royjhan authored
* Initial Batch Embedding
* Revert "Initial Batch Embedding"
  This reverts commit c22d54895a280b54c727279d85a5fc94defb5a29.
* Initial Draft
* mock up notes
* api/embed draft
* add server function
* check normalization
* clean up
* normalization
* playing around with truncate stuff
* Truncation
* Truncation
* move normalization to go
* Integration Test Template
* Truncation Integration Tests
* Clean up
* use float32
* move normalize
* move normalize test
* refactoring
* integration float32
* input handling and handler testing
* Refactoring of legacy and new
* clear comments
* merge conflicts
* touches
* embedding type 64
* merge conflicts
* fix hanging on single string
* refactoring
* test values
* set context length
* clean up
* testing clean up
* testing clean up
* remove function closure
* Revert "remove function closure"
  This reverts commit 55d48c6ed17abe42e7a122e69d603ef0c1506787.
* remove function closure
* remove redundant error check
* clean up
* more clean up
* clean up
-
- 14 Jul, 2024 1 commit
-
-
Patrick Devine authored
-
- 13 Jul, 2024 2 commits
-
-
jmorganca authored
-
Jeffrey Morgan authored
* server: fix `context`, `load_duration` and `total_duration` fields
* Update server/routes.go
-
- 05 Jul, 2024 3 commits
-
-
Michael Yang authored
ensure runtime model changes (template, system prompt, messages, options) are captured on model updates without needing to reload the server
-
Michael Yang authored
-
Michael Yang authored
-
- 03 Jul, 2024 1 commit
-
-
Daniel Hiltgen authored
This change fixes the handling of keep_alive so that, if a client request omits the setting, we only set it on initial load. Once the model is loaded, new requests that leave keep_alive unset keep whatever keep_alive value is already in effect.
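The rule described in this commit message can be sketched as a small decision function. This is an illustrative Python sketch, not the repository's code: `LoadedModel`, `resolve_keep_alive`, and the `DEFAULT_KEEP_ALIVE` value are hypothetical names and numbers chosen for the example.

```python
from typing import Optional

# Assumed default for this sketch only (seconds).
DEFAULT_KEEP_ALIVE = 300.0


class LoadedModel:
    """Hypothetical record of a model that is already resident in memory."""

    def __init__(self, keep_alive: float) -> None:
        self.keep_alive = keep_alive


def resolve_keep_alive(loaded: Optional[LoadedModel],
                       requested: Optional[float]) -> float:
    """Pick the keep_alive to apply for one request."""
    if requested is not None:
        # An explicit client value always wins.
        return requested
    if loaded is None:
        # Initial load with nothing specified: apply the default once.
        return DEFAULT_KEEP_ALIVE
    # Model already loaded and the request is silent: keep the current value.
    return loaded.keep_alive
```

With this shape, a request that omits keep_alive can no longer reset a value that an earlier request set explicitly.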
-
- 02 Jul, 2024 3 commits
-
-
Michael Yang authored
-
royjhan authored
* OpenAI v1 models
* Refactor Writers
* Add Test
  Co-Authored-By: Attila Kerekes
* Credit Co-Author
  Co-Authored-By: Attila Kerekes <439392+keriati@users.noreply.github.com>
* Empty List Testing
* Use Namespace for Ownedby
* Update Test
* Add back envconfig
* v1/models docs
* Use ModelName Parser
* Test Names
* Remove Docs
* Clean Up
* Test name
  Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Add Middleware for Chat and List
* Completions Endpoint
* Testing Cleanup
* Test with Fatal
* Add functionality to chat test
* Rename function
* float types
* type cleanup
* cleaning
* more cleaning
* Extra test cases
* merge conflicts
* merge conflicts
* merge conflicts
* merge conflicts
* cleaning
* cleaning
---------
Co-authored-by: Attila Kerekes <439392+keriati@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
-
royjhan authored
* OpenAI v1 models
* Refactor Writers
* Add Test
  Co-Authored-By: Attila Kerekes
* Credit Co-Author
  Co-Authored-By: Attila Kerekes <439392+keriati@users.noreply.github.com>
* Empty List Testing
* Use Namespace for Ownedby
* Update Test
* Add back envconfig
* v1/models docs
* Use ModelName Parser
* Test Names
* Remove Docs
* Clean Up
* Test name
  Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Add Middleware for Chat and List
* Testing Cleanup
* Test with Fatal
* Add functionality to chat test
* OpenAI: /v1/models/{model} compatibility (#5028)
* Retrieve Model
* OpenAI Delete Model
* Retrieve Middleware
* Remove Delete from Branch
* Update Test
* Middleware Test File
* Function name
* Cleanup
* Test Update
* Test Update
---------
Co-authored-by: Attila Kerekes <439392+keriati@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
-
- 01 Jul, 2024 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
- 25 Jun, 2024 1 commit
-
-
Blake Mizerany authored
Previously, some costly operations made loading GGUF files, along with their metadata and tensor information, VERY slow:
* Too many allocations when decoding strings
* Hitting disk for each read of each key and value, resulting in an excessive number of syscalls and disk I/O

The show API is now down to 33ms from 800ms+ for llama3 on a MacBook Pro M3.

This commit also prevents collecting large arrays of values when decoding GGUFs (if desired). When such keys are encountered, their values are null, and are encoded as such in JSON.

Also, this fixes a broken test that was not encoding valid GGUF.
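The two ideas in this message, reading through a buffer instead of hitting disk per value, and skipping arrays above a size limit so they surface as null, can be illustrated roughly as follows. This is a simplified Python sketch, not the repository's decoder: the helper names and `max_len` are hypothetical, and the array layout (a 64-bit count followed by 32-bit values) is a simplification assumed for the example rather than the full GGUF format.

```python
import io
import json
import struct
from typing import BinaryIO, List, Optional


def read_string(r: BinaryIO) -> str:
    """Read a little-endian uint64 length followed by that many UTF-8 bytes."""
    (n,) = struct.unpack("<Q", r.read(8))
    return r.read(n).decode("utf-8")


def read_u32_array(r: BinaryIO, max_len: int) -> Optional[List[int]]:
    """Read a count-prefixed uint32 array, or skip it when it is too large."""
    (count,) = struct.unpack("<Q", r.read(8))
    if count > max_len:
        # Seek past the payload instead of collecting it; the caller sees None.
        r.seek(4 * count, io.SEEK_CUR)
        return None
    return list(struct.unpack(f"<{count}I", r.read(4 * count)))


def demo() -> str:
    payload = (
        struct.pack("<Q", 5) + b"llama"
        + struct.pack("<Q", 3) + struct.pack("<3I", 1, 2, 3)
        + struct.pack("<Q", 2) + b"ok"
    )
    # One buffered reader serves many small reads from a single large fetch.
    r = io.BufferedReader(io.BytesIO(payload), buffer_size=64 * 1024)
    name = read_string(r)
    tokens = read_u32_array(r, max_len=2)  # 3 values exceed the limit
    tail = read_string(r)
    return json.dumps({"name": name, "tokens": tokens, "tail": tail})
```

Because the skipped array is seeked over rather than read, the decoder stays aligned and the following string is still decoded correctly.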
-
- 21 Jun, 2024 1 commit
-
-
Daniel Hiltgen authored
Provide consistent ordering for the ps command - longest duration listed first
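One way to get a stable "longest first" listing is to sort on the duration in descending order and break ties on a second key. The Python sketch below is illustrative only: `RunningModel`, `seconds_remaining`, and `order_for_ps` are hypothetical names, and it assumes "duration" means the time left before a model is unloaded.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunningModel:
    """Hypothetical row in a ps-style listing."""
    name: str
    seconds_remaining: float


def order_for_ps(models: List[RunningModel]) -> List[RunningModel]:
    """Longest remaining duration first; ties broken by name for stability."""
    return sorted(models, key=lambda m: (-m.seconds_remaining, m.name))
```

The name tie-break is a design choice of this sketch: without a secondary key, two models with equal durations could swap places between runs depending on input order.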
-
- 19 Jun, 2024 1 commit
-
-
royjhan authored
* API Show Extended
* Initial Draft of Information
  Co-Authored-By: Patrick Devine <pdevine@sonic.net>
* Clean Up
* Descriptive arg error messages and other fixes
* Second Draft of Show with Projectors Included
* Remove Chat Template
* Touches
* Prevent wrapping from files
* Verbose functionality
* Docs
* Address Feedback
* Lint
* Resolve Conflicts
* Function Name
* Tests for api/show model info
* Show Test File
* Add Projector Test
* Clean routes
* Projector Check
* Move Show Test
* Touches
* Doc update
---------
Co-authored-by: Patrick Devine <pdevine@sonic.net>
-
- 16 Jun, 2024 1 commit
-
-
royjhan authored
* Add Mod Time to Show
* Error Handling
-
- 06 Jun, 2024 2 commits
- 04 Jun, 2024 3 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
- 24 May, 2024 1 commit
-
-
Patrick Devine authored
-
- 20 May, 2024 3 commits
-
-
Michael Yang authored
-
Michael Yang authored
particularly useful for zipfiles and f16s
-
Patrick Devine authored
-
- 16 May, 2024 1 commit
-
-
Daniel Hiltgen authored
-
- 15 May, 2024 1 commit
-
-
Patrick Devine authored
-
- 14 May, 2024 3 commits
-
-
Patrick Devine authored
-
Michael Yang authored
-
Michael Yang authored
-