- 01 Jan, 2025 1 commit
-
-
Patrick Devine authored
Replaces `POST /api/create` to use JSON instead of a Modelfile. This is a breaking change.
-
- 11 Dec, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 05 Dec, 2024 2 commits
-
-
Parth Sareen authored
-
Parth Sareen authored
Adds structured outputs to chat endpoint
---------
Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Hieu Nguyen <hieunguyen1053@outlook.com>
-
- 30 Nov, 2024 1 commit
-
-
Parth Sareen authored
-
- 12 Nov, 2024 1 commit
-
-
Evan authored
-
- 06 Nov, 2024 1 commit
-
-
Jesse Gross authored
Now that server.cpp is gone, we don't need to keep passing arguments that were only ignored and only kept for compatibility.
-
- 28 Aug, 2024 1 commit
-
-
Michael Yang authored
-
- 06 Aug, 2024 1 commit
-
-
Chua Chee Seng authored
-
- 05 Aug, 2024 1 commit
-
-
Daniel Hiltgen authored
If the system has multiple NUMA nodes, enable NUMA support in llama.cpp. If we detect numactl in the path, use that; otherwise use the basic "distribute" mode.
-
- 30 Jul, 2024 1 commit
-
-
royjhan authored
* add prompt tokens to embed response
* rm slog
* metrics
* types
* prompt n
* clean up
* reset submodule
* update tests
* test name
* list metrics
-
- 29 Jul, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 27 Jul, 2024 1 commit
-
-
Tibor Schmidt authored
-
- 18 Jul, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 17 Jul, 2024 1 commit
-
-
Michael Yang authored
-
- 16 Jul, 2024 4 commits
-
-
Michael Yang authored
-
Michael Yang authored
this change is triggered by the presence of "suffix", particularly useful for code completion tasks
-
Michael Yang authored
-
Jeffrey Morgan authored
* server: return empty slice on empty `/api/embed` request
* fix tests
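Why an empty slice rather than a nil one matters for a JSON API can be sketched in Go as follows. This is a minimal illustration, not the server's actual code: the `embedResponse` type and its `embeddings` field name are hypothetical stand-ins.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embedResponse is a hypothetical response type used only for this sketch.
type embedResponse struct {
	Embeddings [][]float32 `json:"embeddings"`
}

// encode marshals a response holding the given embeddings to a JSON string.
func encode(embeddings [][]float32) string {
	b, err := json.Marshal(embedResponse{Embeddings: embeddings})
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// A nil slice is encoded as JSON null.
	fmt.Println(encode(nil)) // {"embeddings":null}

	// An empty, non-nil slice is encoded as an empty JSON array.
	fmt.Println(encode([][]float32{})) // {"embeddings":[]}
}
```

Clients that iterate over the result can handle `[]` without a special case, whereas `null` usually needs an explicit check.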
-
- 15 Jul, 2024 3 commits
-
-
Michael Yang authored
-
Jeffrey Morgan authored
-
royjhan authored
* Initial Batch Embedding
* Revert "Initial Batch Embedding" This reverts commit c22d54895a280b54c727279d85a5fc94defb5a29.
* Initial Draft
* mock up notes
* api/embed draft
* add server function
* check normalization
* clean up
* normalization
* playing around with truncate stuff
* Truncation
* Truncation
* move normalization to go
* Integration Test Template
* Truncation Integration Tests
* Clean up
* use float32
* move normalize
* move normalize test
* refactoring
* integration float32
* input handling and handler testing
* Refactoring of legacy and new
* clear comments
* merge conflicts
* touches
* embedding type 64
* merge conflicts
* fix hanging on single string
* refactoring
* test values
* set context length
* clean up
* testing clean up
* testing clean up
* remove function closure
* Revert "remove function closure" This reverts commit 55d48c6ed17abe42e7a122e69d603ef0c1506787.
* remove function closure
* remove redundant error check
* clean up
* more clean up
* clean up
-
- 14 Jul, 2024 1 commit
-
-
Patrick Devine authored
-
- 02 Jul, 2024 1 commit
-
-
royjhan authored
* OpenAI v1 models
* Refactor Writers
* Add Test
Co-Authored-By: Attila Kerekes
* Credit Co-Author
Co-Authored-By: Attila Kerekes <439392+keriati@users.noreply.github.com>
* Empty List Testing
* Use Namespace for Ownedby
* Update Test
* Add back envconfig
* v1/models docs
* Use ModelName Parser
* Test Names
* Remove Docs
* Clean Up
* Test name
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Add Middleware for Chat and List
* Testing Cleanup
* Test with Fatal
* Add functionality to chat test
* OpenAI: /v1/models/{model} compatibility (#5028)
* Retrieve Model
* OpenAI Delete Model
* Retrieve Middleware
* Remove Delete from Branch
* Update Test
* Middleware Test File
* Function name
* Cleanup
* Test Update
* Test Update
---------
Co-authored-by: Attila Kerekes <439392+keriati@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorgan...
-
- 01 Jul, 2024 1 commit
-
-
Daniel Hiltgen authored
This uses nil as undefined for a cleaner implementation.
-
- 21 Jun, 2024 1 commit
-
-
Daniel Hiltgen authored
Add the new tristate parsing logic for the code path for modelfiles, as well as a unit test.
-
- 19 Jun, 2024 1 commit
-
-
royjhan authored
* API Show Extended
* Initial Draft of Information
Co-Authored-By: Patrick Devine <pdevine@sonic.net>
* Clean Up
* Descriptive arg error messages and other fixes
* Second Draft of Show with Projectors Included
* Remove Chat Template
* Touches
* Prevent wrapping from files
* Verbose functionality
* Docs
* Address Feedback
* Lint
* Resolve Conflicts
* Function Name
* Tests for api/show model info
* Show Test File
* Add Projector Test
* Clean routes
* Projector Check
* Move Show Test
* Touches
* Doc update
---------
Co-authored-by: Patrick Devine <pdevine@sonic.net>
-
- 17 Jun, 2024 1 commit
-
-
Daniel Hiltgen authored
On Windows, recent llama.cpp changes make mmap slower in most cases, so default to off. This also implements a tri-state for use_mmap so we can distinguish a user-provided value of true or false from an unspecified one.
-
- 16 Jun, 2024 1 commit
-
-
royjhan authored
* Add Mod Time to Show
* Error Handling
-
- 12 Jun, 2024 1 commit
-
-
Patrick Devine authored
-
- 06 Jun, 2024 1 commit
-
-
royjhan authored
* Remove false time fields
* Struct Separation for List and Process
* Remove Marshaler
-
- 04 Jun, 2024 1 commit
-
-
Michael Yang authored
-
- 14 May, 2024 1 commit
-
-
Patrick Devine authored
-
- 10 May, 2024 1 commit
-
-
Jeffrey Morgan authored
* rename `--quantization` to `--quantize`
* backwards
* Update api/types.go
Co-authored-by: Michael Yang <mxyng@pm.me>
---------
Co-authored-by: Michael Yang <mxyng@pm.me>
-
- 09 May, 2024 3 commits
-
-
Bruce MacDonald authored
-
Bruce MacDonald authored
-
Jeffrey Morgan authored
-
- 07 May, 2024 1 commit
-
-
Eli Bendersky authored
* api: fill up API documentation
Followup for #2878
Now that the documentation is more complete, mention it in the README.
Updates #2840
* fix typo/lint
* Update README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
-
- 06 May, 2024 1 commit
-
-
Jackie Li authored
---------
Co-authored-by: Patrick Devine <patrick@infrahq.com>
-
- 29 Apr, 2024 1 commit
-
-
Patrick Devine authored
-