- 15 Jul, 2024 2 commits
-
-
Jeffrey Morgan authored
-
royjhan authored
* Initial Batch Embedding
* Revert "Initial Batch Embedding"
  This reverts commit c22d54895a280b54c727279d85a5fc94defb5a29.
* Initial Draft
* mock up notes
* api/embed draft
* add server function
* check normalization
* clean up
* normalization
* playing around with truncate stuff
* Truncation
* Truncation
* move normalization to go
* Integration Test Template
* Truncation Integration Tests
* Clean up
* use float32
* move normalize
* move normalize test
* refactoring
* integration float32
* input handling and handler testing
* Refactoring of legacy and new
* clear comments
* merge conflicts
* touches
* embedding type 64
* merge conflicts
* fix hanging on single string
* refactoring
* test values
* set context length
* clean up
* testing clean up
* testing clean up
* remove function closure
* Revert "remove function closure"
  This reverts commit 55d48c6ed17abe42e7a122e69d603ef0c1506787.
* remove function closure
* remove redundant error check
* clean up
* more clean up
* clean up
-
- 14 Jul, 2024 2 commits
-
-
royjhan authored
* OpenAI v1 models
* Refactor Writers
* Add Test
  Co-Authored-By: Attila Kerekes
* Credit Co-Author
  Co-Authored-By: Attila Kerekes <439392+keriati@users.noreply.github.com>
* Empty List Testing
* Use Namespace for Ownedby
* Update Test
* Add back envconfig
* v1/models docs
* Use ModelName Parser
* Test Names
* Remove Docs
* Clean Up
* Test name
  Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Add Middleware for Chat and List
* Testing Cleanup
* Test with Fatal
* Add functionality to chat test
* Support image input for OpenAI chat
* Decoding
* Fix message processing logic
* openai vision test
* type errors
* clean up
* redundant check
* merge conflicts
* merge conflicts
* merge conflicts
* flattening and smaller image
* add test
* support python and js SDKs and mandate prefixing
* clean up
---------
Co-authored-by: Attila Kerekes <439392+keriati@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
-
Patrick Devine authored
-
- 13 Jul, 2024 5 commits
-
-
jmorganca authored
-
Jeffrey Morgan authored
* server: fix `context`, `load_duration` and `total_duration` fields * Update server/routes.go
-
Jeffrey Morgan authored
-
Jarek authored
-
Michael Yang authored
* fix system prompt * execute template when hitting previous roles * fix tests --------- Co-authored-by: jmorganca <jmorganca@gmail.com>
-
- 12 Jul, 2024 9 commits
-
-
Patrick Devine authored
This reverts commit 9ac0a7a5.
-
Patrick Devine authored
-
Michael Yang authored
template: preprocess message and collect system
-
Jeffrey Morgan authored
-
Michael Yang authored
-
Jeffrey Morgan authored
* app: always clean up install dir; force close applications * remove wildcard * revert `CloseApplications` * whitespace * update `LOCALAPPDATA` var
-
Michael Yang authored
-
Josh authored
-
Michael Yang authored
-
- 11 Jul, 2024 8 commits
-
-
Jeffrey Morgan authored
* llm: avoid loading model if system memory is too small
* update log
* Instrument swap free space
  On Linux and Windows, expose how much swap space is available so we can take that into consideration when scheduling models
* use `systemSwapFreeMemory` in check
---------
Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
-
Michael Yang authored
This reverts commit 19753c18 for compatibility. Messages will be added at a later date.
-
Michael Yang authored
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
Michael Yang authored
update embedded templates
-
Michael Yang authored
-
Michael Yang authored
-
- 10 Jul, 2024 10 commits
-
-
Michael Yang authored
chatglm graph
-
Michael Yang authored
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
Wire up windows AMD driver reporting
-
Daniel Hiltgen authored
Bundle missing CRT libraries
-
Daniel Hiltgen authored
Detect CUDA OS overhead
-
Daniel Hiltgen authored
Bump ROCm on windows to 6.1.2
-
Daniel Hiltgen authored
Remove duplicate merge glitch
-
Daniel Hiltgen authored
This also adjusts our algorithm to favor our bundled ROCm. I've confirmed VRAM reporting still doesn't work properly so we can't yet enable concurrency by default.
-
Daniel Hiltgen authored
-
- 09 Jul, 2024 4 commits
-
-
Daniel Hiltgen authored
Workaround broken ROCm p2p copy
-
royjhan authored
* stop token parsing fix * add stop test
-
royjhan authored
-
Daniel Hiltgen authored
This adds logic to detect skew between the driver and the management library, which can be attributed to OS overhead, and records it so that we can adjust subsequent management library free VRAM updates and avoid OOM scenarios.
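As a rough illustration of the idea, the Go sketch below records the difference between two free-VRAM readings once and subtracts it from later readings. The type `gpuState` and its methods are hypothetical names for this example, not the repository's code. It also assumes the management library is the side that reports more free VRAM than the driver; the commit message does not state the direction of the skew.

```go
package main

import "fmt"

// gpuState is a hypothetical per-GPU record used only in this sketch.
type gpuState struct {
	osOverhead uint64 // bytes of skew attributed to OS overhead
}

// recordOverhead compares the driver's free-VRAM reading with the
// management library's reading and stores the difference.
// Assumption: the management library over-reports free VRAM.
func (g *gpuState) recordOverhead(driverFree, mgmtFree uint64) {
	if mgmtFree > driverFree {
		g.osOverhead = mgmtFree - driverFree
		return
	}
	g.osOverhead = 0
}

// adjustedFree applies the recorded overhead to a later management
// library reading, clamping at zero to avoid unsigned underflow.
func (g *gpuState) adjustedFree(mgmtFree uint64) uint64 {
	if mgmtFree < g.osOverhead {
		return 0
	}
	return mgmtFree - g.osOverhead
}

func main() {
	const MiB = 1 << 20
	g := &gpuState{}

	// At discovery time both sources are available, so the skew can be measured.
	g.recordOverhead(7000*MiB, 7500*MiB)

	// Later refreshes come only from the management library and are corrected.
	fmt.Println(g.adjustedFree(7200*MiB) / MiB)
}
```

Subtracting the recorded overhead makes later estimates more conservative, which is what guards against scheduling a model into VRAM that is not actually free.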
-