- 30 Oct, 2024 1 commit
-
-
Jesse Gross authored
-Update mllama to take the cross attention state as embeddings in a batch, more similar to how Llava handles it. This improves integration with the input cache. -Pass locations in a prompt for embeddings using tags similar to Llava. -Abstract interface to vision models so the main runner accesses Clip and Mllama similarly Co-authored-by:Michael Yang <mxyng@pm.me>
-
- 28 Oct, 2024 1 commit
-
-
Patrick Devine authored
-
- 18 Oct, 2024 1 commit
-
-
Patrick Devine authored
Co-authored-by:
jmorganca <jmorganca@gmail.com> Co-authored-by:
Michael Yang <mxyng@pm.me> Co-authored-by:
Jesse Gross <jesse@ollama.com>
-
- 17 Oct, 2024 1 commit
-
-
Daniel Hiltgen authored
Cleaning up go package naming
-
- 01 Oct, 2024 1 commit
-
-
Alex Mavrogiannis authored
-
- 12 Sep, 2024 1 commit
-
-
Daniel Hiltgen authored
* Optimize container images for startup This change adjusts how to handle runner payloads to support container builds where we keep them extracted in the filesystem. This makes it easier to optimize the cpu/cuda vs cpu/rocm images for size, and should result in faster startup times for container images. * Refactor payload logic and add buildx support for faster builds * Move payloads around * Review comments * Converge to buildx based helper scripts * Use docker buildx action for release
-
- 11 Sep, 2024 1 commit
-
-
Patrick Devine authored
-
- 27 Aug, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 13 Aug, 2024 1 commit
-
-
royjhan authored
* load on empty input * no load on invalid input
-
- 11 Aug, 2024 1 commit
-
-
Jeffrey Morgan authored
For simplicity, perform parallelization of embedding requests in the API handler instead of offloading this to the subprocess runner. This keeps the scheduling story simpler as it builds on existing parallel requests, similar to existing text completion functionality.
-
- 07 Aug, 2024 1 commit
-
-
Jesse Gross authored
Currently if the config field is missing in the manifest file (or corrupted), Ollama will crash when it tries to read it. This can happen at startup or when pulling new models. This data is mostly just used for showing model information so we can be tolerant of it not being present - it is not required to run the models. Besides avoiding crashing, this also gives us the ability to restructure the config in the future by pulling it into the main manifest file.
-
- 02 Aug, 2024 1 commit
-
-
Michael Yang authored
-
- 01 Aug, 2024 5 commits
-
-
Vyacheslav Moskalev authored
-
Vyacheslav Moskalev authored
-
Vyacheslav Moskalev authored
-
Vyacheslav Moskalev authored
-
Vyacheslav Moskalev authored
-
- 30 Jul, 2024 1 commit
-
-
royjhan authored
* add prompt tokens to embed response * rm slog * metrics * types * prompt n * clean up * reset submodule * update tests * test name * list metrics
-
- 26 Jul, 2024 1 commit
-
-
Michael Yang authored
-
- 22 Jul, 2024 5 commits
-
-
Josh authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Jeffrey Morgan authored
-
- 19 Jul, 2024 1 commit
-
-
Josh authored
add template validation to modelfile
-
- 18 Jul, 2024 2 commits
-
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
* server: only parse tool calls if tools are provided * still set `resp.Message.Content`
-
- 16 Jul, 2024 4 commits
-
-
Michael Yang authored
-
Michael Yang authored
this change is triggered by the presence of "suffix", particularly useful for code completion tasks
-
royjhan authored
* OpenAI v1 models * Empty List Testing * Add back envconfig * v1/models docs * Remove Docs * OpenAI batch embed compatibility * merge conflicts * integrate with api/embed * ep * merge conflicts * request tests * rm resp test * merge conflict * merge conflict * test fixes * test fn renaming * input validation for empty string --------- Co-authored-by:jmorganca <jmorganca@gmail.com>
-
Jeffrey Morgan authored
-
- 15 Jul, 2024 2 commits
-
-
Michael Yang authored
-
royjhan authored
* Initial Batch Embedding * Revert "Initial Batch Embedding" This reverts commit c22d54895a280b54c727279d85a5fc94defb5a29. * Initial Draft * mock up notes * api/embed draft * add server function * check normalization * clean up * normalization * playing around with truncate stuff * Truncation * Truncation * move normalization to go * Integration Test Template * Truncation Integration Tests * Clean up * use float32 * move normalize * move normalize test * refactoring * integration float32 * input handling and handler testing * Refactoring of legacy and new * clear comments * merge conflicts * touches * embedding type 64 * merge conflicts * fix hanging on single string * refactoring * test values * set context length * clean up * testing clean up * testing clean up * remove function closure * Revert "remove function closure" This reverts commit 55d48c6ed17abe42e7a122e69d603ef0c1506787. * remove function closure * remove redundant error check * clean up * more clean up * clean up
-
- 14 Jul, 2024 1 commit
-
-
Patrick Devine authored
-
- 13 Jul, 2024 2 commits
-
-
jmorganca authored
-
Jeffrey Morgan authored
* server: fix `contet`, `load_duration` and `total_duration` fields * Update server/routes.go
-
- 05 Jul, 2024 3 commits
-
-
Michael Yang authored
ensure runtime model changes (template, system prompt, messages, options) are captured on model updates without needing to reload the server
-
Michael Yang authored
-
Michael Yang authored
-
- 03 Jul, 2024 1 commit
-
-
Daniel Hiltgen authored
This change fixes the handling of keep_alive so that if client request omits the setting, we only set this on initial load. Once the model is loaded, if new requests leave this unset, we'll keep whatever keep_alive was there.
-