- 01 Jan, 2025 1 commit
-
-
Patrick Devine authored
Changes `POST /api/create` to accept JSON instead of a Modelfile. This is a breaking change.
-
- 23 Dec, 2024 1 commit
-
-
湛露先生 authored
-
- 15 Dec, 2024 1 commit
-
-
Patrick Devine authored
Refactor mllama image processing code, and add pixtral and qwen2vl
-
- 11 Dec, 2024 1 commit
-
-
Blake Mizerany authored
Fixes #7944
-
- 10 Dec, 2024 2 commits
-
-
frob authored
-
Daniel Hiltgen authored
* llama: wire up builtin runner
  This adds a new entrypoint into the ollama CLI to run the cgo built runner. On Mac arm64, this will have GPU support, but on all other platforms it will be the lowest common denominator CPU build. After we fully transition to the new Go runners more tech-debt can be removed and we can stop building the "default" runner via make and rely on the builtin always.
* build: Make target improvements
  Add a few new targets and help for building locally. This also adjusts the runner lookup to favor local builds, then runners relative to the executable, and finally payloads.
* Support customized CPU flags for runners
  This implements a simplified custom CPU flags pattern for the runners. When built without overrides, the runner name contains the vector flag we check for (AVX) to ensure we don't try to run on unsupported systems and crash. If the user builds a customized set, we omit the naming scheme and don't check for compatibility. This avoids checking requirements at runtime, so that logic has been removed as well. This can be used to build GPU runners with no vector flags, or CPU/GPU runners with additional flags (e.g. AVX512) enabled.
* Use relative paths
  If the user checks out the repo in a path that contains spaces, make gets really confused so use relative paths for everything in-repo to avoid breakage.
* Remove payloads from main binary
* install: clean up prior libraries
  This removes support for v0.3.6 and older versions (before the tar bundle) and ensures we clean up prior libraries before extracting the bundle(s). Without this change, runners and dependent libraries could leak when we update and lead to subtle runtime errors.
-
- 05 Dec, 2024 2 commits
-
-
Parth Sareen authored
-
Parth Sareen authored
Adds structured outputs to chat endpoint
Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Hieu Nguyen <hieunguyen1053@outlook.com>
-
- 30 Nov, 2024 2 commits
-
-
Jeffrey Morgan authored
-
Parth Sareen authored
-
- 27 Nov, 2024 1 commit
-
-
Parth Sareen authored
-
- 23 Nov, 2024 1 commit
-
-
oza6ut0ne authored
-
- 20 Nov, 2024 1 commit
-
-
Daniel Hiltgen authored
Avoid a round-trip asking users for logs to see what went wrong.
-
- 19 Nov, 2024 1 commit
-
-
Blake Mizerany authored
This change allows mixed-case model names to be pushed, pulled, copied, and created. This was previously disallowed because the Ollama registry was backed by a Docker registry that enforced a naming convention disallowing mixed-case names; that is no longer the case. This does not break existing, intended behaviors. Also, make TestCase test a story of creating, updating, pulling, and copying a model with case variations, ensuring the model's manifest is updated correctly and not duplicated across different files with different case variations.
-
- 05 Nov, 2024 1 commit
-
-
Daniel Hiltgen authored
One potential failure mode is an empty file, which bubbles up as an EOF error and causes all pull and list operations to fail. Instead, continue and warn about the corrupt manifest. This also allows re-pulling the corrupt manifest to repair the system.
-
- 04 Nov, 2024 1 commit
-
-
Daniel Hiltgen authored
Avoid excessive log spew and make consistent with chat logging
-
- 30 Oct, 2024 1 commit
-
-
Jesse Gross authored
* Update mllama to take the cross attention state as embeddings in a batch, more similar to how Llava handles it. This improves integration with the input cache.
* Pass locations in a prompt for embeddings using tags similar to Llava.
* Abstract interface to vision models so the main runner accesses Clip and Mllama similarly.
Co-authored-by: Michael Yang <mxyng@pm.me>
-
- 28 Oct, 2024 1 commit
-
-
Patrick Devine authored
-
- 18 Oct, 2024 1 commit
-
-
Patrick Devine authored
Co-authored-by: jmorganca <jmorganca@gmail.com>
Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Jesse Gross <jesse@ollama.com>
-
- 17 Oct, 2024 1 commit
-
-
Daniel Hiltgen authored
Cleaning up Go package naming
-
- 01 Oct, 2024 1 commit
-
-
Alex Mavrogiannis authored
-
- 12 Sep, 2024 1 commit
-
-
Daniel Hiltgen authored
* Optimize container images for startup
  This change adjusts how to handle runner payloads to support container builds where we keep them extracted in the filesystem. This makes it easier to optimize the cpu/cuda vs cpu/rocm images for size, and should result in faster startup times for container images.
* Refactor payload logic and add buildx support for faster builds
* Move payloads around
* Review comments
* Converge to buildx based helper scripts
* Use docker buildx action for release
-
- 11 Sep, 2024 1 commit
-
-
Patrick Devine authored
-
- 27 Aug, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 13 Aug, 2024 1 commit
-
-
royjhan authored
* load on empty input
* no load on invalid input
-
- 11 Aug, 2024 1 commit
-
-
Jeffrey Morgan authored
For simplicity, perform parallelization of embedding requests in the API handler instead of offloading this to the subprocess runner. This keeps the scheduling story simpler as it builds on existing parallel requests, similar to existing text completion functionality.
-
- 07 Aug, 2024 1 commit
-
-
Jesse Gross authored
Currently, if the config field is missing from the manifest file (or is corrupted), Ollama will crash when it tries to read it. This can happen at startup or when pulling new models. This data is mostly just used for showing model information, so we can tolerate its absence; it is not required to run the models. Besides avoiding crashes, this also gives us the ability to restructure the config in the future by pulling it into the main manifest file.
-
- 02 Aug, 2024 1 commit
-
-
Michael Yang authored
-
- 01 Aug, 2024 5 commits
-
-
Vyacheslav Moskalev authored
-
Vyacheslav Moskalev authored
-
Vyacheslav Moskalev authored
-
Vyacheslav Moskalev authored
-
Vyacheslav Moskalev authored
-
- 30 Jul, 2024 1 commit
-
-
royjhan authored
* add prompt tokens to embed response
* rm slog
* metrics
* types
* prompt n
* clean up
* reset submodule
* update tests
* test name
* list metrics
-
- 26 Jul, 2024 1 commit
-
-
Michael Yang authored
-
- 22 Jul, 2024 5 commits
-
-
Josh authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Jeffrey Morgan authored
-