- 10 Dec, 2024 3 commits
-
-
Daniel Hiltgen authored
The "F" was missing.
-
Daniel Hiltgen authored
* llama: wire up builtin runner

  This adds a new entrypoint into the ollama CLI to run the cgo-built runner. On Mac arm64, this will have GPU support, but on all other platforms it will be the lowest-common-denominator CPU build. After we fully transition to the new Go runners, more tech debt can be removed and we can stop building the "default" runner via make and rely on the builtin always.

* build: Make target improvements

  Add a few new targets and help for building locally. This also adjusts the runner lookup to favor local builds, then runners relative to the executable, and finally payloads.

* Support customized CPU flags for runners

  This implements a simplified custom CPU flags pattern for the runners. When built without overrides, the runner name contains the vector flag we check for (AVX) to ensure we don't try to run on unsupported systems and crash. If the user builds a customized set, we omit the naming scheme and don't check for compatibility. This avoids checking requirements at runtime, so that logic has been removed as well. This can be used to build GPU runners with no vector flags, or CPU/GPU runners with additional flags (e.g. AVX512) enabled.

* Use relative paths

  If the user checks out the repo in a path that contains spaces, make gets really confused, so use relative paths for everything in-repo to avoid breakage.

* Remove payloads from main binary

* install: clean up prior libraries

  This removes support for v0.3.6 and older versions (before the tar bundle) and ensures we clean up prior libraries before extracting the bundle(s). Without this change, runners and dependent libraries could leak across updates and lead to subtle runtime errors.
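The lookup order and the naming-scheme compatibility check described above can be sketched as follows. This is an illustrative sketch only, assuming hypothetical names (`pickRunner`, `compatible`, the `"local"`/`"exe"`/`"payload"` keys), not Ollama's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// pickRunner sketches the lookup order from the commit: prefer a local
// build, then a runner relative to the executable, then payloads.
func pickRunner(candidates map[string]string) string {
	for _, source := range []string{"local", "exe", "payload"} {
		if path, ok := candidates[source]; ok {
			return path
		}
	}
	return ""
}

// compatible sketches the naming-scheme check: a default build embeds the
// vector flag (e.g. "avx") in the runner name and is only usable when the
// host supports it; a customized build omits the flag and skips the check.
func compatible(runnerName string, hostHasAVX bool) bool {
	if strings.Contains(runnerName, "avx") {
		return hostHasAVX
	}
	return true // customized build: no runtime requirement check
}

func main() {
	fmt.Println(pickRunner(map[string]string{"exe": "/opt/ollama/runner"}))
	fmt.Println(compatible("cpu_avx", false))
}
```

The point of the naming scheme is that the compatibility decision is made at build time and encoded in the file name, so no CPU-feature detection is needed at runtime for custom builds.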
-
frob authored
Co-authored-by: Richard Lyons <frob@cloudstaff.com>
-
- 09 Dec, 2024 1 commit
-
-
Jesse Gross authored
Newlines can be an important part of a user's prompt, and trimming them can alter the results. We previously only trimmed prompts with images, but refactoring brought this behavior to all prompts, where it became more noticeable. The /generate endpoint adds less whitespace and therefore doesn't need to trim it out; this brings the same behavior to /chat. Thanks to @gabe-l-hart for spotting the issue! Fixes #7795
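Why trailing newlines matter can be shown with a minimal sketch. This is an assumption-laden illustration (the `buildPrompt` helper is hypothetical), not Ollama's actual prompt handling:

```go
package main

import (
	"fmt"
	"strings"
)

// A trailing newline can be meaningful to a model (e.g. it may signal that
// a list or code block has ended), so trimming it unconditionally changes
// what the model actually sees.
func buildPrompt(userText string, trim bool) string {
	if trim {
		return strings.TrimSpace(userText)
	}
	return userText
}

func main() {
	prompt := "Continue this list:\n1. apples\n2. bananas\n"
	fmt.Printf("%q\n", buildPrompt(prompt, true))  // trailing newline lost
	fmt.Printf("%q\n", buildPrompt(prompt, false)) // trailing newline preserved
}
```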
-
- 08 Dec, 2024 2 commits
-
-
Yannick Gloster authored
-
湛露先生 authored
-
- 06 Dec, 2024 3 commits
-
-
Parth Sareen authored
-
Michael authored
readme: add llama3.3 to readme
-
Parth Sareen authored
-
- 05 Dec, 2024 3 commits
-
-
Jeffrey Morgan authored
-
Parth Sareen authored
-
Parth Sareen authored
Adds structured outputs to chat endpoint

Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Hieu Nguyen <hieunguyen1053@outlook.com>
-
- 04 Dec, 2024 3 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Sam authored
-
- 03 Dec, 2024 2 commits
- 02 Dec, 2024 2 commits
-
-
Tigran authored
-
David Mayboroda authored
-
- 30 Nov, 2024 3 commits
-
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
Parth Sareen authored
-
- 29 Nov, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 28 Nov, 2024 1 commit
-
-
TheCookingSenpai authored
-
- 27 Nov, 2024 3 commits
-
-
Parth Sareen authored
-
ItzCrazyKns authored
Closes #7627
-
Bruce MacDonald authored
writeError takes a code argument that is no longer used. Remove it for clarity.
-
- 26 Nov, 2024 4 commits
-
-
Jesse Gross authored
When processing a prompt, we look for image tags of the form [img-0], which are inserted by the Ollama server process. However, this can cause errors if the original prompt contains these tags; typically an image-not-found error is returned. This changes the tag-searching behavior to be similar to the 0.3.x series, which largely avoids these problems. However, they can still happen when input text containing these tags is used with image models. The correct solution is to escape the tags, but that is part of a larger issue with special sequences in general, so this is an incremental fix that should avoid the problem in the majority of cases.
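The tag format described above can be matched with a simple regular expression. This is a simplified sketch of the [img-N] placeholder scan, not the exact matcher Ollama uses:

```go
package main

import (
	"fmt"
	"regexp"
)

// imgTag matches [img-N] placeholders, which the server inserts into the
// prompt to mark image positions. If user text already contains this
// pattern, a naive scan will treat it as an image reference, which is the
// failure mode the commit describes.
var imgTag = regexp.MustCompile(`\[img-(\d+)\]`)

func findImageTags(prompt string) []string {
	return imgTag.FindAllString(prompt, -1)
}

func main() {
	fmt.Println(findImageTags("describe [img-0] and [img-1]"))
}
```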
-
Jesse Gross authored
This also makes it easier to truncate long inputs in the same way as shifting, though it does not actually implement truncation. This type of truncation trades off quality against time to first token.
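The trade-off mentioned above can be sketched in a few lines. This is a hypothetical illustration of shift-style truncation (the `truncate` helper is not Ollama's code, and per the commit, truncation is not actually implemented yet):

```go
package main

import "fmt"

// truncate keeps only the most recent tokens that fit in the context
// window, mirroring how shifting discards the oldest context. Dropping the
// oldest tokens loses early context (quality) but avoids processing the
// full input (better time to first token).
func truncate(tokens []int, ctxLen int) []int {
	if len(tokens) <= ctxLen {
		return tokens
	}
	return tokens[len(tokens)-ctxLen:]
}

func main() {
	fmt.Println(truncate([]int{1, 2, 3, 4, 5}, 3))
}
```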
-
jake83741 authored
-
frob authored
-
- 25 Nov, 2024 4 commits
-
-
Blake Mizerany authored
This changes makeRequest to update the http client Transport if and only if testMakeRequestDialContext is set. This avoids overriding the default Transport when testMakeRequestDialContext is nil, which broke existing behavior, including proxies, timeouts, and other defaults. Fixes #7829 Fixes #7788
-
Shikhar Bakhda authored
-
Bruce MacDonald authored
After a user pushes their model, it is not clear what to do next. Add a link to the output of `ollama push` that tells the user where their model can now be found.
-
Simon Schampijer authored
- better formatting of input prompt
- use invoke instead of predict
-
- 24 Nov, 2024 4 commits
-
-
reid41 authored
-
frob authored
-
Adarsh Mishra authored
-
Patcher authored
-
- 23 Nov, 2024 1 commit
-
-
Meng Zhuo authored
-