- 11 Dec, 2024 9 commits
-
-
Daniel Hiltgen authored
Pass through the version override so the makefiles use it
-
Blake Mizerany authored
Previously we decoded and re-encoded JSON schemas during validation, which served no purpose since json.RawMessage already validates JSON syntax. Worse, the re-encoding lost field ordering from the original schema, which affects inference quality during step-by-step reasoning.

While fixing this ordering issue by using json.RawMessage directly, testing revealed that schema_to_grammar (from llama.cpp) also fails to preserve field order during grammar generation. This appears to be the root cause of inference degradation.

This change prevents us from mangling the user's original schema order, but we still need to address the ordering issue in schema_to_grammar. That will be a separate change.

Updates #7978
-
Daniel Hiltgen authored
upload-artifacts strips off leading common paths, so when the ./build/ artifacts were removed, the ./dist/windows-amd64 prefix became common and was stripped as well, causing the later download-artifacts step to place the files in the wrong location
-
Daniel Hiltgen authored
The new build embeds the arm runner in the main binary, so there is no longer a lib/ollama directory
-
Daniel Hiltgen authored
Remove the no-longer-relevant build log directory
-
Jeffrey Morgan authored
-
Blake Mizerany authored
-
湛露先生 authored
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
-
Phil Wornath authored
-
- 10 Dec, 2024 8 commits
-
-
Tao Zuhong authored
-
frob authored
-
Dr. Daniel Bender authored
-
Daniel Hiltgen authored
The final implementation of #7499 removed dynamic vector requirements in favor of a simpler filename-based model; this was leftover logic that is no longer needed.
-
Stefan Weil authored
-
Daniel Hiltgen authored
The "F" was missing.
-
Daniel Hiltgen authored
* llama: wire up builtin runner

  This adds a new entrypoint into the ollama CLI to run the cgo-built runner. On Mac arm64, this will have GPU support, but on all other platforms it will be the lowest common denominator CPU build. After we fully transition to the new Go runners, more tech debt can be removed and we can stop building the "default" runner via make and rely on the builtin always.

* build: Make target improvements

  Add a few new targets and help for building locally. This also adjusts the runner lookup to favor local builds, then runners relative to the executable, and finally payloads.

* Support customized CPU flags for runners

  This implements a simplified custom CPU flags pattern for the runners. When built without overrides, the runner name contains the vector flag we check for (AVX) to ensure we don't try to run on unsupported systems and crash. If the user builds a customized set, we omit the naming scheme and don't check for compatibility. This avoids checking requirements at runtime, so that logic has been removed as well. This can be used to build GPU runners with no vector flags, or CPU/GPU runners with additional flags (e.g. AVX512) enabled.

* Use relative paths

  If the user checks out the repo in a path that contains spaces, make gets really confused, so use relative paths for everything in-repo to avoid breakage.

* Remove payloads from main binary

* install: clean up prior libraries

  This removes support for v0.3.6 and older versions (before the tar bundle) and ensures we clean up prior libraries before extracting the bundle(s). Without this change, runners and dependent libraries could leak when we update and lead to subtle runtime errors.
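The name-based compatibility check described above can be sketched as follows. This is a hypothetical helper (pickRunner and the runner names are assumptions for illustration, not ollama's actual code): a runner whose name embeds a vector flag like "avx" is skipped on hosts that lack it, so no runtime requirements check is needed.

```go
package main

import (
	"fmt"
	"strings"
)

// pickRunner returns the first runner the host can use. Runner names that
// embed a vector flag ("avx") are skipped when the host lacks AVX support;
// names without the scheme (e.g. a custom build) are accepted as-is.
func pickRunner(runners []string, hostHasAVX bool) string {
	for _, r := range runners {
		if strings.Contains(r, "avx") && !hostHasAVX {
			continue // name advertises AVX, host can't run it
		}
		return r
	}
	return ""
}

func main() {
	runners := []string{"cpu_avx2", "cpu_avx", "cpu"}
	fmt.Println(pickRunner(runners, false)) // falls through to plain "cpu"
	fmt.Println(pickRunner(runners, true))  // first match: "cpu_avx2"
}
```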
-
frob authored
Co-authored-by: Richard Lyons <frob@cloudstaff.com>
-
- 09 Dec, 2024 1 commit
-
-
Jesse Gross authored
New lines can be an important part of a user's prompt, and trimming them can alter the results. We previously only trimmed prompts with images, but refactoring brought this behavior to all prompts, where it became more noticeable. The /generate endpoint adds less whitespace and therefore doesn't need it trimmed out; this brings the same behavior to /chat. Thanks to @gabe-l-hart for spotting the issue! Fixes #7795
-
- 08 Dec, 2024 2 commits
-
-
Yannick Gloster authored
-
湛露先生 authored
-
- 06 Dec, 2024 3 commits
-
-
Parth Sareen authored
-
Michael authored
readme: add llama3.3 to readme
-
Parth Sareen authored
-
- 05 Dec, 2024 3 commits
-
-
Jeffrey Morgan authored
-
Parth Sareen authored
-
Parth Sareen authored
Adds structured outputs to chat endpoint

Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Hieu Nguyen <hieunguyen1053@outlook.com>
-
- 04 Dec, 2024 3 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Sam authored
-
- 03 Dec, 2024 2 commits
- 02 Dec, 2024 2 commits
-
-
Tigran authored
-
David Mayboroda authored
-
- 30 Nov, 2024 3 commits
-
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
Parth Sareen authored
-
- 29 Nov, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 28 Nov, 2024 1 commit
-
-
TheCookingSenpai authored
-
- 27 Nov, 2024 2 commits
-
-
Parth Sareen authored
-
ItzCrazyKns authored
Closes #7627
-