- 29 Dec, 2024 2 commits
-
-
Anas Khan authored
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
-
Jeffrey Morgan authored
-
- 28 Dec, 2024 1 commit
-
-
Emilien Lancelot authored
-
- 27 Dec, 2024 2 commits
-
-
CIIDMike authored
-
Adarsh Mishra authored
-
- 25 Dec, 2024 2 commits
-
-
Jared Donnell authored
-
aritra saha authored
-
- 23 Dec, 2024 3 commits
-
-
Emanuil Rusev authored
-
湛露先生 authored
-
ItzCrazyKns authored
-
- 22 Dec, 2024 1 commit
-
-
Patrick Devine authored
-
- 20 Dec, 2024 2 commits
-
-
Patrick Devine authored
-
Squishedmac authored
-
- 19 Dec, 2024 1 commit
-
-
Parth Sareen authored
This change adds a test to catch a regression in schema_to_grammar where the order of keys in the JSON schema is not preserved in the generated grammar, which is critical for step-by-step reasoning.
-
- 18 Dec, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 17 Dec, 2024 6 commits
-
-
Jesse Gross authored
Sometimes the KV cache requires defragmentation even without triggering the threshold heuristic. In this case, decoding will not be able to find a KV cache slot. This is particularly difficult for the caller to handle if it happens in between ubatches. To avoid this, we should immediately trigger a defrag. In addition, a heavily fragmented cache can require more than max_moves to defragment. Currently, we stop when we hit the limit, but this can leave a cache that still does not have adequate space even after defragmentation is triggered. Instead, we should do multiple batches of processing until everything is complete. Fixes #7949
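The two ideas in this message (defragment immediately when no slot is found, and repeat bounded passes until nothing is left to move) can be sketched with a toy model. This is only an illustration under stated assumptions: the cache is a plain slice where 0 means a free cell, and `defragPass`, `defragAll`, `findSlot`, and `findSlotWithDefrag` are hypothetical names, not functions from Ollama or llama.cpp.

```go
package main

import "fmt"

// defragPass moves occupied cells from the tail into holes near the front,
// performing at most maxMoves moves. It returns the number of moves made.
// Toy model: 0 marks a free cell, any other value marks an occupied one.
func defragPass(cells []int, maxMoves int) int {
	moves := 0
	hole, last := 0, len(cells)-1
	for moves < maxMoves {
		for hole < len(cells) && cells[hole] != 0 {
			hole++
		}
		for last >= 0 && cells[last] == 0 {
			last--
		}
		if hole >= last {
			break // already compact
		}
		cells[hole], cells[last] = cells[last], 0
		moves++
	}
	return moves
}

// defragAll repeats bounded passes until a pass finds nothing to move,
// and returns how many passes did useful work.
func defragAll(cells []int, maxMoves int) int {
	passes := 0
	for defragPass(cells, maxMoves) > 0 {
		passes++
	}
	return passes
}

// findSlot returns the start index of n contiguous free cells, or -1.
func findSlot(cells []int, n int) int {
	run := 0
	for i, c := range cells {
		if c != 0 {
			run = 0
			continue
		}
		run++
		if run == n {
			return i - n + 1
		}
	}
	return -1
}

// findSlotWithDefrag triggers a full defrag as soon as no slot is found,
// then retries, instead of surfacing the failure to the caller.
func findSlotWithDefrag(cells []int, n, maxMoves int) int {
	if i := findSlot(cells, n); i >= 0 {
		return i
	}
	defragAll(cells, maxMoves)
	return findSlot(cells, n)
}

func main() {
	cells := []int{1, 0, 2, 0, 3, 0, 4, 0}
	fmt.Println(findSlot(cells, 4))              // no contiguous run of 4 yet
	fmt.Println(findSlotWithDefrag(cells, 4, 1)) // succeeds after repeated passes
	fmt.Println(cells)
}
```

With `maxMoves` set to 1, a single pass leaves the toy cache with only three contiguous free cells, which is the situation the message describes; looping until a pass makes no moves is what frees the fourth.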
-
Blake Mizerany authored
This fixes another regression in the previous commit that fixed other known bugs.
-
Jascha Beste authored
-
Blake Mizerany authored
Changes in #8002 introduced fixes for bugs with mangling JSON Schemas. It also fixed a bug where the server would silently fail when clients requested invalid formats. It also, unfortunately, introduced a bug where the server would reject requests with an empty format, which should be allowed. The change in #8127 updated the code to allow the empty format, but also reintroduced the regression where the server would silently fail when the format was set, but invalid. This commit fixes both regressions. The server does not reject the empty format, but it does reject invalid formats. It also adds tests to help us catch regressions in the future. Also, the updated code provides a more detailed error message when a client sends a non-empty, but invalid format, echoing the invalid format in the response. This commit also takes the opportunity to remove superfluous linter checks.
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
In 0.5.2 we simplified packaging to have AVX only for macOS x86. It looks like there may still be some non-AVX systems out there, so this puts back the prior logic of building no-AVX for the primary binary, and now 2 runners for AVX and AVX2. These will be packaged in the App bundle only, so the stand-alone binary will now be without AVX support on macOS. On arm, we'll also see these runners reported as available in the log, but they're dormant and will never be used at runtime.
-
- 16 Dec, 2024 2 commits
-
-
Michael authored
readme: example/get started guide for pgai with Ollama
-
Jascha Beste authored
* docs: switch around database integrations order and link to quickstart
* docs: link to blog post in example readme
* chore: link to main readme
* readme: removing example to link externally so we don't have to keep this example up-to-date
-
- 15 Dec, 2024 1 commit
-
-
Patrick Devine authored
Refactor mllama image processing code, and add pixtral and qwen2vl
-
- 14 Dec, 2024 2 commits
-
-
Daniel Hiltgen authored
-
Jeffrey Morgan authored
-
- 13 Dec, 2024 2 commits
-
-
Daniel Hiltgen authored
This puts the low-level runner logging back on stderr for consistency with prior releases
-
Anuraag (Rag) Agrawal authored
* openai: return usage as final chunk for streams
Co-authored-by: ParthSareen <parth.sareen@ollama.com>
-
- 12 Dec, 2024 2 commits
-
-
Pascal Patry authored
-
Parth Sareen authored
-
- 11 Dec, 2024 10 commits
-
-
Blake Mizerany authored
Fixes #7944
-
Daniel Hiltgen authored
Pass through the version override so the makefiles use it
-
Blake Mizerany authored
Previously we decoded and re-encoded JSON schemas during validation, which served no purpose since json.RawMessage already validates JSON syntax. Worse, the re-encoding lost field ordering from the original schema, which affects inference quality during step-by-step reasoning. While fixing this ordering issue by using json.RawMessage directly, testing revealed that schema_to_grammar (from llama.cpp) also fails to preserve field order during grammar generation. This appears to be the root cause of inference degradation. This change prevents us from mangling the user's original schema order, but we still need to address the ordering issue in schema_to_grammar. That will be a separate change. Updates #7978
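The ordering loss described here is easy to reproduce with the standard library alone. The sketch below assumes the old path decoded into a generic map (the message does not say which type was used); `encoding/json` sorts map keys when marshaling, while `json.RawMessage` keeps the original bytes. The function names and the sample schema are illustrative only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// reencode decodes a schema into a generic map and encodes it again.
// encoding/json writes map keys in sorted order, so the original key
// order of the schema is lost.
func reencode(schema []byte) (string, error) {
	var m map[string]any
	if err := json.Unmarshal(schema, &m); err != nil {
		return "", err
	}
	out, err := json.Marshal(m)
	return string(out), err
}

// passthrough keeps the schema as json.RawMessage. Unmarshal still checks
// the JSON syntax, but the bytes, and therefore the key order, are untouched.
func passthrough(schema []byte) (string, error) {
	var raw json.RawMessage
	if err := json.Unmarshal(schema, &raw); err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	// "reasoning" is deliberately listed before "answer".
	schema := []byte(`{"reasoning":{"type":"string"},"answer":{"type":"string"}}`)

	a, _ := reencode(schema)
	b, _ := passthrough(schema)
	fmt.Println("re-encoded: ", a) // "answer" now comes first
	fmt.Println("raw message:", b) // original order preserved
}
```

A regression test for key order can compare the position of each key in the output against its position in the input, which is the kind of check this message says is still needed for schema_to_grammar.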
-
Daniel Hiltgen authored
upload-artifacts strips off leading common paths so when the ./build/ artifacts were removed, the ./dist/windows-amd64 prefix became common and was stripped, making the later download-artifacts place them in the wrong location
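To make the failure mode concrete, here is a small model of the prefix stripping as this message describes it. It is not the action's implementation; `commonDir` and `stripCommon` are hypothetical helpers, and the file names are made up for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// commonDir returns the longest leading directory shared by all paths.
func commonDir(paths []string) string {
	if len(paths) == 0 {
		return ""
	}
	prefix := strings.Split(paths[0], "/")
	prefix = prefix[:len(prefix)-1] // drop the file name
	for _, p := range paths[1:] {
		parts := strings.Split(p, "/")
		parts = parts[:len(parts)-1]
		n := 0
		for n < len(prefix) && n < len(parts) && prefix[n] == parts[n] {
			n++
		}
		prefix = prefix[:n]
	}
	return strings.Join(prefix, "/")
}

// stripCommon removes the shared leading directory from every path,
// modeling how the uploaded artifact layout is derived.
func stripCommon(paths []string) []string {
	dir := commonDir(paths)
	out := make([]string, len(paths))
	for i, p := range paths {
		out[i] = strings.TrimPrefix(strings.TrimPrefix(p, dir), "/")
	}
	return out
}

func main() {
	// With a build/ artifact present there is no common directory,
	// so the dist/windows-amd64 prefix survives the upload.
	before := []string{"build/log.txt", "dist/windows-amd64/ollama.exe"}
	fmt.Println(stripCommon(before))

	// Once build/ is gone, dist/windows-amd64 becomes common and is stripped.
	after := []string{"dist/windows-amd64/ollama.exe", "dist/windows-amd64/lib/runner.dll"}
	fmt.Println(stripCommon(after))
}
```

In this model, removing one unrelated input changes where every other file lands after download, which matches the symptom in the message.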
-
Daniel Hiltgen authored
The new build embeds the arm runner in the main binary, so there is no longer a lib/ollama
-
Daniel Hiltgen authored
Remove no longer relevant build log dir
-
Jeffrey Morgan authored
-
Blake Mizerany authored
-
湛露先生 authored
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
-
Phil Wornath authored
-