- 08 Jan, 2026 2 commits
-
-
Daniel Hiltgen authored
With the upcoming addition of MLX, the Linux bundle will exceed GitHub's maximum artifact size of 2 GB. This change brings the size back down. The install.sh changes are backwards compatible with prior versions and thus safe to merge concurrently with this change.
-
Parth Sareen authored
-
- 07 Jan, 2026 3 commits
-
-
Parth Sareen authored
-
Parth Sareen authored
-
Devon Rifkin authored
In #13525, I accidentally broke templates' ability to automatically render tool call function arguments as JSON. We do need these to be proper maps because we need templates to be able to call range, which can't be done on custom types.
-
- 06 Jan, 2026 3 commits
-
-
Parth Sareen authored
-
Devon Rifkin authored
I introduced this in <https://github.com/ollama/ollama/pull/13525>
-
Devon Rifkin authored
* preserve tool definition and call JSON ordering

  This is another iteration of <https://github.com/ollama/ollama/pull/12518>, but this time we've simplified things by relaxing the competing requirements of being both compatible and order-preserving with templates (vs. renderers). We maintain backwards compatibility at the cost of not guaranteeing order for templates. We plan on moving more and more models to renderers, which have been updated to use these new data types, and we could additionally add an opt-in way for templates to get an order-preserved list (e.g., via sibling template vars).

* orderedmap_test: remove testify
-
- 03 Jan, 2026 5 commits
-
-
Harry V. Kiselev authored
-
lif authored
The normalize function now checks for NaN and Inf values in the embedding vector before processing. This prevents JSON encoding failures when models produce invalid floating-point values. Fixes #13572. Signed-off-by: majiayu000 <1835304752@qq.com>
-
Nhan Nguyen authored
The tool calling example used "get_temperature" for tool_calls but defined the tool as "get_weather". Also removed trailing commas that made the JSON invalid. Fixes #13031
-
lif authored
Fixes #13476. Signed-off-by: majiayu000 <1835304752@qq.com>
-
lif authored
Signed-off-by: majiayu000 <1835304752@qq.com>
-
- 23 Dec, 2025 2 commits
-
-
Vallabh Mahajan authored
-
Daniel Hiltgen authored
On Linux, look at the GTT memory information for iGPUs.
-
- 19 Dec, 2025 1 commit
-
-
Jesse Gross authored
On the llama engine, when we compute the memory layout, we reserve a buffer to allow for some flexibility for incorrect estimates. This is subtracted from GPU free memory and on GPUs with limited memory, it may underflow. Fixes #13494
-
- 18 Dec, 2025 4 commits
-
-
Jeffrey Morgan authored
-
Parth Sareen authored
-
Grace authored
-
- 17 Dec, 2025 3 commits
-
-
Daniel Hiltgen authored
* Revert "add support for NVIDIA Nemotron 3 Nano"

  This reverts commit e7d2ae9d69421012e9a8765c06a3fdf0e45b12f3.

* GGML update to 380b4c984

  Remove MaskBatchPadding, as GGML_KQ_MASK_PAD is no longer present (no padding required).

* update to c45f89d55
* ec98e2002: solar pro needed more adjusting; needs verification
* review comments
-
Parth Sareen authored
-
Grace authored
-
- 16 Dec, 2025 8 commits
-
-
Michael Yang authored
-
Bruce MacDonald authored
Refactored the ConfigV2 and RootFS types from server/images.go into a new types/model/config.go file under the model package, and updated all references to use model.ConfigV2 and model.RootFS. This allows the types to be used in other projects without compiling the C code in the llama package.
-
Michael Yang authored
slog is already lazily evaluated, so this code is completely redundant.
-
Michael Yang authored
register bpe tokenizer which enables granite-embedding
-
Parth Sareen authored
-
Parth Sareen authored
Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
-
Grace authored
-
Michael Yang authored
* qwen25vl: bump max pixels
* qwen25vl: mrope: fix qwen2.5vl window
* qwen25vl: vision rope
-
- 15 Dec, 2025 6 commits
-
-
Parth Sareen authored
-
Grace authored
-
Nhan Nguyen authored
The ggml/src/CMakeLists.txt uses GGML_VERSION_MAJOR for the shared library SOVERSION property, but these variables were not defined when building from ollama's CMakeLists.txt. This caused libggml-base.so to be named with a literal "SOVERSION" suffix (libggml-base.so.SOVERSION) instead of the actual version number (libggml-base.so.0). The fix adds the required GGML_VERSION_* variables before including the ggml subdirectory. Fixes #13436
-
Parth Sareen authored
-
Eva H authored
-
Daniel Hiltgen authored
This reverts commit 56f754f46b87749581f73ef3625314bb0e51bfed.
-
- 13 Dec, 2025 2 commits
-
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
- 12 Dec, 2025 1 commit
-
-
Daniel Hiltgen authored
* flash attn: add auto mode for llama engine

  If the user does not specify fa in the environment, use auto mode.

* review comments
* ensure kv cache quantized types have FA explicitly enabled

  additional review comments
-