Commits · 33ee7168ba1e16c813b52dc2c9417efa1e2e9f20 · OpenDAS / ollama

09 Jan, 2026 1 commit

Add experimental MLX backend and engine with imagegen support (#13648) · 33ee7168

Daniel Hiltgen authored Jan 08, 2026



* WIP - MLX backend with gemma3

* MLX: add cmake and go tag build toggles

To build the new MLX backend code:
  cmake --preset MLX
  cmake --build --preset MLX --parallel
  cmake --install build --component MLX
  go build -tags mlx .

Note: the main.go entrypoint for the MLX engine will change in a follow-up commit.

* add experimental image generation runtime

* MLX: wire up cuda build for linux

* MLX: get dependencies correct and dedup

This is still too large for a unified GitHub artifact, but is now "correct" for the mlx_cuda_v13
directory.

* fix relative link bug in dedup

* Add darwin build and readme

* add go build tag for mlx dependent code and wire up build_darwin.sh

* lint cleanup

* macos: build mlx for x86

This will be CPU only.

* cuda build instructions and fix drift from mlx bump

* stale comment

* Delete agent helper doc

* Clean up readme.md

* Revise README for tokenizer clarity and details

Updated README to clarify tokenizer functionality and removed correctness section.

---------
Co-authored-by: jmorganca <jmorganca@gmail.com>

33ee7168

08 Jan, 2026 2 commits

Linux: switch to zstd compression (#13651) · 34d0c55e

Daniel Hiltgen authored Jan 08, 2026

With the upcoming addition of MLX, the Linux bundle will exceed the
maximum GitHub artifact size of 2G. This change will bring the size
back down.

The install.sh changes preserve backwards compatibility with prior versions
and thus should be safe to merge concurrently with this change.

34d0c55e

x: redesign agent UI with minimal styling (#13650) · 53a5a9e9

Parth Sareen authored Jan 08, 2026

53a5a9e9

07 Jan, 2026 3 commits
- x: remove Ctrl+O tool output expansion feature (#13640) · e30e08a7
  Parth Sareen authored Jan 07, 2026
  
  e30e08a7
- x: agent loop ux improvements (#13635) · 12e2b351
  Parth Sareen authored Jan 07, 2026
  
  12e2b351
- template: fix args-as-json rendering (#13636) · 626af2d8
  Devon Rifkin authored Jan 06, 2026
```
In #13525, I accidentally broke templates' ability to automatically
render tool call function arguments as JSON.

We do need these to be proper maps because we need templates to be able
to call range, which can't be done on custom types.
```
  626af2d8
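The point about `range` in the "template: fix args-as-json rendering" message can be illustrated with Go's standard `text/template` package. This is a minimal sketch, not the project's code: `orderedArgs` and `render` are hypothetical names, and it assumes that "custom types" refers to struct-based containers rather than named map types.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// orderedArgs is a hypothetical struct-based argument container, standing in
// for the kind of custom type a template cannot iterate over.
type orderedArgs struct {
	keys   []string
	values map[string]any
}

// render executes a small template that ranges over its input.
func render(args any) (string, error) {
	tmpl, err := template.New("args").Parse(`{{range $k, $v := .}}{{$k}}={{$v}};{{end}}`)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, args); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// A plain map can be ranged; text/template visits its keys in sorted order.
	out, err := render(map[string]any{"unit": "celsius", "city": "Paris"})
	fmt.Println(out, err) // city=Paris;unit=celsius; <nil>

	// A struct value cannot be ranged, so template execution returns an error.
	_, err = render(orderedArgs{keys: []string{"city"}, values: map[string]any{"city": "Paris"}})
	fmt.Println(err != nil) // true
}
```

Because the template engine inspects the value's kind at execution time, the failure in the second call only surfaces when the template runs, not when it is parsed.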
06 Jan, 2026 3 commits

x: add experimental agent loop (#13628) · 76912c06

Parth Sareen authored Jan 05, 2026

76912c06

olmo3: fix flaky test (#13629) · 6c3faafe

Devon Rifkin authored Jan 05, 2026

I introduced this in <https://github.com/ollama/ollama/pull/13525>

6c3faafe

preserve tool definition and call JSON ordering (#13525) · e51dead6

Devon Rifkin authored Jan 05, 2026

* preserve tool definition and call JSON ordering

This is another iteration of
<https://github.com/ollama/ollama/pull/12518>, but this time we've
simplified things by relaxing the competing requirements of being
compatible AND order-preserving with templates (vs. renderers). We
maintain backwards compatibility at the cost of not guaranteeing order
for templates. We plan on moving more and more models to renderers,
which have been updated to use these new data types, and additionally
we could add an opt-in way for templates to get an order-preserved list
(e.g., via sibling template vars).

* orderedmap_test: remove testify

e51dead6

03 Jan, 2026 5 commits
- docs/capabilities/vision: fix curl related code snippet (#13615) · d087e46b
  Harry V. Kiselev authored Jan 04, 2026
  
  d087e46b
- server: return error when embedding contains NaN or Inf values (#13599) · 37f6f3af
  lif authored Jan 03, 2026
```
The normalize function now checks for NaN and Inf values in the
embedding vector before processing. This prevents JSON encoding
failures when models produce invalid floating-point values.

Fixes #13572
Signed-off-by: majiayu000 <1835304752@qq.com>
```
  37f6f3af
- docs: fix tool name mismatch and trailing commas in api.md example (#13559) · e1bdc23d
  Nhan Nguyen authored Jan 03, 2026
```
The tool calling example used "get_temperature" for tool_calls but
defined the tool as "get_weather". Also removed trailing commas that
made the JSON invalid.

Fixes #13031
```
  e1bdc23d
- app/ui: add swift syntax highlighting support (#13574) · 2e78653f
  lif authored Jan 03, 2026
```
Fixes #13476
Signed-off-by: majiayu000 <1835304752@qq.com>
```
  2e78653f
- docs: add version note for /v1/responses API (#13596) · f5f74e12
  lif authored Jan 03, 2026
```
Signed-off-by: majiayu000 <1835304752@qq.com>
```
  f5f74e12
23 Dec, 2025 2 commits
- docs: fix broken .md links and render issues (#13550) · 18fdcc94
  Vallabh Mahajan authored Dec 23, 2025
  
  18fdcc94
- amd: use GTT on iGPUs on linux (#13196) · 7ad03699
  Daniel Hiltgen authored Dec 23, 2025
```
On Linux, look at the GTT memory information for iGPUs.
```
  7ad03699
19 Dec, 2025 1 commit

llm: Avoid integer underflow on llama engine memory layout · 172b5924

Jesse Gross authored Dec 19, 2025

On the llama engine, when we compute the memory layout, we reserve
a buffer to allow some flexibility for incorrect estimates.
This buffer is subtracted from GPU free memory, and on GPUs with
limited memory the subtraction may underflow.

Fixes #13494

172b5924
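The underflow described above is the usual hazard of subtracting unsigned integers in Go: the result wraps around instead of going negative. The sketch below assumes the sizes are `uint64` byte counts and that clamping to zero is an acceptable fix; the function names are hypothetical and the actual change may handle the case differently.

```go
package main

import "fmt"

// naiveAvailable subtracts the reserve directly. With unsigned integers the
// result wraps around to a huge value when reserve exceeds free.
func naiveAvailable(free, reserve uint64) uint64 {
	return free - reserve
}

// safeAvailable clamps the result to zero instead of wrapping.
func safeAvailable(free, reserve uint64) uint64 {
	if reserve > free {
		return 0
	}
	return free - reserve
}

func main() {
	const mib = 1 << 20
	free := uint64(256 * mib)    // a GPU with little free memory
	reserve := uint64(512 * mib) // a safety buffer larger than what is free

	fmt.Println(naiveAvailable(free, reserve)) // wraps to a value near 2^64
	fmt.Println(safeAvailable(free, reserve))  // 0
}
```

With the naive version, a layout calculation would believe the GPU has an enormous amount of memory available, which is why a wrapped value is worse than a plain zero.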

18 Dec, 2025 4 commits
- add REQUIRES command to Modelfile (#13361) · 8852220f
  Jeffrey Morgan authored Dec 18, 2025
  
  8852220f
- parsers/renderers: functiongemma (#13521) · 73257915
  Parth Sareen authored Dec 18, 2025
  
  73257915
- Revert "Omit args and params in tool function def and calls (#13516)" (#13518) · 522c11a7
  Grace authored Dec 17, 2025
```
This reverts commit 0fadeffa.
```
  522c11a7
- Omit args and params in tool function def and calls (#13516) · 0fadeffa
  Grace authored Dec 17, 2025
  
  0fadeffa
17 Dec, 2025 3 commits

GGML update to ec98e2002 (#13451) · 49a9c9ba

Daniel Hiltgen authored Dec 17, 2025

* Revert "add support for NVIDIA Nemotron 3 Nano"

This reverts commit e7d2ae9d69421012e9a8765c06a3fdf0e45b12f3.

* GGML update to 380b4c984

Remove MaskBatchPadding as GGML_KQ_MASK_PAD is no longer present (no
padding required)

* update to c45f89d55

* ec98e2002

solar pro needed more adjusting - needs verification

* review comments

49a9c9ba

types: add nested property support for tool definitions (#13508) · 1c094038

Parth Sareen authored Dec 17, 2025

1c094038

DeepseekV3 Family Parser (#13484) · a013693f

Grace authored Dec 16, 2025

a013693f

16 Dec, 2025 8 commits
- revert granite-embedding (#13505) · f6a016f4
  Michael Yang authored Dec 16, 2025
  
  f6a016f4
- types: ConfigV2 and RootFS (#13504) · 45c47393
  Bruce MacDonald authored Dec 16, 2025
```
Refactored the ConfigV2 and RootFS types from server/images.go to a new types/model/config.go file under the model package. Updated all references to use model.ConfigV2 and model.RootFS. This allows for use in other projects without worrying about compiling the C code in the llama package.
```
  45c47393
- remove unnecessary code (#13502) · 2dd029de
  Michael Yang authored Dec 16, 2025
```
slog is already lazily evaluated so this code is completely redundant
```
  2dd029de
- use ollama engine for bert models (#13501) · 903b1fc9
  Michael Yang authored Dec 16, 2025
```
register bpe tokenizer which enables granite-embedding
```
  903b1fc9
- parsers/renderers: use think from user for nemotron (#13492) · 89eb7952
  Parth Sareen authored Dec 15, 2025
  
  89eb7952
- llama/parsers/renderers: nemotron 3 nano (#13489) · 7e3ea813
  Parth Sareen authored Dec 15, 2025
```
---------
Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
```
  7e3ea813
- Adding tool definitions to DeepseekV3 renderer (#13491) · 7b95087b
  Grace authored Dec 15, 2025
  
  7b95087b
- fix: qwen2.5 vl rope (#13486) · 971d6259
  Michael Yang authored Dec 15, 2025
```
* qwen25vl: bump max pixels

* qwen25vl: mrope

fix qwen2.5vl window

* qwen25vl: vision rope
```
  971d6259
15 Dec, 2025 6 commits
- model: add olmo3 and olmo3.1 (#13415) · ffbe8e07
  Parth Sareen authored Dec 15, 2025
  
  ffbe8e07
- DeepseekV3 family renderer (#13180) · 2c639431
  Grace authored Dec 15, 2025
  
  2c639431
- fix: define GGML_VERSION variables for proper SOVERSION expansion (#13469) · aacd1cb3
  Nhan Nguyen authored Dec 15, 2025
```
The ggml/src/CMakeLists.txt uses GGML_VERSION_MAJOR for the shared
library SOVERSION property, but these variables were not defined when
building from ollama's CMakeLists.txt.

This caused libggml-base.so to be named with a literal "SOVERSION"
suffix (libggml-base.so.SOVERSION) instead of the actual version
number (libggml-base.so.0).

The fix adds the required GGML_VERSION_* variables before including
the ggml subdirectory.

Fixes #13436
```
  aacd1cb3
- renderers: add olmo3.1 and olmo3 fixes (#13447) · e3731fb1
  Parth Sareen authored Dec 15, 2025
  
  e3731fb1
- app/ui: handle unspecified bind addresses and wait for server in ollama proxy (#13159) · 8dbc9e7b
  Eva H authored Dec 15, 2025
  
  8dbc9e7b
- Revert "Enable Ollama engine by default" (#13481) · abe67acf
  Daniel Hiltgen authored Dec 15, 2025
```
This reverts commit 56f754f46b87749581f73ef3625314bb0e51bfed.
```
  abe67acf
13 Dec, 2025 2 commits
- model: default gemma 3 rope scale to 1.0, apply corrections based on layer counts (#13453) · 4ff8a691
  Jeffrey Morgan authored Dec 12, 2025
  
  4ff8a691
- model: fix global layer rope scale values for gemma 3 (#13452) · 1b308e1d
  Jeffrey Morgan authored Dec 12, 2025
  
  1b308e1d