Commits · 2e78653ff99536885ab6bbbc00f20dd34d499102 · OpenDAS / ollama

03 Jan, 2026 2 commits
- app/ui: add swift syntax highlighting support (#13574) · 2e78653f
  lif authored Jan 03, 2026
```
Fixes #13476
Signed-off-by: majiayu000 <1835304752@qq.com>
```
  2e78653f
- docs: add version note for /v1/responses API (#13596) · f5f74e12
  lif authored Jan 03, 2026
```
Signed-off-by: majiayu000 <1835304752@qq.com>
```
  f5f74e12
23 Dec, 2025 2 commits
- docs: fix broken .md links and render issues (#13550) · 18fdcc94
  Vallabh Mahajan authored Dec 23, 2025
  
  18fdcc94
- amd: use GTT on iGPUs on linux (#13196) · 7ad03699
  Daniel Hiltgen authored Dec 23, 2025
```
On Linux, look at the GTT memory information for iGPUs.
```
  7ad03699
19 Dec, 2025 1 commit

llm: Avoid integer underflow on llama engine memory layout · 172b5924

Jesse Gross authored Dec 19, 2025

On the llama engine, when we compute the memory layout, we reserve
a buffer to absorb inaccurate estimates. This buffer is subtracted
from GPU free memory, and on GPUs with limited memory the
subtraction may underflow.

Fixes #13494

172b5924
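The underflow described in this commit can be illustrated with a minimal Go sketch. It is not the code from the commit: the function name `availableAfterReserve` and the byte sizes are hypothetical, and the only assumption is that free memory and the reserved buffer are held in unsigned integers.

```go
package main

import "fmt"

// availableAfterReserve is a hypothetical helper, not the function from the
// commit. It returns free minus reserve, clamped at zero. With unsigned
// integers, free - reserve wraps around to a huge value when reserve > free,
// so the comparison has to happen before the subtraction.
func availableAfterReserve(free, reserve uint64) uint64 {
	if reserve >= free {
		return 0
	}
	return free - reserve
}

func main() {
	const MiB = 1 << 20
	free := uint64(256 * MiB)    // hypothetical GPU with little free memory
	reserve := uint64(512 * MiB) // hypothetical safety buffer

	fmt.Println(free - reserve)                       // wraps around to a very large number
	fmt.Println(availableAfterReserve(free, reserve)) // prints 0
}
```

Clamping to zero is one possible guard; the actual fix may handle the small-memory case differently.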

18 Dec, 2025 4 commits
- add REQUIRES command to Modelfile (#13361) · 8852220f
  Jeffrey Morgan authored Dec 18, 2025
  
  8852220f
- parsers/renderers: functiongemma (#13521) · 73257915
  Parth Sareen authored Dec 18, 2025
  
  73257915
- Revert "Omit args and params in tool function def and calls (#13516)" (#13518) · 522c11a7
  Grace authored Dec 17, 2025
```
This reverts commit 0fadeffa.
```
  522c11a7
- Omit args and params in tool function def and calls (#13516) · 0fadeffa
  Grace authored Dec 17, 2025
  
  0fadeffa
17 Dec, 2025 3 commits

GGML update to ec98e2002 (#13451) · 49a9c9ba

Daniel Hiltgen authored Dec 17, 2025

* Revert "add support for NVIDIA Nemotron 3 Nano"

This reverts commit e7d2ae9d69421012e9a8765c06a3fdf0e45b12f3.

* GGML update to 380b4c984

Remove MaskBatchPadding as GGML_KQ_MASK_PAD is no longer present (no
padding required)

* update to c45f89d55

* ec98e2002

solar pro needed more adjusting - needs verification

* review comments

49a9c9ba

types: add nested property support for tool definitions (#13508) · 1c094038

Parth Sareen authored Dec 17, 2025

1c094038

DeepseekV3 Family Parser (#13484) · a013693f

Grace authored Dec 16, 2025

a013693f

16 Dec, 2025 8 commits
- revert granite-embedding (#13505) · f6a016f4
  Michael Yang authored Dec 16, 2025
  
  f6a016f4
- types: ConfigV2 and RootFS (#13504) · 45c47393
  Bruce MacDonald authored Dec 16, 2025
```
Moved the ConfigV2 and RootFS types from server/images.go to a new types/model/config.go file under the model package. Updated all references to use model.ConfigV2 and model.RootFS. This allows other projects to use these types without compiling the C code in the llama package.
```
  45c47393
- remove unnecessary code (#13502) · 2dd029de
  Michael Yang authored Dec 16, 2025
```
slog is already lazily evaluated so this code is completely redundant
```
  2dd029de
- use ollama engine for bert models (#13501) · 903b1fc9
  Michael Yang authored Dec 16, 2025
```
register bpe tokenizer which enables granite-embedding
```
  903b1fc9
- parsers/renderers: use think from user for nemotron (#13492) · 89eb7952
  Parth Sareen authored Dec 15, 2025
  
  89eb7952
- llama/parsers/renderers: nemotron 3 nano (#13489) · 7e3ea813
  Parth Sareen authored Dec 15, 2025
```
---------
Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
```
  7e3ea813
- Adding tool definitions to DeepseekV3 renderer (#13491) · 7b95087b
  Grace authored Dec 15, 2025
  
  7b95087b
- fix: qwen2.5 vl rope (#13486) · 971d6259
  Michael Yang authored Dec 15, 2025
```
* qwen25vl: bump max pixels

* qwen25vl: mrope

fix qwen2.5vl window

* qwen25vl: vision rope
```
  971d6259
15 Dec, 2025 6 commits
- model: add olmo3 and olmo3.1 (#13415) · ffbe8e07
  Parth Sareen authored Dec 15, 2025
  
  ffbe8e07
- DeepseekV3 family renderer (#13180) · 2c639431
  Grace authored Dec 15, 2025
  
  2c639431
- fix: define GGML_VERSION variables for proper SOVERSION expansion (#13469) · aacd1cb3
  Nhan Nguyen authored Dec 15, 2025
```
The ggml/src/CMakeLists.txt uses GGML_VERSION_MAJOR for the shared
library SOVERSION property, but these variables were not defined when
building from ollama's CMakeLists.txt.

This caused libggml-base.so to be named with a literal "SOVERSION"
suffix (libggml-base.so.SOVERSION) instead of the actual version
number (libggml-base.so.0).

The fix adds the required GGML_VERSION_* variables before including
the ggml subdirectory.

Fixes #13436
```
  aacd1cb3
- renderers: add olmo3.1 and olmo3 fixes (#13447) · e3731fb1
  Parth Sareen authored Dec 15, 2025
  
  e3731fb1
- app/ui: handle unspecified bind addresses and wait for server in ollama proxy (#13159) · 8dbc9e7b
  Eva H authored Dec 15, 2025
  
  8dbc9e7b
- Revert "Enable Ollama engine by default" (#13481) · abe67acf
  Daniel Hiltgen authored Dec 15, 2025
```
This reverts commit 56f754f46b87749581f73ef3625314bb0e51bfed.
```
  abe67acf
13 Dec, 2025 2 commits
- model: default gemma 3 rope scale to 1.0, apply corrections based on layer counts (#13453) · 4ff8a691
  Jeffrey Morgan authored Dec 12, 2025
  
  4ff8a691
- model: fix global layer rope scale values for gemma 3 (#13452) · 1b308e1d
  Jeffrey Morgan authored Dec 12, 2025
  
  1b308e1d
12 Dec, 2025 10 commits

flash attn: add auto mode for llama engine (#13052) · bd6c1d6b

Daniel Hiltgen authored Dec 12, 2025

* flash attn: add auto mode for llama engine

If the user does not specify fa in the environment, use auto-mode.

* review comments

* ensure kv cache quantized types have FA explicitly enabled

additional review comments

bd6c1d6b
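A rough Go sketch of the "auto when unset" rule from this commit follows. Everything in it is an assumption for illustration: the `faMode` type, the `parseFlashAttn` helper, the environment variable name `OLLAMA_FLASH_ATTENTION`, and the choice to treat unparsable values as auto. The commit text only states that an unspecified setting selects auto mode.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// faMode is a hypothetical tri-state for the flash attention setting.
type faMode int

const (
	faAuto faMode = iota // the user expressed no preference
	faOn
	faOff
)

func (m faMode) String() string {
	return [...]string{"auto", "on", "off"}[m]
}

// parseFlashAttn maps an environment value to a mode. An unset or empty
// variable selects auto mode. Treating unparsable values as auto is an
// assumption of this sketch, not something the commit states.
func parseFlashAttn(value string, isSet bool) faMode {
	if !isSet || value == "" {
		return faAuto
	}
	on, err := strconv.ParseBool(value)
	if err != nil {
		return faAuto
	}
	if on {
		return faOn
	}
	return faOff
}

func main() {
	// The variable name is assumed for illustration.
	value, isSet := os.LookupEnv("OLLAMA_FLASH_ATTENTION")
	fmt.Println("flash attention mode:", parseFlashAttn(value, isSet))
}
```

Keeping "unset" distinct from "off" is what makes an auto mode possible: a plain boolean could not tell a user who disabled the feature from one who never configured it.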

model: force rope factor 1.0 for Gemma 3 (#13445) · 3af5d3b7

Jeffrey Morgan authored Dec 12, 2025

3af5d3b7

Enable Ollama engine by default (#13443) · 77308951

Daniel Hiltgen authored Dec 12, 2025

This changes the default behavior to use the Ollama engine for supported
models, while retaining the ability to disable the Ollama engine and
fall back to the Llama engine. Models in the OllamaEngineRequired list
will always run on the Ollama engine.

77308951
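The selection rule in this commit message can be sketched in Go as below. This is an illustration only: the function `useOllamaEngine`, its parameters, and the architecture names are hypothetical, and only the idea of an OllamaEngineRequired list comes from the commit text.

```go
package main

import "fmt"

// useOllamaEngine is a hypothetical decision helper. Architectures in the
// required set always use the Ollama engine. Otherwise the Ollama engine is
// the default for supported architectures unless the user disabled it, in
// which case the llama engine is used as the fallback.
func useOllamaEngine(arch string, supported, required map[string]bool, disabled bool) bool {
	if required[arch] {
		return true
	}
	if disabled {
		return false
	}
	return supported[arch]
}

func main() {
	// Placeholder architecture names, not real model families.
	supported := map[string]bool{"arch-a": true, "arch-b": true}
	required := map[string]bool{"arch-b": true}

	fmt.Println(useOllamaEngine("arch-a", supported, required, false)) // true: supported, default on
	fmt.Println(useOllamaEngine("arch-a", supported, required, true))  // false: user opted out
	fmt.Println(useOllamaEngine("arch-b", supported, required, true))  // true: required overrides opt-out
	fmt.Println(useOllamaEngine("arch-c", supported, required, false)) // false: not supported
}
```

Checking the required set first is what lets the opt-out stay a simple global switch without breaking models that have no llama engine path.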

tidy up lint warnings on windows (#13430) · de9ecfd0

Eva H authored Dec 12, 2025

de9ecfd0

fix: select and update models folder in settings (#13412) · 95fdd8d6

Eva H authored Dec 12, 2025

95fdd8d6

docs: add docs for v1/responses and rework openai compat section (#13416) · 9f782285

Devon Rifkin authored Dec 11, 2025



* docs: add docs for v1/responses and rework openai compat section

I reworked the examples to be separated by topic and to be fully
runnable (i.e., they now log output instead of just suggesting how a
call might be made).

We now use `<CodeGroup>`s so that each example has a dropdown on the
docs site for users to choose, which makes the examples a lot more
digestible (since you only see approx 1/3 of the code you used to).

I also added a new tool to extract code examples into files so that it's
easier to actually run them and check that they work.

## Example

```shell
go run docs/tools/extract-examples/main.go docs/api/openai-compatibility.mdx
```

Output:

```
Extracting code examples to: /var/folders/vq/wfm2g6k917d3ldzpjdxc8ph00000gn/T/mdx-examples-3271754368

  - 01_basic.py
  - 01_basic.js
  - 01_basic.sh
  - 02_responses.py
  - 02_responses.js
  - 02_responses.sh
  - 03_vision.py
  - 03_vision.js
  - 03_vision.sh

Extracted 9 file(s) to /var/folders/vq/wfm2g6k917d3ldzpjdxc8ph00000gn/T/mdx-examples-3271754368

To run examples:

  cd /var/folders/vq/wfm2g6k917d3ldzpjdxc8ph00000gn/T/mdx-examples-3271754368
  npm install   # for JS examples

then run individual files with `node file.js`, `python file.py`, `bash file.sh`
```

In the future we should consider actually running the examples in CI and
having some sort of acceptance test so we can automatically detect when
our examples break. So this is just a start in that direction.

* Update docs/api/openai-compatibility.mdx
Co-authored-by: Parth Sareen <parth.sareen@ollama.com>

* Update docs/api/openai-compatibility.mdx
Co-authored-by: Parth Sareen <parth.sareen@ollama.com>

---------
Co-authored-by: Parth Sareen <parth.sareen@ollama.com>

9f782285

openai: add tool call appending to previous assistant message (#13434) · 9b2035d1

Parth Sareen authored Dec 11, 2025

```
* openai: add tool call appending to previous asst message

* add tests for thinking appending
```

9b2035d1

docs: fix link to modelfile.mdx (#13220) · 93d45d7a

Alexander Gusak authored Dec 12, 2025

93d45d7a

Update README.md (#13373) · 709f8424

JJ authored Dec 11, 2025

Correct Markdown syntax for Swollama GitHub and DocC documentation links

709f8424

model: fix rotary embeddings for ministral 3 (#13432) · 2dfb7441

Jeffrey Morgan authored Dec 11, 2025

2dfb7441

11 Dec, 2025 2 commits

openai: add v1/responses support (#13351) · 1eb5e759

Devon Rifkin authored Dec 11, 2025

Only supporting the stateless part of the API.

Doc updates to come once this is shipped.

Closes: #9659

1eb5e759

embeddings: modified batch size (#13429) · 3475d915

nicole pardal authored Dec 11, 2025

This PR detects embedding models and sets batch_size = context_size so the full input fits in a single batch.
Previously, if the batch size was smaller than the input, tokens could be split across batches and cause a SIGTRAP crash.
This change ensures all tokens stay in one batch and prevents those crashes.

Fixes: #12938 #13054
Co-authored-by: Jesse Gross <jesse@ollama.com>

3475d915
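The effect of setting batch_size = context_size for embedding models can be sketched in Go. The helper names `batchSizeFor` and `numBatches` and the token counts are hypothetical; the sketch only shows why a batch as large as the context keeps the whole input in one batch.

```go
package main

import "fmt"

// batchSizeFor is a hypothetical helper: embedding models get a batch as
// large as the context window, other models keep the requested batch size.
func batchSizeFor(embedding bool, requestedBatch, contextSize int) int {
	if embedding {
		return contextSize
	}
	return requestedBatch
}

// numBatches returns how many batches are needed for the given number of
// tokens, using ceiling division.
func numBatches(tokens, batch int) int {
	return (tokens + batch - 1) / batch
}

func main() {
	tokens, contextSize, requestedBatch := 1300, 2048, 512 // illustrative values

	// Without the change: the input is split across several batches.
	fmt.Println(numBatches(tokens, batchSizeFor(false, requestedBatch, contextSize))) // 3

	// With the change: any input that fits the context fits a single batch.
	fmt.Println(numBatches(tokens, batchSizeFor(true, requestedBatch, contextSize))) // 1
}
```

Because an input cannot exceed the context size, a batch equal to the context size guarantees a single batch for every accepted input.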