1. 06 Jan, 2026 1 commit
  2. 26 Sep, 2025 1 commit
      • bugfix: restore the current runOptions if loading fails in the CLI (#12402) · b04e46da
      Patrick Devine authored
      There are two bugs when using `/load <model>` for a model that doesn't exist, namely:
        1. it will not restore the current model settings if the current model is a thinking model; and
  2. it will crash if the current model is a non-thinking model
      
This fix saves the current runOptions and restores them if the model load fails. It also fixes the crash for non-thinking models.
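
The save-then-restore pattern described above can be sketched as follows. This is an illustrative Python sketch, not the actual Go implementation; `load_model` and `fetch_info` are hypothetical names standing in for the CLI's load path.

```python
import copy

def load_model(opts, name, fetch_info):
    """Attempt to switch models; restore the previous options on failure."""
    saved = copy.deepcopy(opts)          # snapshot before any mutation
    try:
        opts["model"] = name
        opts["info"] = fetch_info(name)  # raises if the model doesn't exist
        return opts
    except Exception:
        return saved                     # load failed: restore prior settings
```

The key point is that the snapshot is taken before anything is mutated, so a failed `/load` leaves the session exactly as it was.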
  3. 05 Aug, 2025 1 commit
      • gpt-oss (#11672) · fa7776fd
      Michael Yang authored
      
      
      * bf16
      
      * tests
      
      * gpt-oss
      
      * enable gptoss for engine
      
      * rough estimate
      
      * convert to mxfp4
      
      * handle safetensors U8
      
      * clamp glu/linear
      
      * update tokenizer
      
      * MXFP4 support
      
      This implements the Open Compute Microscaling (MX) FP4 format
      as a tensor type with backend implementations focusing
      on mulmat and mulmatid on CPU, CUDA, and Metal.
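
As a rough illustration of the format itself: an MX block pairs a shared E8M0 scale byte (a pure power of two, 2^(e − 127)) with packed 4-bit FP4 (E2M1) codes. The sketch below is a hypothetical Python decoder, not the CPU/CUDA/Metal kernels in the commit; the nibble packing order is an assumption, and E8M0 special values (e.g. NaN encoding) are ignored.

```python
# FP4 (E2M1) code points per the OCP Microscaling spec:
# 1 sign bit, 2 exponent bits, 1 mantissa bit.
FP4_E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0,
            -0.0, -0.5, -1.0, -1.5, -2.0, -3.0, -4.0, -6.0]

def decode_mxfp4_block(scale_byte, packed):
    """Decode one MX block: a shared E8M0 scale plus packed 4-bit codes.

    Assumes the low nibble of each byte comes first in element order.
    """
    scale = 2.0 ** (scale_byte - 127)
    out = []
    for byte in packed:                      # each byte carries two elements
        out.append(FP4_E2M1[byte & 0x0F] * scale)
        out.append(FP4_E2M1[byte >> 4] * scale)
    return out
```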
      
      * Unit tests for MXFP4 support
      
This exercises various operations and shapes on both CPU and GPU (if detected
on the system).
      
      * cuda graph
      
      * unit test adjustments
      
      * cuda: optimize memory access
      
      Read 4 bytes at a time (8 elements) when performing mul_mat_vec_mxfp4
      
      * mac: fix crash on old macos versions
      
cblas_sgemm is only supported on v13.3 and up, but bf16 is
only supported on v14+, so we were falling back to ggml-blas and
crashing on bf16 tensors. Checking whether the function is null
seems to be the simplest way to conditionally avoid registering the
backend.
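
The same "probe the symbol before trusting the library" idea can be sketched generically. This is a hypothetical Python illustration using ctypes, not the actual Go/cgo check; `symbol_available` is an invented helper name.

```python
import ctypes
import ctypes.util

def symbol_available(libname, symbol):
    """True only if the shared library exists and exports the symbol.

    A backend would be registered only when this returns True, rather
    than assuming every function in the header is actually linkable.
    """
    try:
        path = ctypes.util.find_library(libname)
        if path is None:
            return False
        getattr(ctypes.CDLL(path), symbol)  # AttributeError if missing
        return True
    except (OSError, AttributeError):
        return False
```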
      
      * server: Minimum context length for gptoss
      
      This model requires a minimum context length of 8192 to function
      effectively. Users can set higher values through all normal mechanisms
      but lower values will be silently reset.
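
The clamp described above amounts to a one-liner. A minimal sketch, assuming a hypothetical `effective_context` helper and the 8192 minimum stated in the commit:

```python
MIN_GPTOSS_CONTEXT = 8192

def effective_context(requested):
    """Silently raise too-small context lengths to the model's minimum;
    larger values pass through unchanged."""
    return max(requested, MIN_GPTOSS_CONTEXT)
```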
      
      * ggml: Multiply by numParallel for gptoss sliding window
      
      When computing the graph size estimate, the context size is already
      multiplied by numParallel so estimates reflect that. However, since
      sliding window models use a smaller, fixed context size, they need
      to manually take numParallel into account.
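
The asymmetry above can be made concrete with a small sketch. This is a hypothetical simplification of the estimate's context term, not the real ggml code; the function name and signature are invented.

```python
def context_for_estimate(num_ctx, num_parallel, sliding_window=None):
    """Context term used in a graph-size estimate.

    num_ctx is assumed to have been multiplied by num_parallel already;
    a fixed sliding window has not, so it is scaled here before taking
    the smaller of the two.
    """
    if sliding_window is None:
        return num_ctx
    return min(num_ctx, sliding_window * num_parallel)
```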
      
      * gpt-oss integration
      
      includes harmony parser and thinking levels, etc.
      
      * fix sync
      
      * fix tests
      
      * fix lint
      
      ---------
Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
Co-authored-by: Jesse Gross <jesse@ollama.com>
Co-authored-by: Devon Rifkin <drifkin@drifkin.net>
  4. 22 Jul, 2025 1 commit
  5. 29 May, 2025 1 commit
      • add thinking support to the api and cli (#10584) · 5f57b0ef
      Devon Rifkin authored
      - Both `/api/generate` and `/api/chat` now accept a `"think"`
        option that allows specifying whether thinking mode should be on or
        not
      - Templates get passed this new option so, e.g., qwen3's template can
        put `/think` or `/no_think` in the system prompt depending on the
        value of the setting
      - Models' thinking support is inferred by inspecting model templates.
        The prefix and suffix the parser uses to identify thinking support is
        also automatically inferred from templates
      - Thinking control & parsing is opt-in via the API to prevent breaking
        existing API consumers. If the `"think"` option is not specified, the
        behavior is unchanged from previous versions of ollama
      - Add parsing for thinking blocks in both streaming/non-streaming mode
        in both `/generate` and `/chat`
      - Update the CLI to make use of these changes. Users can pass `--think`
        or `--think=false` to control thinking, or during an interactive
        session they can use the commands `/set think` or `/set nothink`
      - A `--hidethinking` option has also been added to the CLI. This makes
        it easy to use thinking in scripting scenarios like
        `ollama run qwen3 --think --hidethinking "my question here"` where you
        just want to see the answer but still want the benefits of thinking
        models
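
The opt-in behavior in the first and fourth bullets can be sketched as payload construction: `"think"` is only included when the caller sets it, so older clients see unchanged behavior. A hypothetical Python sketch (the helper name is invented; the `/api/chat` field names match the commit message):

```python
import json

def chat_request(model, prompt, think=None):
    """Build an /api/chat request body.

    "think" is omitted entirely unless the caller specifies it, which
    keeps behavior identical for existing API consumers.
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if think is not None:
        body["think"] = think
    return json.dumps(body)
```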
  6. 13 May, 2025 1 commit
  7. 10 May, 2025 1 commit
  8. 20 Apr, 2025 1 commit
  9. 15 Mar, 2025 1 commit
      • fix: correctly save in interactive mode (#9788) · 2c8b4846
      Patrick Devine authored
This fixes the case where a FROM line in a previous modelfile points to a
file which may or may not be present in a different ollama instance. We
shouldn't rely on the filename; instead, check whether the FROM line is a
valid model name and point to that.
  10. 13 Mar, 2025 1 commit
  11. 12 Mar, 2025 1 commit
  12. 01 Jan, 2025 1 commit
  13. 22 Dec, 2024 1 commit
  14. 26 Nov, 2024 1 commit
  15. 21 Nov, 2024 1 commit
  16. 18 Oct, 2024 1 commit
  17. 10 Oct, 2024 1 commit
      • cli: Send all images in conversation history · 7fe39025
      Jesse Gross authored
Currently the CLI only sends images from the most recent
image-containing message. This prevents things like sending
one message with an image, then a follow-up message with a
second image, and asking for a comparison based on information
not present in any text that was output.
      
It's possible that some models have a problem with this, but the
CLI is not the right place to restrict it, since any adjustments are
model-specific and should affect all clients.
      
      Both llava:34b and minicpm-v do reasonable things with multiple
      images in the history.
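
The change amounts to flattening images across the whole history instead of taking only the last image-bearing message. A hypothetical Python sketch (the helper name and message shape are invented for illustration):

```python
def collect_images(messages):
    """Gather the images from every message in the history, in order,
    rather than only from the most recent image-bearing message."""
    return [img for msg in messages for img in msg.get("images", [])]
```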
  18. 11 Sep, 2024 2 commits
  19. 02 Aug, 2024 1 commit
  20. 27 Jul, 2024 1 commit
  21. 26 Jul, 2024 2 commits
  22. 22 Jul, 2024 1 commit
  23. 14 Jul, 2024 1 commit
  24. 28 Jun, 2024 1 commit
  25. 25 Jun, 2024 1 commit
      • cmd: defer stating model info until necessary (#5248) · 2aa91a93
      Blake Mizerany authored
This commit changes the 'ollama run' command to defer fetching model
information until it actually needs it, that is, when in interactive mode.

It also removes one case where the model information was fetched in
duplicate: once just before calling generateInteractive and then again,
first thing, inside generateInteractive.
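
The deferral is a standard lazy-fetch-with-cache pattern. As an illustrative Python sketch (the class name is invented; `fetch` stands in for the model-info request):

```python
class LazyModelInfo:
    """Defer the model-info request until first use, and cache the result
    so it is never fetched twice."""

    def __init__(self, fetch):
        self._fetch = fetch   # e.g. a closure over the info API call
        self._info = None

    def get(self):
        if self._info is None:
            self._info = self._fetch()
        return self._info
```

Non-interactive runs never call `get()`, so they skip the round-trip entirely, which is where the timing improvement below comes from.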
      
      This positively impacts the performance of the command:
      
          ; time ./before run llama3 'hi'
          Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?
      
          ./before run llama3 'hi'  0.02s user 0.01s system 2% cpu 1.168 total
          ; time ./before run llama3 'hi'
          Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?
      
          ./before run llama3 'hi'  0.02s user 0.01s system 2% cpu 1.220 total
          ; time ./before run llama3 'hi'
          Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?
      
          ./before run llama3 'hi'  0.02s user 0.01s system 2% cpu 1.217 total
          ; time ./after run llama3 'hi'
          Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?
      
          ./after run llama3 'hi'  0.02s user 0.01s system 4% cpu 0.652 total
          ; time ./after run llama3 'hi'
          Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?
      
          ./after run llama3 'hi'  0.01s user 0.01s system 5% cpu 0.498 total
          ; time ./after run llama3 'hi'
          Hi! It's nice to meet you. Is there something I can help you with or would you like to chat?
      
          ./after run llama3 'hi'  0.01s user 0.01s system 3% cpu 0.479 total
          ; time ./after run llama3 'hi'
          Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?
      
          ./after run llama3 'hi'  0.02s user 0.01s system 5% cpu 0.507 total
          ; time ./after run llama3 'hi'
          Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?
      
          ./after run llama3 'hi'  0.02s user 0.01s system 5% cpu 0.507 total
  26. 04 Jun, 2024 1 commit
  27. 24 May, 2024 1 commit
  28. 21 May, 2024 1 commit
  29. 18 May, 2024 1 commit
  30. 14 May, 2024 3 commits
  31. 07 May, 2024 1 commit
  32. 01 May, 2024 1 commit
  33. 23 Apr, 2024 1 commit
  34. 22 Apr, 2024 1 commit
  35. 29 Mar, 2024 1 commit
  36. 26 Mar, 2024 1 commit