Commits · ec8bf5e6c57a0a2a78c41cea4e9619b384f2ea39 · OpenDAS / ollama

15 Aug, 2025 1 commit
- server: add debug option for printing out prompt instead of calling model · 8de1da47
  Devon Rifkin authored Aug 15, 2025
  
  8de1da47
12 Aug, 2025 1 commit
- fix(openai): handle reasoning_effort (#11868) · d0cf6c82
  Michael Yang authored Aug 12, 2025
  
  d0cf6c82
05 Aug, 2025 2 commits

Devon Rifkin authored Aug 05, 2025

afaik gpt-oss is the first model that meaningfully transforms tool
function definitions in its template. We found that relatively common
definitions that include `anyOf` were not working because the template
was assuming that types were always defined via a `type` field.

anyOf allows for fully recursive types, so I exposed a
`toTypeScriptType()` function to handle this recursive logic in go and
keep the templates cleaner. The gpt-oss templates will need to be
updated to use this.

We should keep building out our function definition support to more
fully support the parts of json schema that make sense for this use
case, but in the meantime this will unblock some users (e.g., zed's
ollama integration w/ gpt-oss). Probably the most urgent is proper array
support

30f8a68c

gpt-oss (#11672) · fa7776fd

Michael Yang authored Aug 05, 2025



* bf16

* tests

* gpt-oss

* enable gptoss for engine

* rough estimate

* convert to mxfp4

* handle safetensors U8

* clamp glu/linear

* update tokenizer

* MXFP4 support

This implements the Open Compute Microscaling (MX) FP4 format
as a tensor type with backend implementations focusing
on mulmat and mulmatid on CPU, CUDA, and Metal.

* Unit tests for MXFP4 support

This exercises various operations and shapes on both CPU and GPU (if detected
on the system)

* cuda graph

* unit test adjustments

* cuda: optimize memory access

Read 4 bytes at a time (8 elements) when performing mul_mat_vec_mxfp4

* mac: fix crash on old macos versions

cblas_sgemm is only supported on v13.3 and up, however bf16 is
only supported on v14+ so we were falling back to ggml-blas and
crashing on bf16 tensors.  Checking for the function being null
seems to be the simplest way to condittionally avoid registering the
backend.

* server: Minimum context length for gptoss

This model requires a minimum context length of 8192 to function
effectively. Users can set higher values through all normal mechanisms
but lower values will be silently reset.

* ggml: Multiply by numParallel for gptoss sliding window

When computing the graph size estimate, the context size is already
multiplied by numParallel so estimates reflect that. However, since
sliding window models use a smaller, fixed context size, they need
to manually take numParallel into account.

* gpt-oss integration

includes harmony parser and thinking levels, etc.

* fix sync

* fix tests

* fix lint

---------
Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
Co-authored-by: Jesse Gross <jesse@ollama.com>
Co-authored-by: Devon Rifkin <drifkin@drifkin.net>

fa7776fd

16 Jul, 2025 1 commit

api: fix unreachable status err (#11423) · 92c2e8a5

Bruce MacDonald authored Jul 16, 2025

StatusError was unreachable, the client always checked for error messages in the response body first, and the server always includes error messages with HTTP error status codes.

92c2e8a5

08 Jul, 2025 1 commit

API/CLI context enhancements (#11331) · 34088dbc

Daniel Hiltgen authored Jul 08, 2025

* API: expose context size of loaded models

* CLI: add context UX

This adds a column in the ps output to show the models context size.

34088dbc

07 Jul, 2025 1 commit
- template: add tool result compatibility (#11294) · 1f91cb0c
  Parth Sareen authored Jul 07, 2025
  
  1f91cb0c
07 Jun, 2025 1 commit
- Revert "server: add model capabilities to the list endpoint (#10174)" (#11004) · 09d308d6
  Jeffrey Morgan authored Jun 06, 2025
```
This reverts commit 09430011.
```
  09d308d6
04 Jun, 2025 1 commit
- server: add model capabilities to the list endpoint (#10174) · 09430011
  JasonHonKL authored Jun 05, 2025
  
  09430011
29 May, 2025 1 commit

add thinking support to the api and cli (#10584) · 5f57b0ef

Devon Rifkin authored May 28, 2025

- Both `/api/generate` and `/api/chat` now accept a `"think"`
  option that allows specifying whether thinking mode should be on or
  not
- Templates get passed this new option so, e.g., qwen3's template can
  put `/think` or `/no_think` in the system prompt depending on the
  value of the setting
- Models' thinking support is inferred by inspecting model templates.
  The prefix and suffix the parser uses to identify thinking support is
  also automatically inferred from templates
- Thinking control & parsing is opt-in via the API to prevent breaking
  existing API consumers. If the `"think"` option is not specified, the
  behavior is unchanged from previous versions of ollama
- Add parsing for thinking blocks in both streaming/non-streaming mode
  in both `/generate` and `/chat`
- Update the CLI to make use of these changes. Users can pass `--think`
  or `--think=false` to control thinking, or during an interactive
  session they can use the commands `/set think` or `/set nothink`
- A `--hidethinking` option has also been added to the CLI. This makes
  it easy to use thinking in scripting scenarios like
  `ollama run qwen3 --think --hidethinking "my question here"` where you
  just want to see the answer but still want the benefits of thinking
  models

5f57b0ef

27 May, 2025 1 commit
- client: add request signing to the client (#10881) · aa25aff1
  Patrick Devine authored May 27, 2025
```
If OLLAMA_AUTH is set, sign each request w/ a timestamp and pass the signature in the token header
```
  aa25aff1
08 May, 2025 2 commits
- lint: enable usetesting, disable tenv (#10594) · 6e9a7a25
  Michael Yang authored May 08, 2025
  
  6e9a7a25
- api: remove unused sampling parameters (#10581) · fa9973cd
  Jeffrey Morgan authored May 08, 2025
  
  fa9973cd
07 May, 2025 1 commit
- api: remove unused RetrieveModelResponse type (#10603) · 392de840
  Jeffrey Morgan authored May 06, 2025
  
  392de840
05 May, 2025 1 commit

api: remove unused or unsupported api options (#10574) · 3b2d2c83

Jeffrey Morgan authored May 05, 2025

Some options listed in api/types.go are not supported in
newer models, or have been deprecated in the past. This is
the first of a series of PRs to clean up the API options

3b2d2c83

24 Apr, 2025 1 commit
- api: fix ImageData struct comment to expect raw image bytes (#10386) · 40b10eee
  Adrien Duermael authored Apr 23, 2025
  
  40b10eee
10 Apr, 2025 1 commit
- types: include the 'items' and '$defs' fields to properly handle "array" types (#10091) · ef65174d
  Tom Sheffler authored Apr 09, 2025
```
---------
Co-authored-by: Parth Sareen <parth.sareen@ollama.com>
```
  ef65174d
08 Apr, 2025 1 commit
- types: add any type and validation for ToolFunction enum (#10166) · 6747099d
  Parth Sareen authored Apr 08, 2025
  
  6747099d
07 Apr, 2025 1 commit
- types: allow tool function parameters with a single type or an array of types (#9434) · 2f723ac2
  Alex Rozgo authored Apr 07, 2025
  
  2f723ac2
02 Apr, 2025 1 commit

chore(all): replace instances of interface with any (#10067) · 9876c9fa

Bruce MacDonald authored Apr 02, 2025

Both interface{} and any (which is just an alias for interface{} introduced in Go 1.18) represent the empty interface that all types satisfy.

9876c9fa

01 Apr, 2025 1 commit

api: return model capabilities from the show endpoint (#10066) · e172f095

Bruce MacDonald authored Apr 01, 2025

With support for multimodal models becoming more varied and common it is important for clients to be able to easily see what capabilities a model has. Retuning these from the show endpoint will allow clients to easily see what a model can do.

e172f095

13 Mar, 2025 1 commit

add verbose mode to the show command (#9640) · 4bed7392

Patrick Devine authored Mar 13, 2025

Add metadata and tensor information to the show command to be able to
see more information about a model. This outputs the same data as
shown on the model details page on ollama.com

4bed7392

05 Mar, 2025 1 commit

server/internal/registry: take over pulls from server package (#9485) · e2252d0f

Blake Mizerany authored Mar 05, 2025

This commit replaces the old pull implementation in the server package
with the new, faster, more robust pull implementation in the registry
package.

The new endpoint, and now the remove endpoint too, are behind the
feature gate "client2" enabled only by setting the OLLAMA_EXPERIMENT
environment variable include "client2".

Currently, the progress indication is wired to perform the same as the
previous implementation to avoid making changes to the CLI, and because
the status reports happen at the start of the download, and the end of
the write to disk, the progress indication is not as smooth as it could
be. This is a known issue and will be addressed in a future change.

This implementation may be ~0.5-1.0% slower in rare cases, depending on
network and disk speed, but is generally MUCH faster and more robust
than the its predecessor in all other cases.

e2252d0f

27 Feb, 2025 1 commit
- docs: fix api examples link (#9360) · be2ac1ed
  Steven Hartland authored Feb 27, 2025
```
Fix the examples link in the go package documentation for the API.
```
  be2ac1ed
24 Feb, 2025 1 commit
- config: allow setting context length through env var (#8938) · 314573bf
  Parth Sareen authored Feb 24, 2025
```
* envconfig: allow setting context length through env var
```
  314573bf
20 Feb, 2025 1 commit

api: document client stream behavior with a test (#8996) · 14b5a9a1

Bruce MacDonald authored Feb 20, 2025

Added unit tests to verify error handling behavior in the Client.stream and Client.do methods.
Tests cover various error scenarios including:
- Error responses with status codes >= 400
- Error messages with successful status codes
- Empty error messages
- Successful responses

14b5a9a1

07 Feb, 2025 1 commit
- docs: improve syntax highlighting in code blocks (#8854) · b901a712
  Azis Alvriyanto authored Feb 08, 2025
  
  b901a712
13 Jan, 2025 1 commit
- examples: remove codified examples (#8267) · 84a23144
  Parth Sareen authored Jan 13, 2025
  
  84a23144
08 Jan, 2025 1 commit
- llama: update vendored code to commit 46e3556 (#8308) · 1deafd82
  Jeffrey Morgan authored Jan 08, 2025
  
  1deafd82
03 Jan, 2025 1 commit

api: remove unused create fields · 29a8975c

Bruce MacDonald authored Jan 03, 2025

These fields are deprecated, but specifying them will not do anything. Removing them as the other deprecated fields will still work, but these do not, so they dont match our existing pattern.

29a8975c

01 Jan, 2025 1 commit
- Update the /api/create endpoint to use JSON (#7935) · 86a622cb
  Patrick Devine authored Dec 31, 2024
```
Replaces `POST /api/create` to use JSON instead of a Modelfile.

This is a breaking change.
```
  86a622cb
11 Dec, 2024 1 commit
- llama: update vendored code to commit 40c6d79f (#7875) · 527cc978
  Jeffrey Morgan authored Dec 10, 2024
  
  527cc978
05 Dec, 2024 2 commits
- api: add generate endpoint for structured outputs (#7939) · c6c52627
  Parth Sareen authored Dec 04, 2024
  
  c6c52627
- api: structured outputs - chat endpoint (#7900) · 630e7dc6
  Parth Sareen authored Dec 04, 2024
```
Adds structured outputs to chat endpoint
---------
Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Hieu Nguyen <hieunguyen1053@outlook.com>
```
  630e7dc6
30 Nov, 2024 1 commit
- Enable index tracking for tools - openai api support (#7888) · 5f805118
  Parth Sareen authored Nov 29, 2024
  
  5f805118
12 Nov, 2024 1 commit
- api: fix typos in Go Doc comments (#7620) · d48c1c5a
  Evan authored Nov 11, 2024
  
  d48c1c5a
11 Nov, 2024 1 commit
- api: fix typo in python ClientFromEnvironment docs (#7604) · 76b2b723
  Evan authored Nov 10, 2024
  
  76b2b723
06 Nov, 2024 1 commit

runner.go: Remove unused arguments · a9094176

Jesse Gross authored Oct 30, 2024

Now that server.cpp is gone, we don't need to keep passing arguments
that were only ignored and only kept for compatibility.

a9094176

28 Aug, 2024 1 commit
- update deprecated warnings · 8e6da3cb
  Michael Yang authored Aug 27, 2024
  
  8e6da3cb
14 Aug, 2024 1 commit

Fix typo and improve readability (#5964) · 0a8d6ea8

longtao authored Aug 14, 2024



* Fix typo and improve readability

Summary:
* Rename updatAvailableMenuID to updateAvailableMenuID
* Replace unused cmd parameter with _ in RunServer function
* Fix typos in comments

(cherry picked from commit 5b8715f0b04773369e8eb1f9e6737995a0ab3ba7)

* Update api/client.go
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

0a8d6ea8