1. 22 Jul, 2025 1 commit
  2. 17 Jul, 2025 1 commit
  3. 16 Jul, 2025 1 commit
  4. 11 Jul, 2025 1 commit
  5. 08 Jul, 2025 2 commits
    • doc: add MacOS docs (#11334) · 66fb8575
      Daniel Hiltgen authored
      Also removes stale model directory instructions for Windows.
    • Reduce default parallelism to 1 (#11330) · 20c3266e
      Daniel Hiltgen authored
      The current scheduler algorithm of picking the parallelism based on available
      VRAM complicates the upcoming dynamic layer memory allocation algorithm.  This
      changes the default to 1, with the intent that, going forward, parallelism is
      explicit and will no longer be dynamically determined.  Removal of the dynamic
      logic will come in a follow-up.
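      A minimal sketch of opting back into concurrency once parallelism is
      explicit, assuming the existing OLLAMA_NUM_PARALLEL environment
      variable keeps its current meaning:

          # Default is now 1 concurrent request per loaded model.
          # Opt into higher parallelism explicitly:
          OLLAMA_NUM_PARALLEL=4 ollama serve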
  6. 07 Jul, 2025 2 commits
  7. 05 Jul, 2025 1 commit
  8. 23 Jun, 2025 1 commit
    • Re-remove cuda v11 (#10694) · 1c6669e6
      Daniel Hiltgen authored
      * Re-remove cuda v11
      
      Revert the revert - drop v11 support, requiring drivers newer than Feb 2023
      
      This reverts commit c6bcdc42.
      
      * Simplify layout
      
      With only one version of the GPU libraries, we can simplify things somewhat.  (Jetsons still require special handling.)
      
      * distinct sbsa variant for linux arm64
      
      This avoids accidentally trying to load the sbsa cuda libraries on
      a Jetson system, which results in crashes.
      
      * temporarily prevent ROCm+CUDA mixed loading
  9. 18 Jun, 2025 1 commit
  10. 07 Jun, 2025 2 commits
  11. 06 Jun, 2025 1 commit
  12. 04 Jun, 2025 1 commit
  13. 29 May, 2025 1 commit
    • add thinking support to the api and cli (#10584) · 5f57b0ef
      Devon Rifkin authored
      - Both `/api/generate` and `/api/chat` now accept a `"think"`
        option that specifies whether thinking mode should be enabled
        (see the request sketch after this list)
      - Templates get passed this new option so, e.g., qwen3's template can
        put `/think` or `/no_think` in the system prompt depending on the
        value of the setting
      - Models' thinking support is inferred by inspecting model templates.
        The prefix and suffix the parser uses to identify thinking support
        are also automatically inferred from templates
      - Thinking control & parsing is opt-in via the API to prevent breaking
        existing API consumers. If the `"think"` option is not specified, the
        behavior is unchanged from previous versions of ollama
      - Add parsing for thinking blocks in both streaming and non-streaming
        modes in both `/generate` and `/chat`
      - Update the CLI to make use of these changes. Users can pass `--think`
        or `--think=false` to control thinking, or during an interactive
        session they can use the commands `/set think` or `/set nothink`
      - A `--hidethinking` option has also been added to the CLI. This makes
        it easy to use thinking in scripting scenarios like
        `ollama run qwen3 --think --hidethinking "my question here"` where you
        just want to see the answer but still want the benefits of thinking
        models
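      A hedged sketch of the new option from a client's perspective: a
      non-streaming `/api/chat` request with `"think": true`. The `thinking`
      field on the response message is an assumption based on this commit's
      description of parsed thinking blocks, not a confirmed field name.

          package main

          import (
              "bytes"
              "encoding/json"
              "fmt"
              "net/http"
          )

          func main() {
              // Opt in to thinking; omitting "think" keeps the old behavior.
              body, _ := json.Marshal(map[string]any{
                  "model": "qwen3",
                  "messages": []map[string]string{
                      {"role": "user", "content": "my question here"},
                  },
                  "think":  true,
                  "stream": false,
              })
              resp, err := http.Post("http://localhost:11434/api/chat",
                  "application/json", bytes.NewReader(body))
              if err != nil {
                  panic(err)
              }
              defer resp.Body.Close()

              var out struct {
                  Message struct {
                      Thinking string `json:"thinking"` // parsed thinking block (assumed name)
                      Content  string `json:"content"`  // the final answer
                  } `json:"message"`
              }
              if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
                  panic(err)
              }
              fmt.Println(out.Message.Content)
          }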
  14. 24 May, 2025 1 commit
  15. 13 May, 2025 1 commit
  16. 12 May, 2025 1 commit
    • Follow up to #10363 (#10647) · 9d6df908
      Daniel Hiltgen authored
      The quantization PR didn't block all unsupported file types,
      which this PR fixes.  It also updates the API docs to reflect
      the now-reduced set of supported types.
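      For reference, quantizing at create time looks like the following;
      q4_K_M is an assumption about what remains in the supported set:

          # Quantize a model while creating it (q4_K_M assumed supported):
          ollama create mymodel -f Modelfile --quantize q4_K_M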
  17. 08 May, 2025 1 commit
  18. 07 May, 2025 1 commit
    • remove cuda v11 (#10569) · fa393554
      Daniel Hiltgen authored
      This reduces the size of our Windows installer payloads by ~256M by dropping
      support for nvidia drivers older than Feb 2023.  Hardware support is unchanged.
      
      Default Linux bundle sizes are reduced by ~600M to ~1G.
  19. 05 May, 2025 1 commit
  20. 29 Apr, 2025 1 commit
  21. 28 Apr, 2025 1 commit
  22. 22 Apr, 2025 1 commit
    • increase default context length to 4096 (#10364) · 424f6486
      Devon Rifkin authored
      * increase default context length to 4096
      
      We lower the default numParallel from 4 to 2 and use these "savings" to
      double the default context length from 2048 to 4096.
      
      We're memory-neutral in cases where we previously would've used
      numParallel == 4, but we add the following mitigation to handle some
      cases where we would have previously fallen back to 1x2048 due to low
      VRAM: we decide between 2048 and 4096 using a runtime check (sketched
      at the end of this entry), choosing 2048 if we're on a single-GPU
      system with total VRAM of <= 4 GB. We purposefully don't check the
      available VRAM because we don't want the context window size to change
      unexpectedly based on the available VRAM.
      
      We plan on making the default even larger, but this is a relatively
      low-risk change we can make to quickly double it.
      
      * fix tests
      
      add an explicit context length so they don't get truncated. The code
      that resolves -1 (the signal for doing a runtime check) isn't running
      as part of these tests.
      
      * tweak small gpu message
      
      * clarify context length default
      
      also make it actually show up in `ollama serve --help`
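      A hedged sketch of the runtime check described above; the identifiers
      are illustrative assumptions, not Ollama's actual ones:

          package main

          import "fmt"

          // pickDefaultContextLength picks the default based on *total*
          // (not available) VRAM, so the default never shifts with load.
          func pickDefaultContextLength(gpuCount int, totalVRAMGB float64) int {
              if gpuCount == 1 && totalVRAMGB <= 4 {
                  return 2048
              }
              return 4096
          }

          func main() {
              fmt.Println(pickDefaultContextLength(1, 4))  // 2048: small single-GPU box
              fmt.Println(pickDefaultContextLength(2, 48)) // 4096 otherwise
          }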
  23. 15 Apr, 2025 2 commits
  24. 08 Apr, 2025 1 commit
  25. 01 Apr, 2025 1 commit
  26. 27 Mar, 2025 1 commit
  27. 25 Mar, 2025 1 commit
  28. 21 Mar, 2025 2 commits
  29. 13 Mar, 2025 1 commit
  30. 10 Mar, 2025 1 commit
  31. 07 Mar, 2025 1 commit
    • Better WantedBy declaration · 25248f4b
      Martin Häcker authored
      The problem with default.target is that it always points to the target that is currently started. So if you boot into single-user mode or rescue mode, Ollama still tries to start.
      
      I noticed this because Ollama tried (and failed) to start all the time during a system update, where it definitely is not wanted.
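      A minimal sketch of the resulting unit change, assuming the fix moves
      the install target from default.target to multi-user.target:

          # ollama.service (excerpt)
          [Install]
          # Before: tied to whatever target the system boots into,
          # including rescue and single-user mode:
          #WantedBy=default.target
          # After: only started as part of a normal multi-user boot:
          WantedBy=multi-user.target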
  32. 05 Mar, 2025 1 commit
  33. 04 Mar, 2025 1 commit
    • server/.../backoff,syncs: don't break builds without synctest (#9484) · 55ab9f37
      Blake Mizerany authored
      Previously, developers without the synctest experiment enabled would see
      build failures when running tests in some server/internal/internal
      packages that use the synctest package. This change makes the transition
      to the package less painful by guarding use of synctest with build
      tags.
      
      synctest is enabled in CI. If a new change will break a synctest
      package, it will break in CI, even if it does not break locally.
      
      The developer docs have been updated to help with any confusion about
      why package tests pass locally but fail in CI.
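      A hedged sketch of the guard, assuming the goexperiment.synctest build
      tag that the Go toolchain sets when GOEXPERIMENT=synctest is enabled
      (the package and test name here are illustrative):

          //go:build goexperiment.synctest

          package backoff

          import (
              "testing"
              "testing/synctest"
          )

          // This file only compiles when the synctest experiment is on, so
          // developers without GOEXPERIMENT=synctest still get clean builds.
          func TestBackoffVirtualTime(t *testing.T) {
              synctest.Run(func() {
                  // ... test body running against synctest's fake clock ...
              })
          }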
  34. 27 Feb, 2025 1 commit
    • Windows ARM build (#9120) · 688925ac
      Daniel Hiltgen authored
      * Windows ARM build
      
      Skip cmake, and note in the developer docs that it's unused.
      
      * Win: only check for ninja when we need it
      
      On Windows ARM, the CIM lookup fails, but we don't need ninja anyway.
  35. 25 Feb, 2025 1 commit