Commits · 2cb0a580f34377c8b014ce20d8fdb370cfc2a12e · OpenDAS / ollama

22 Aug, 2025 1 commit

thinking: fix double emit when no opening tag · 2cb0a580

Devon Rifkin authored Aug 21, 2025

The thinking parser will automatically transition to being a
pass-through if non-whitespace is seen before an opening tag. However,
we weren't clearing the buffer after the first non-whitespace input, so
in practice the first token would be emitted twice.

Added a test that demonstrated this, and then fixed the bug.
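
The failure mode described above can be illustrated with a minimal Go sketch. This is not the code from the commit: `sketchParser`, its fields, and `Add` are hypothetical names, and the sketch assumes the parser appends each chunk to an internal buffer and then drains it. Only the path where no opening tag arrives is modeled.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// sketchParser is a hypothetical, simplified stand-in for the thinking
// parser. It models only the path where no opening tag ever arrives.
type sketchParser struct {
	openingTag  string
	buf         strings.Builder
	passThrough bool
}

// Add buffers one streamed chunk and returns whatever should be emitted now.
func (p *sketchParser) Add(chunk string) string {
	p.buf.WriteString(chunk)
	if !p.passThrough {
		pending := strings.TrimLeftFunc(p.buf.String(), unicode.IsSpace)
		// Only whitespace, or a possible partial opening tag: keep buffering.
		if pending == "" || strings.HasPrefix(p.openingTag, pending) {
			return ""
		}
		// A complete opening tag: thinking content is not modeled here.
		if strings.HasPrefix(pending, p.openingTag) {
			return ""
		}
		// Non-whitespace arrived before any opening tag: become a pass-through.
		p.passThrough = true
	}
	out := p.buf.String()
	// Clearing the buffer is the step the commit message says was missing.
	// Without it, the next call would return the first token a second time.
	p.buf.Reset()
	return out
}

func main() {
	p := &sketchParser{openingTag: "<think>"}
	fmt.Printf("%q\n", p.Add("Hello"))
	fmt.Printf("%q\n", p.Add(" world"))
}
```

With the `Reset` call in place, the two calls return `"Hello"` and `" world"`. Removing it makes the second call return `"Hello world"`, which is the double emit in miniature.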

06 Jun, 2025 1 commit

move thinking logic into its own package (#10990) · a3b6886b

Devon Rifkin authored Jun 06, 2025

05 Jun, 2025 1 commit

export ThinkingParser · 0683efa6

Devon Rifkin authored Jun 05, 2025

29 May, 2025 1 commit

add thinking support to the api and cli (#10584) · 5f57b0ef

Devon Rifkin authored May 28, 2025

- Both `/api/generate` and `/api/chat` now accept a `"think"`
  option that allows specifying whether thinking mode should be on or
  not
- Templates get passed this new option so, e.g., qwen3's template can
  put `/think` or `/no_think` in the system prompt depending on the
  value of the setting
- Models' thinking support is inferred by inspecting model templates.
  The prefix and suffix the parser uses to identify thinking support are
  also automatically inferred from templates
- Thinking control & parsing is opt-in via the API to prevent breaking
  existing API consumers. If the `"think"` option is not specified, the
  behavior is unchanged from previous versions of ollama
- Add parsing for thinking blocks in both streaming/non-streaming mode
  in both `/generate` and `/chat`
- Update the CLI to make use of these changes. Users can pass `--think`
  or `--think=false` to control thinking, or during an interactive
  session they can use the commands `/se...
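
As a rough illustration of the opt-in behavior in the list above, the following Go sketch builds a JSON body for `/api/chat`. It assumes `"think"` is a top-level boolean field next to `model` and `messages`. The type names and `buildChatBody` are hypothetical, not taken from the ollama source. A nil pointer leaves the field out entirely, which corresponds to the "not specified" case where behavior is unchanged.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// message and chatRequest are hypothetical types for this sketch only.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string    `json:"model"`
	Messages []message `json:"messages"`
	// A pointer distinguishes "not specified" (nil, field omitted)
	// from an explicit true or false.
	Think *bool `json:"think,omitempty"`
}

// buildChatBody returns the JSON body for a single-turn chat request.
func buildChatBody(model, prompt string, think *bool) (string, error) {
	req := chatRequest{
		Model:    model,
		Messages: []message{{Role: "user", Content: prompt}},
		Think:    think,
	}
	b, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	on := true
	for _, think := range []*bool{nil, &on} {
		body, err := buildChatBody("qwen3", "hi", think)
		if err != nil {
			panic(err)
		}
		fmt.Println(body)
	}
}
```

The pointer field is a design choice for the sketch: a plain `bool` could not tell an omitted option apart from `false`, and the list above treats those two cases differently.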
