- 07 Jul, 2025 1 commit
-
-
Parth Sareen authored
-
- 29 May, 2025 1 commit
-
-
Devon Rifkin authored
- Both `/api/generate` and `/api/chat` now accept a `"think"` option that specifies whether thinking mode should be on or off
- Templates get passed this new option so, e.g., qwen3's template can put `/think` or `/no_think` in the system prompt depending on the value of the setting
- Models' thinking support is inferred by inspecting model templates. The prefix and suffix the parser uses to identify thinking support are also automatically inferred from templates
- Thinking control & parsing is opt-in via the API to prevent breaking existing API consumers. If the `"think"` option is not specified, the behavior is unchanged from previous versions of ollama
- Add parsing for thinking blocks in both streaming/non-streaming mode in both `/generate` and `/chat`
- Update the CLI to make use of these changes. Users can pass `--think` or `--think=false` to control thinking, or during an interactive session they can use the commands `/set think` or `/set nothink`
- A `--hidethinking` option has also been added to the CLI. This makes it easy to use thinking in scripting scenarios like `ollama run qwen3 --think --hidethinking "my question here"` where you just want to see the answer but still want the benefits of thinking models
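The opt-in behaviour described in this commit message can be sketched from the client side. This is a minimal illustration, not the client types used in the repository: the `generateRequest` struct and the `buildGenerateBody` helper are hypothetical names, and only the `"think"` key and the `/api/generate` endpoint come from the commit message. A nil pointer stands for "option not specified", so the key is left out of the JSON entirely and the server behaves as in previous versions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// generateRequest is a hypothetical, reduced request body for /api/generate.
// Think is a pointer so that "not specified" (nil) can be told apart from
// an explicit false.
type generateRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
	Stream bool   `json:"stream"`
	Think  *bool  `json:"think,omitempty"`
}

// buildGenerateBody (hypothetical helper) encodes the request body.
// Passing think == nil omits the "think" key from the output.
func buildGenerateBody(model, prompt string, think *bool) ([]byte, error) {
	return json.Marshal(generateRequest{Model: model, Prompt: prompt, Think: think})
}

func main() {
	on, off := true, false

	// Each case only builds the JSON body; sending it is left to the caller.
	for _, think := range []*bool{nil, &on, &off} {
		body, err := buildGenerateBody("qwen3", "my question here", think)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(body))
	}
}
```

The pointer field is one way to model a tri-state option in Go; a plain `bool` with `omitempty` would silently drop an explicit `false`, which would make `--think=false` style requests indistinguishable from unspecified ones.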
-
- 20 Mar, 2025 1 commit
-
-
Patrick Devine authored
This change allows the gemma3 template to be autodetected during `ollama create`.
-
- 14 Feb, 2025 1 commit
-
-
Michael Yang authored
feat: add new Ollama engine using ggml through cgo

This change introduces a new way to run pretrained models. It introduces 3 high level interfaces and a bunch of smaller helper interfaces to facilitate this.

- `model.Model` defines the interface for a model architecture. Models such as `llama` and `mllama`, which are provided as examples, can implement the model's forward propagation in the `Forward` method. This method will be called to generate completions. This interface can be found in `model/model.go`
- `ml.Backend` defines the interface for a backend tensor library, in this case `ggml`. Among other things, a Backend is responsible for loading a pretrained model into hardware (GPU, CPU, etc) and providing an interface for Models to access loaded tensors. This interface can be found in `ml/backend.go`
- `ml.Tensor` defines the interface for a tensor and tensor operations

This is the first implementation of the new engine. Follow up PRs will implement more features:

- non-greedy sampling (#8410)
- integration with Ollama and KV caching (#8301)
- more model support (#9080) with more coming soon

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
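The division of labour between the three interfaces named in this commit message can be illustrated with a toy version. Everything below is an assumption made for illustration: only the names `Model`, `Backend`, `Tensor` and `Forward` come from the commit message, while the method sets, signatures and the `sliceTensor`, `mapBackend` and `biasModel` types are hypothetical and far simpler than whatever `model/model.go` and `ml/backend.go` actually define.

```go
package main

import "fmt"

// Tensor is a hypothetical, much-reduced stand-in for ml.Tensor:
// a value plus the operations that can be applied to it.
type Tensor interface {
	Add(other Tensor) Tensor
	Values() []float32
}

// Backend is a hypothetical stand-in for ml.Backend: it owns the loaded
// tensors and lets models look them up by name.
type Backend interface {
	Get(name string) (Tensor, bool)
}

// Model is a hypothetical stand-in for model.Model. Forward is the method
// name used in the commit message; this signature is assumed.
type Model interface {
	Forward(b Backend, input Tensor) (Tensor, error)
}

// sliceTensor is a toy CPU tensor backed by a float32 slice.
type sliceTensor []float32

func (t sliceTensor) Values() []float32 { return t }

// Add returns the element-wise sum. It assumes other has at least as many
// elements as t; a real tensor library would validate shapes.
func (t sliceTensor) Add(other Tensor) Tensor {
	o := other.Values()
	out := make(sliceTensor, len(t))
	for i := range t {
		out[i] = t[i] + o[i]
	}
	return out
}

// mapBackend is a toy backend that keeps "loaded" tensors in a map.
type mapBackend map[string]Tensor

func (m mapBackend) Get(name string) (Tensor, bool) {
	t, ok := m[name]
	return t, ok
}

// biasModel is a toy architecture whose forward pass adds a bias tensor
// fetched from the backend to the input.
type biasModel struct{}

func (biasModel) Forward(b Backend, input Tensor) (Tensor, error) {
	bias, ok := b.Get("bias")
	if !ok {
		return nil, fmt.Errorf("tensor %q not loaded", "bias")
	}
	return input.Add(bias), nil
}

func main() {
	var m Model = biasModel{}
	backend := mapBackend{"bias": sliceTensor{1, 2}}

	out, err := m.Forward(backend, sliceTensor{10, 20})
	if err != nil {
		panic(err)
	}
	fmt.Println(out.Values()) // prints [11 22]
}
```

The point of the split is that `biasModel` never touches storage or hardware directly: it only asks the backend for tensors and composes tensor operations, so a different backend could be swapped in without changing the model code.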
-
- 16 Jan, 2025 1 commit
-
-
Josh authored
Co-authored-by: Patrick Devine <patrick@infrahq.com>
-
- 18 Oct, 2024 1 commit
-
-
Patrick Devine authored
Co-authored-by: jmorganca <jmorganca@gmail.com>
Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Jesse Gross <jesse@ollama.com>
-
- 28 Aug, 2024 1 commit
-
-
Patrick Devine authored
-
- 27 Aug, 2024 1 commit
-
-
Michael Yang authored
-
- 02 Aug, 2024 1 commit
-
-
Michael Yang authored
-
- 20 Jul, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 17 Jul, 2024 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
- 16 Jul, 2024 2 commits
-
-
Michael Yang authored
This change is triggered by the presence of "suffix", which is particularly useful for code completion tasks.
-
Michael Yang authored
-
- 15 Jul, 2024 1 commit
-
-
Michael Yang authored
-
- 13 Jul, 2024 1 commit
-
-
Michael Yang authored
* fix system prompt
* execute template when hitting previous roles
* fix tests

Co-authored-by: jmorganca <jmorganca@gmail.com>
-
- 12 Jul, 2024 3 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
- 11 Jul, 2024 4 commits
-
-
Michael Yang authored
This reverts commit 19753c18 for compatibility. Messages will be added at a later date.
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
- 05 Jul, 2024 4 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
- 01 Jul, 2024 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
-