Commits · 33ee7168ba1e16c813b52dc2c9417efa1e2e9f20 · OpenDAS / ollama

09 Jan, 2026 1 commit

Add experimental MLX backend and engine with imagegen support (#13648) · 33ee7168

Daniel Hiltgen authored Jan 08, 2026



* WIP - MLX backend with gemma3

* MLX: add cmake and go tag build toggles

To build the new MLX backend code:
  cmake --preset MLX
  cmake --build --preset MLX --parallel
  cmake --install build --component MLX
  go build -tags mlx .

Note: the main.go entrypoint for the MLX engine will change in a follow-up commit.

* add experimental image generation runtime

* add experimental image generation runtime

* MLX: wire up cuda build for linux

* MLX: get dependencies correct and dedup

This is still too large for a unified GitHub artifact, but is now "correct" for the mlx_cuda_v13
directory.

* fix relative link bug in dedup

* Add darwin build and readme

* add go build tag for mlx dependent code and wire up build_darwin.sh

* lint cleanup

* macos: build mlx for x86

This will be CPU only.

* cuda build instructions and fix drift from mlx bump

* stale comment

* Delete agent helper doc

* Clean up readme.md

* Revise README for tokenizer clarity and details

Updated README to clarify tokenizer functionality and removed correctness section.

---------
Co-authored-by: jmorganca <jmorganca@gmail.com>

33ee7168

16 Dec, 2025 1 commit

types: ConfigV2 and RootFS (#13504) · 45c47393

Bruce MacDonald authored Dec 16, 2025

Moved the ConfigV2 and RootFS types from server/images.go to a new types/model/config.go file under the model package. Updated all references to use model.ConfigV2 and model.RootFS. This allows the types to be used in other projects without compiling the C code in the llama package.

45c47393

27 Oct, 2025 1 commit

create: inherit FROM model's renderer/parser · 1bdd8169

Devon Rifkin authored Oct 27, 2025

On main, the `RENDERER` and `PARSER` fields from the `Modelfile` don't
get propagated to a new model created with a `req.From` parameter. This
is easily triggered via `ollama run qwen3-coder`, then running some save
command like `/save qwen3-coder-custom`.

Added a regression test for this. The create path now opens the config
for the "from" model and uses its renderer/parser as the default for the
new model. This fixes both CLI and API-based creates.

Fixes: https://github.com/ollama/ollama/issues/12792

1bdd8169
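
The defaulting behaviour described in this commit can be sketched as follows. This is a minimal illustration, not the code from the commit: the `modelConfig` type and the `inheritDefaults` helper are hypothetical names, and the sketch assumes an empty string means "not set in the Modelfile".

```go
package main

import "fmt"

// modelConfig is a hypothetical, reduced stand-in for a model's config.
// Only the two fields named in the commit message are modelled.
type modelConfig struct {
	Renderer string
	Parser   string
}

// inheritDefaults returns child with any unset renderer/parser filled in
// from the "from" model's config. Values set explicitly on child win.
func inheritDefaults(child, from modelConfig) modelConfig {
	if child.Renderer == "" {
		child.Renderer = from.Renderer
	}
	if child.Parser == "" {
		child.Parser = from.Parser
	}
	return child
}

func main() {
	// Placeholder values; real renderer/parser names are not taken from the source.
	from := modelConfig{Renderer: "base-renderer", Parser: "base-parser"}

	// A model saved without RENDERER/PARSER inherits both.
	fmt.Printf("%+v\n", inheritDefaults(modelConfig{}, from))

	// An explicit PARSER is kept; only the renderer is inherited.
	fmt.Printf("%+v\n", inheritDefaults(modelConfig{Parser: "custom-parser"}, from))
}
```

Treating the "from" config as a default rather than an override keeps any explicit `RENDERER` or `PARSER` in the new Modelfile authoritative.
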

20 Oct, 2025 1 commit
- fs(ggml): fill in arch prefix if necessary (#12646) · d2b63c19
  Michael Yang authored Oct 20, 2025
  
  d2b63c19
17 Sep, 2025 1 commit
- engine: add remote proxy (#12307) · 8b894933
  Patrick Devine authored Sep 17, 2025
  
  8b894933
06 May, 2025 1 commit

Move quantization to new backend (#10363) · 42481045

Daniel Hiltgen authored May 06, 2025

* Move quantization logic to GGML via new backend

This moves the model-aware logic to Go code and calls GGML's quantization code for model creation.

* Remove "add model quantizations"

This is no longer needed now that quantization is implemented in Go+GGML code directly.

42481045

14 Feb, 2025 1 commit

next ollama runner (#7913) · 58245413

Michael Yang authored Feb 14, 2025



feat: add new Ollama engine using ggml through cgo

This change introduces a new way to run pretrained models. It introduces three high-level interfaces and several smaller helper interfaces to facilitate this.

- `model.Model` defines the interface for a model architecture. Models such as `llama` and `mllama`, which are provided as examples, implement the model's forward propagation in the `Forward` method, which is called to generate completions. This interface can be found in `model/model.go`.
- `ml.Backend` defines the interface for a backend tensor library, in this case `ggml`. Among other things, a Backend is responsible for loading a pretrained model into hardware (GPU, CPU, etc.) and providing an interface for Models to access loaded tensors. This interface can be found in `ml/backend.go`.
- `ml.Tensor` defines the interface for a tensor and tensor operations.
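
The relationship between the three interfaces can be sketched with toy types. This is a hedged illustration only: the method sets below (`Shape`, `Add`, `Get`, and the `Forward` signature) are assumptions made for the example and are not the signatures defined in `model/model.go` or `ml/backend.go`.

```go
package main

import "fmt"

// Tensor is a hypothetical, heavily reduced stand-in for ml.Tensor.
type Tensor interface {
	Shape() []int
	Add(other Tensor) Tensor
}

// Backend is a hypothetical stand-in for ml.Backend: it owns the loaded
// weights and hands named tensors to models.
type Backend interface {
	Get(name string) Tensor
}

// Model is a hypothetical stand-in for model.Model.
type Model interface {
	Forward(b Backend, input Tensor) (Tensor, error)
}

// sliceTensor is a toy 1-D CPU tensor backed by a Go slice.
type sliceTensor struct{ data []float32 }

func (t sliceTensor) Shape() []int { return []int{len(t.data)} }

// Add assumes other is a sliceTensor of the same length (no broadcasting).
func (t sliceTensor) Add(other Tensor) Tensor {
	o := other.(sliceTensor)
	out := make([]float32, len(t.data))
	for i := range t.data {
		out[i] = t.data[i] + o.data[i]
	}
	return sliceTensor{out}
}

// mapBackend is a toy backend that keeps "loaded" tensors in a map.
type mapBackend map[string]Tensor

func (b mapBackend) Get(name string) Tensor { return b[name] }

// biasModel is a toy architecture whose forward pass adds one bias tensor.
type biasModel struct{}

func (biasModel) Forward(b Backend, input Tensor) (Tensor, error) {
	bias := b.Get("bias")
	if bias == nil {
		return nil, fmt.Errorf("tensor %q not loaded", "bias")
	}
	return input.Add(bias), nil
}

func main() {
	var m Model = biasModel{}
	backend := mapBackend{"bias": sliceTensor{[]float32{1, 1}}}

	out, err := m.Forward(backend, sliceTensor{[]float32{1, 2}})
	if err != nil {
		panic(err)
	}
	fmt.Println(out.(sliceTensor).data) // prints [2 3]
}
```

The point of the split is that a model only depends on the `Backend` and `Tensor` interfaces, so the same forward pass could in principle run on a different tensor library.
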

This is the first implementation of the new engine. Follow-up PRs will implement more features:

- non-greedy sampling (#8410)
- integration with Ollama and KV caching (#8301)
- more model support (#9080) with more coming soon

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>

58245413

15 Jan, 2025 1 commit
- Fix absolute path names + gguf detection (#8428) · 2539f2db
  Patrick Devine authored Jan 14, 2025
  
  2539f2db
01 Jan, 2025 1 commit
- Update the /api/create endpoint to use JSON (#7935) · 86a622cb
  Patrick Devine authored Dec 31, 2024
```
Changes `POST /api/create` to accept JSON instead of a Modelfile.

This is a breaking change.
```
  86a622cb
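
As a rough illustration of what a JSON body for `POST /api/create` might look like, the sketch below marshals a small request struct. The `createRequest` type and its field names (`model`, `from`, `system`) are assumptions for this example and are not taken from the commit; the model names are placeholders. Check the Ollama API documentation for the authoritative schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// createRequest is a hypothetical, reduced request shape. The JSON field
// names are assumptions made for this sketch.
type createRequest struct {
	Model  string `json:"model"`
	From   string `json:"from,omitempty"`
	System string `json:"system,omitempty"`
}

// buildCreateBody encodes the request as the JSON payload that would be
// sent in place of a Modelfile.
func buildCreateBody(model, from, system string) ([]byte, error) {
	return json.Marshal(createRequest{Model: model, From: from, System: system})
}

func main() {
	body, err := buildCreateBody("my-model", "base-model", "You are a helpful assistant.")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
	// prints {"model":"my-model","from":"base-model","system":"You are a helpful assistant."}
}
```
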
27 Aug, 2024 2 commits
- update templates to use messages · 413ae39f
  Michael Yang authored Aug 27, 2024
  
  413ae39f
- server: clean up route names for consistency (#6524) · 47fa0839
  Jeffrey Morgan authored Aug 26, 2024
  
  47fa0839
02 Aug, 2024 2 commits
- use testing tempdirs · a091fadf
  Michael Yang authored Aug 02, 2024
  
  a091fadf
- lint · b732beba
  Michael Yang authored Aug 01, 2024
  
  b732beba
31 Jul, 2024 2 commits
- comments · df993fa3
  Michael Yang authored Jul 08, 2024
  
  df993fa3
- refactor convert · 5e9db9fb
  Michael Yang authored May 31, 2024
  
  5e9db9fb
22 Jul, 2024 1 commit
- uint64 · 1954ec59
  Michael Yang authored Jul 03, 2024
  
  1954ec59
19 Jul, 2024 1 commit
- server: validate template (#5734) · e8b954c6
  Josh authored Jul 19, 2024
```
add template validation to modelfile
```
  e8b954c6
16 Jul, 2024 1 commit
- add chat and generate tests with mock runner · 4a565cbf
  Michael Yang authored Jul 13, 2024
  
  4a565cbf
12 Jul, 2024 1 commit
- autodetect stop parameters from template · ebc529cb
  Michael Yang authored Jul 05, 2024
  
  ebc529cb
11 Jul, 2024 2 commits
- revert embedded templates to use prompt/response · 57ec6901
  Michael Yang authored Jul 11, 2024
```
This reverts commit 19753c18.

for compat. messages will be added at a later date
```
  57ec6901
- add system prompt to first legacy template · 41be2809
  Michael Yang authored Jul 10, 2024
  
  41be2809
05 Jul, 2024 1 commit
- update named templates · fb6cbc02
  Michael Yang authored Jun 27, 2024
  
  fb6cbc02
13 Jun, 2024 1 commit
- add OLLAMA_MODELS to envconfig (#5029) · 94618b23
  Patrick Devine authored Jun 13, 2024
  
  94618b23
12 Jun, 2024 1 commit

fix: multiple templates when creating from model · c16f8af9

Michael Yang authored Jun 12, 2024

multiple templates may appear in a model if a model is created from
another model that 1) has an autodetected template and 2) defines a
custom template

c16f8af9

07 Jun, 2024 1 commit
- fix create model when template detection errors · 030e765e
  Michael Yang authored Jun 07, 2024
  
  030e765e
04 Jun, 2024 1 commit
- update create handler to use model.Name · d61ef8b9
  Michael Yang authored May 08, 2024
  
  d61ef8b9
14 May, 2024 1 commit
- update tests · c5e892cb
  Michael Yang authored May 13, 2024
  
  c5e892cb