Commits · 9950f6ec247952f0b6a7bec758d4484cb9d3d97b · OpenDAS / ollama

04 Aug, 2025 1 commit
- gpt-oss · 9950f6ec
  Michael Yang authored Jun 03, 2025
  
  9950f6ec
26 Jun, 2025 1 commit

Michael Yang authored Jun 25, 2025

* update patches

* cherry pick metal mean kernel

* cherry pick cuda mean kernel

* gemma3n

73b642e6

16 May, 2025 1 commit

model: handle multiple eos tokens (#10577) · 333e3604

Michael Yang authored May 16, 2025

* get eos_token_id from generation_config.json

* refactor

* include both ids and strings in trace

* comments

* remove special case for gemma3 special vocab (#10743)

333e3604
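
The first bullet of this commit reads `eos_token_id` from `generation_config.json`. In Hugging Face generation configs that field may hold either a single integer or a list of integers, so a reader has to accept both shapes. The following is a minimal sketch of one way to do that in Go. The names `eosIDs`, `generationConfig`, and `parseEOS` are hypothetical and are not taken from the commit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// eosIDs is a hypothetical helper type: it decodes either a single
// integer or a list of integers into a slice.
type eosIDs []int32

// UnmarshalJSON tries the list form first, then falls back to a scalar.
func (e *eosIDs) UnmarshalJSON(b []byte) error {
	var many []int32
	if err := json.Unmarshal(b, &many); err == nil {
		*e = many
		return nil
	}
	var one int32
	if err := json.Unmarshal(b, &one); err != nil {
		return fmt.Errorf("eos_token_id: %w", err)
	}
	*e = []int32{one}
	return nil
}

// generationConfig models only the one field this sketch cares about.
type generationConfig struct {
	EOSTokenID eosIDs `json:"eos_token_id"`
}

// parseEOS returns every end-of-sequence id found in the raw JSON.
func parseEOS(raw []byte) ([]int32, error) {
	var cfg generationConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	return cfg.EOSTokenID, nil
}

func main() {
	for _, raw := range []string{
		`{"eos_token_id": 2}`,
		`{"eos_token_id": [1, 106]}`,
	} {
		ids, err := parseEOS([]byte(raw))
		fmt.Println(ids, err)
	}
}
```

Normalizing to a slice at decode time means downstream code can treat "one EOS token" and "several EOS tokens" the same way.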

14 May, 2025 2 commits
- model: add Qwen2.5-VL support (#10385) · 0aa8b371
  Bruce MacDonald authored May 13, 2025
  
  0aa8b371
- chore: update mllama to use ollama engine (#10637) · 23125648
  Michael Yang authored May 13, 2025
  
  23125648
06 May, 2025 1 commit

Move quantization to new backend (#10363) · 42481045

Daniel Hiltgen authored May 06, 2025

* Move quantization logic to GGML via new backend

This moves the model-aware logic to Go code and calls GGML's quantization code for model creation.

* Remove "add model quantizations"

This is no longer needed now that quantization is implemented in Go+GGML code directly.

42481045

25 Apr, 2025 2 commits
- llama4 · f0c66e6d
  Michael Yang authored Apr 03, 2025
  
  f0c66e6d
- convert: change to colmajor · 4892872c
  Michael Yang authored Apr 25, 2025
  
  4892872c
03 Apr, 2025 1 commit

model: support for mistral-small in the ollama runner · 6bd0a983

Bruce MacDonald authored Mar 14, 2025

Mistral is a popular research lab making open source models. This updates
the forward pass of llama architecture models to support both llama models
and mistral models by accounting for additional metadata present in mistral
models, and finding the correct dimensions for the output projection.

6bd0a983

18 Mar, 2025 1 commit
- convert: return name of unsupported architecture (#9862) · 61a88252
  Bruce MacDonald authored Mar 18, 2025
  When a model's architecture cannot be converted, return the name of the unsupported arch in the error message.
  61a88252
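
The behavior described in this commit, naming the unsupported architecture in the error, can be sketched roughly as follows. This is an illustration under stated assumptions, not the converter's actual code: `errUnsupported`, `supported`, and `checkArchitecture` are hypothetical names, and the allow-list is a made-up subset.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnsupported is a hypothetical sentinel so callers can match the
// failure class with errors.Is.
var errUnsupported = errors.New("unsupported architecture")

// supported is a hypothetical allow-list, not the converter's real table.
var supported = map[string]bool{
	"LlamaForCausalLM": true,
	"GemmaForCausalLM": true,
}

// checkArchitecture reports an error that includes the offending name.
func checkArchitecture(arch string) error {
	if !supported[arch] {
		return fmt.Errorf("%w %q", errUnsupported, arch)
	}
	return nil
}

func main() {
	fmt.Println(checkArchitecture("LlamaForCausalLM"))
	fmt.Println(checkArchitecture("FooForCausalLM"))
}
```

Wrapping the sentinel with `%w` keeps the error matchable while still surfacing the architecture name to the user.
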
11 Mar, 2025 4 commits
- all: address linter errors · 83f0ec82
  jmorganca authored Mar 11, 2025
  
  83f0ec82
- fix conversion · c62861f4
  Patrick Devine authored Mar 07, 2025
  
  c62861f4
- add gemma vision encoder · 4b037a97
  Michael Yang authored Mar 06, 2025
  
  4b037a97
- gemma2 impl · 5f74d1fd
  Patrick Devine authored Feb 07, 2025
  
  5f74d1fd
14 Feb, 2025 1 commit

next ollama runner (#7913) · 58245413

Michael Yang authored Feb 14, 2025



feat: add new Ollama engine using ggml through cgo

This change introduces a new way to run pretrained models. It adds three high-level interfaces and several smaller helper interfaces to support them.

- `model.Model` defines the interface for a model architecture. Models such as `llama` and `mllama`, which are provided as examples, implement the model's forward propagation in the `Forward` method. This method is called to generate completions. The interface can be found in `model/model.go`.
- `ml.Backend` defines the interface for a backend tensor library, in this case `ggml`. Among other things, a Backend is responsible for loading a pretrained model into hardware (GPU, CPU, etc.) and providing an interface for Models to access loaded tensors. The interface can be found in `ml/backend.go`.
- `ml.Tensor` defines the interface for a tensor and tensor operations.
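
To make the division of labor between the three interfaces concrete, here is a toy sketch. It assumes heavily simplified, hypothetical signatures: the real definitions in `model/model.go` and `ml/backend.go` are not reproduced in this message and are certainly richer. `sliceTensor`, `mapBackend`, `biasModel`, and `runToy` are invented for illustration.

```go
package main

import "fmt"

// Tensor is a hypothetical, reduced stand-in for ml.Tensor.
type Tensor interface {
	Shape() []int
	Add(other Tensor) Tensor
}

// Backend is a hypothetical stand-in for ml.Backend: it owns the loaded
// weights and hands them to models by name.
type Backend interface {
	Get(name string) Tensor
}

// Model is a hypothetical stand-in for model.Model.
type Model interface {
	Forward(b Backend, input Tensor) (Tensor, error)
}

// sliceTensor is a toy CPU tensor backed by a flat slice.
type sliceTensor struct{ data []float32 }

func (t sliceTensor) Shape() []int { return []int{len(t.data)} }

// Add assumes other is a sliceTensor of the same length.
func (t sliceTensor) Add(other Tensor) Tensor {
	o := other.(sliceTensor)
	out := make([]float32, len(t.data))
	for i := range t.data {
		out[i] = t.data[i] + o.data[i]
	}
	return sliceTensor{data: out}
}

// mapBackend "loads" weights by keeping them in a map.
type mapBackend map[string]Tensor

func (b mapBackend) Get(name string) Tensor { return b[name] }

// biasModel's forward pass just adds a bias tensor to the input.
type biasModel struct{}

func (biasModel) Forward(b Backend, input Tensor) (Tensor, error) {
	bias := b.Get("bias")
	if bias == nil {
		return nil, fmt.Errorf("tensor %q not loaded", "bias")
	}
	return input.Add(bias), nil
}

// runToy wires the three pieces together and returns the raw output.
func runToy() ([]float32, error) {
	backend := mapBackend{"bias": sliceTensor{data: []float32{1, 1, 1}}}
	var m Model = biasModel{}
	out, err := m.Forward(backend, sliceTensor{data: []float32{1, 2, 3}})
	if err != nil {
		return nil, err
	}
	return out.(sliceTensor).data, nil
}

func main() {
	out, err := runToy()
	fmt.Println(out, err)
}
```

The point of the split is that the model code only talks to `Backend` and `Tensor`, so the same `Forward` logic could in principle run on any backend that satisfies those interfaces.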

This is the first implementation of the new engine. Follow up PRs will implement more features:

- non-greedy sampling (#8410)
- integration with Ollama and KV caching (#8301)
- more model support (#9080) with more coming soon

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>

58245413

16 Jan, 2025 1 commit
- convert: import support for command-r models from safetensors (#6063) · 93a8daf2
  Josh authored Jan 15, 2025
  Co-authored-by: Patrick Devine <patrick@infrahq.com>
  93a8daf2
14 Jan, 2025 1 commit

convert: qwen2 from safetensors (#8408) · f6f37130

Bruce MacDonald authored Jan 14, 2025

Add native support for converting Qwen2 family models (including Qwen2.5)
from safetensors to GGUF format so that they can be run.

f6f37130

10 Sep, 2024 1 commit
- catch when model vocab size is set correctly (#6714) · 84b84ce2
  Patrick Devine authored Sep 09, 2024
  
  84b84ce2
23 Aug, 2024 1 commit
- convert safetensor adapters into GGUF (#6327) · 0c819e16
  Patrick Devine authored Aug 23, 2024
  
  0c819e16
21 Aug, 2024 2 commits
- convert gemma2 · 3546bbd0
  Michael Yang authored Jun 28, 2024
  
  3546bbd0
- bert · 5a28b9cf
  Michael Yang authored Jun 06, 2024
  
  5a28b9cf
12 Aug, 2024 1 commit
- add conversion for microsoft phi 3 mini/medium 4k, 128 · 6ffb5cb0
  Michael Yang authored Jun 03, 2024
  
  6ffb5cb0
31 Jul, 2024 3 commits
- convert: only extract large files · eafc607a
  Michael Yang authored Jun 29, 2024
  
  eafc607a
- comments · df993fa3
  Michael Yang authored Jul 08, 2024
  
  df993fa3
- refactor convert · 5e9db9fb
  Michael Yang authored May 31, 2024
  
  5e9db9fb
04 Jun, 2024 1 commit
- lint · e40145a3
  Michael Yang authored May 21, 2024
  
  e40145a3
20 May, 2024 5 commits
- cleanup · bbbd9f20
  Michael Yang authored May 15, 2024
  
  bbbd9f20
- bpe pretokenizer · 547132e8
  Michael Yang authored May 15, 2024
  
  547132e8
- add fixes for llama · d355d202
  Patrick Devine authored May 08, 2024
  
  d355d202
- llama3 conversion · c8cf0d94
  Patrick Devine authored Apr 28, 2024
  
  c8cf0d94
- some changes for llama3 · d88582df
  Patrick Devine authored Apr 18, 2024
  
  d88582df
06 May, 2024 1 commit

quantize any fp16/fp32 model · 9685c345

Michael Yang authored Apr 12, 2024

- FROM /path/to/{safetensors,pytorch}
- FROM /path/to/fp{16,32}.bin
- FROM model:fp{16,32}

9685c345

24 Apr, 2024 1 commit
- add mixtral 8x7b model conversion (#3859) · ce8ce825
  Patrick Devine authored Apr 23, 2024
  
  ce8ce825
15 Apr, 2024 1 commit
- Add llama2 / torch models for `ollama create` (#3607) · 9f8691c6
  Patrick Devine authored Apr 15, 2024
  
  9f8691c6
06 Apr, 2024 1 commit
- no rope parameters · be517e49
  Michael Yang authored Apr 05, 2024
  
  be517e49
01 Apr, 2024 1 commit
- Simplify model conversion (#3422) · 3b6a9154
  Patrick Devine authored Apr 01, 2024
  
  3b6a9154
29 Mar, 2024 1 commit
- Add gemma safetensors conversion (#3250) · 5a5efee4
  Patrick Devine authored Mar 28, 2024
  Co-authored-by: Michael Yang <mxyng@pm.me>
  5a5efee4
26 Mar, 2024 1 commit
- change `github.com/jmorganca/ollama` to `github.com/ollama/ollama` (#3347) · 1b272d5b
  Patrick Devine authored Mar 26, 2024
  
  1b272d5b
11 Mar, 2024 1 commit
- convert: fix shape · 9ea492f1
  Michael Yang authored Mar 10, 2024
  
  9ea492f1
08 Mar, 2024 1 commit
- decode ggla · 76bdebba
  Michael Yang authored Mar 08, 2024
  
  76bdebba