Commits · ffbe8e076df9e2e67aab016ea3ec64822369b725 · OpenDAS / ollama

"tests/test_structures/test_bbox/test_box3d.py" did not exist on "82cd4892fb8ed7a45e77e59241beeb8b531aada4"

15 Dec, 2025 1 commit
- model: add olmo3 and olmo3.1 (#13415) · ffbe8e07
  Parth Sareen authored Dec 15, 2025
  
  ffbe8e07
13 Dec, 2025 2 commits
- model: default gemma 3 rope scale to 1.0, apply corrections based on layer counts (#13453) · 4ff8a691
  Jeffrey Morgan authored Dec 12, 2025
  
  4ff8a691
- model: fix global layer rope scale values for gemma 3 (#13452) · 1b308e1d
  Jeffrey Morgan authored Dec 12, 2025
  
  1b308e1d
12 Dec, 2025 2 commits
- model: force rope factor 1.0 for Gemma 3 (#13445) · 3af5d3b7
  Jeffrey Morgan authored Dec 12, 2025
  
  3af5d3b7
- model: fix rotary embeddings for ministral 3 (#13432) · 2dfb7441
  Jeffrey Morgan authored Dec 11, 2025
  
  2dfb7441
11 Dec, 2025 1 commit
- model: conversion and hyperparameter fixes for ministral and devstral (#13424) · a838421e
  Jeffrey Morgan authored Dec 11, 2025
  
  a838421e
09 Dec, 2025 2 commits
- nomic-embed-text:v2: model implementation (#13162) · 76f88caf
  nicole pardal authored Dec 09, 2025
  
  76f88caf
- model: add rnj-1 inference support (#13354) · d2f334c1
  Jeffrey Morgan authored Dec 08, 2025
  
  d2f334c1
08 Dec, 2025 1 commit

Michael Yang authored Nov 18, 2025

change to a flatter directory structure and group the options with the
function

update models to call rope in one place

603ceefa

02 Dec, 2025 1 commit

model: ministral w/ llama4 scaling (#13292) · d3e0a0de

Patrick Devine authored Dec 01, 2025



This change:

* fixes rope scaling in the mistral converter
* updates ministral to include llama4 scaling
* includes a new ministral parser for parsing reasoning and tool calling

---------
Co-authored-by: jmorganca <jmorganca@gmail.com>

d3e0a0de

20 Nov, 2025 1 commit

deepseek2: upgrade to run v3+ models (#13166) · 5c1063df

Michael Yang authored Nov 19, 2025

the check for mla omits v3 and r1 which should not return unsupported.
instead check the tokenizer for compatibility

5c1063df

19 Nov, 2025 3 commits
- models: enable deepseek2 (deepseek v3.1 w/ MLA) on the new engine (#13151) · 604e43b2
  Patrick Devine authored Nov 18, 2025
  
  604e43b2
- nomic-embed-text model implementation (#13071) · 8de30b56
  nicole pardal authored Nov 18, 2025
  
  8de30b56
- deepseekocr · 92981ae3
  Michael Yang authored Oct 31, 2025
  
  92981ae3
18 Nov, 2025 1 commit
- Add deepseek v3.1 (#13063) · 584e2d64
  Grace authored Nov 17, 2025
```
* Add mla for flash attention
* Revert to using chunks
```
  584e2d64
13 Nov, 2025 1 commit

chore: update models to use slice/chunk/chunksections (#12934) · 333203d8

Michael Yang authored Nov 13, 2025

* use slice/chunks

* bert

* llama4

* gemma3n

* gptoss

* mistral3

* qwen3vl

* qwen25vl

* deepseek2

* remove unused ops

333203d8

06 Nov, 2025 1 commit
- ggml update to b6840 (#12791) · 544b6739
  Daniel Hiltgen authored Nov 06, 2025
  
  544b6739
03 Nov, 2025 1 commit
- chore(gptoss): cleanup dead code (#12932) · ce3eb0a3
  Michael Yang authored Nov 03, 2025
  
  ce3eb0a3
30 Oct, 2025 2 commits
- interleaved mrope (#12807) · f67a6df1
  Michael Yang authored Oct 30, 2025
```
* ml(ggml): mrope
* interleave mrope
```
  f67a6df1
- fix: qwen2.5vl, qwen3vl composite image (#12841) · d432ade7
  Michael Yang authored Oct 30, 2025
```
this change fixes images with an alpha channel by overlaying the image
onto a white background
```
  d432ade7
29 Oct, 2025 1 commit
- feat(model): add qwen3vl (#12665) · 7d25b9e1
  Michael Yang authored Oct 28, 2025
  
  7d25b9e1
28 Oct, 2025 2 commits
- s/From*Slice/From*s/ (#12255) · 1188f408
  Michael Yang authored Oct 28, 2025
  
  1188f408
- gemma3: make embedding non-causal (#12297) · ec9eb28f
  Michael Yang authored Oct 27, 2025
  
  ec9eb28f
18 Oct, 2025 1 commit
- contiguous input per layer (#12686) · bc1a818f
  Daniel Hiltgen authored Oct 17, 2025
```
Co-authored-by: Michael Yang <git@mxy.ng>
```
  bc1a818f
13 Oct, 2025 1 commit
- fix(qwen3): deepseek distill · 6c833d5f
  Michael Yang authored Oct 13, 2025
```
deepseek's qwen3 distill uses a different rope scheme so support both
```
  6c833d5f
09 Oct, 2025 2 commits
- refactor: use builtin max and min · 47298fce
  shengxinjing authored Sep 28, 2025
  
  47298fce
- refactor: use builtin max and min · 4a48937e
  shengxinjing authored Sep 25, 2025
  
  4a48937e
03 Oct, 2025 1 commit
- Fixed Deepseek2 adding nil tensor error · 33801c15
  Grace authored Oct 03, 2025
  
  33801c15
24 Sep, 2025 1 commit

Grace/deepseek v3 migration (#12385) · fbd82ba5

Grace authored Sep 24, 2025



* init deepseek model file

* temp removal of flash attention implementation

* shapes and proper, can make a pass

* query, key, value have good cosine similarity, but the max diff is a bit high

* Attention block is working! ** with eager for now, have not added the mask line

* Attention block is working! ** with eager for now, have not added the mask line

* working MoE at around 0.95 cosine sim

* added cosine similarity function

* Starting end to end structure

* Trying (and failing) to get rope to work, going to test full thing on tater

* running on tater36... just not the right outputs

* we have the right values for rope... but its still not working?

* chnage Extrapolation Factor to 1

* removed adding residuals twice, removed normalization from shared expert, refactored Norms (Attention, MLP) to be outside the (Attention, MLP) blocks and in the Transformer block instead, add cache setLayer

* Temporary modelfiles for cpu

* change kpass intermediate step to kv, two layer outputs [0,1] look fine

* this calls for 16 chicken nuggets

* whoops

* cleaning up code

* delete stuff we dont need

* getting rid of debug statements for llama cpp

* working with long contexts

* fix long context view error

* reverting some changes I made for files that are not apart of pr

* Added proper tokenizer for deeepseek3

* clean up model and go test

* remove Modelfile

* not passing the tests

* whoops

* how to pass the ci tests

* resolving some of the comments

* rename

* linted and renamed deepseek3 -> deepseek2

* remove name go

* addressed changes - main change was adopting qwen3 naming scheme

* I cannot with linters

* clean up logs

* clean up logs

---------
Co-authored-by: Grace Guo <graceguo@Graces-MBP.localdomain>
Co-authored-by: Grace Guo <graceguo@Graces-MacBook-Pro.local>
Co-authored-by: graceguo <graceguo@tater36.localdomain>

fbd82ba5

23 Sep, 2025 2 commits
- add pre:, suf: to tags (#12274) · bf78ed6e
  Michael Yang authored Sep 23, 2025
  
  bf78ed6e
- multi-regexp pretokenizer (#12325) · a40d427b
  Michael Yang authored Sep 23, 2025
  
  a40d427b
19 Sep, 2025 1 commit
- gemma: fix rope scaling for qat models (#12348) · dba39b2e
  Patrick Devine authored Sep 19, 2025
```
* gemma: fix rope scaling for qat models

* gofumpt yourself
```
  dba39b2e
18 Sep, 2025 1 commit
- feat: qwen3 embed (#12301) · 7460259e
  Michael Yang authored Sep 18, 2025
```
* cleanup

* use pooling.TypeNone

* pooling test

* qwen3 embed
```
  7460259e
17 Sep, 2025 1 commit
- fix(llama): other llama flavours (#12308) · 564b558c
  Michael Yang authored Sep 17, 2025
```
* fix(llama): rope scale

* spm llama

* skip moe models

* cleanup
```
  564b558c
16 Sep, 2025 2 commits
- use split activations when possible (#12293) · ad95d5b3
  Michael Yang authored Sep 16, 2025
```
* use ggml_*_split activations when possible

* forward qkv
```
  ad95d5b3
- embed: cleanup (#12299) · c253433d
  Michael Yang authored Sep 16, 2025
```
* cleanup

* use pooling.TypeNone

* pooling test
```
  c253433d
15 Sep, 2025 2 commits

model: implement bert in ollama engine (#9080) · 3f6642f6

Michael Yang authored Sep 15, 2025

* fix truncate

* s/SentencePieceModel/SentencePiece/

* bert

* wordpiece

* refactor pooling

* more tokenizers

* normalize embeddings

3f6642f6

batch: use tensors for outputs (#12185) · 6f711714
Michael Yang authored Sep 15, 2025
```
this cleans up the model interface slightly without too much impact in
other areas
```
6f711714

04 Sep, 2025 1 commit
- embedding gemma model (#12181) · 5994e8e8
  Michael Yang authored Sep 04, 2025
```
* ollama: add embeddings
```
  5994e8e8
29 Aug, 2025 1 commit

perf: build graph for next batch async to keep GPU busy (#11863) · 517807cd

Daniel Hiltgen authored Aug 29, 2025

* perf: build graph for next batch in parallel to keep GPU busy

This refactors the main run loop of the ollama runner to perform the main GPU
intensive tasks (Compute+Floats) in a go routine so we can prepare the next
batch in parallel to reduce the amount of time the GPU stalls waiting for the
next batch of work.

* tests: tune integration tests for ollama engine

This tunes the integration tests to focus more on models supported
by the new engine.

517807cd