Commits · fa7776fd2458fc3a8aeb7f12e4bc65b439955319 · OpenDAS / ollama

05 Aug, 2025 1 commit

Michael Yang authored Aug 05, 2025



* bf16

* tests

* gpt-oss

* enable gptoss for engine

* rough estimate

* convert to mxfp4

* handle safetensors U8

* clamp glu/linear

* update tokenizer

* MXFP4 support

This implements the Open Compute Microscaling (MX) FP4 format
as a tensor type with backend implementations focusing
on mulmat and mulmatid on CPU, CUDA, and Metal.

* Unit tests for MXFP4 support

This exercises various operations and shapes on both CPU and GPU (if one is
detected on the system).

* cuda graph

* unit test adjustments

* cuda: optimize memory access

Read 4 bytes at a time (8 elements) when performing mul_mat_vec_mxfp4

* mac: fix crash on old macos versions

cblas_sgemm is only supported on v13.3 and up; however, bf16 is
only supported on v14+, so we were falling back to ggml-blas and
crashing on bf16 tensors. Checking for the function being null
seems to be the simplest way to conditionally avoid registering the
backend.

* server: Minimum context length for gptoss

This model requires a minimum context length of 8192 to function
effectively. Users can set higher values through all normal mechanisms
but lower values will be silently reset.

* ggml: Multiply by numParallel for gptoss sliding window

When computing the graph size estimate, the context size is already
multiplied by numParallel so estimates reflect that. However, since
sliding window models use a smaller, fixed context size, they need
to manually take numParallel into account.

* gpt-oss integration

Includes the harmony parser, thinking levels, etc.

* fix sync

* fix tests

* fix lint

---------
Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
Co-authored-by: Jesse Gross <jesse@ollama.com>
Co-authored-by: Devon Rifkin <drifkin@drifkin.net>

fa7776fd
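
For readers unfamiliar with the format named in the "MXFP4 support" item above, here is a minimal Go sketch of decoding one MX block. In the OCP Microscaling format, 32 FP4 (E2M1) elements share one 8-bit power-of-two scale (E8M0, bias 127). The function names and the nibble packing order are assumptions made for illustration; this is not the ggml backend code from the commit.

```go
package main

import (
	"fmt"
	"math"
)

// e2m1 lists the eight non-negative magnitudes an FP4 (E2M1) element can take.
var e2m1 = [8]float32{0, 0.5, 1, 1.5, 2, 3, 4, 6}

// decodeFP4 converts a 4-bit code (1 sign bit + 3-bit magnitude index) to float32.
func decodeFP4(code byte) float32 {
	v := e2m1[code&0x7]
	if code&0x8 != 0 {
		return -v
	}
	return v
}

// decodeBlock expands one MX block: a shared E8M0 scale, 2^(scale-127),
// applied to 32 FP4 elements packed two per byte.
// ASSUMPTION: the low nibble holds the even-indexed element. Real backends
// may pack the nibbles in a different order.
func decodeBlock(scale byte, packed [16]byte) [32]float32 {
	s := float32(math.Ldexp(1, int(scale)-127))
	if scale == 0xFF {
		// The E8M0 encoding reserves 0xFF for NaN.
		s = float32(math.NaN())
	}
	var out [32]float32
	for i, b := range packed {
		out[2*i] = decodeFP4(b&0x0F) * s
		out[2*i+1] = decodeFP4(b>>4) * s
	}
	return out
}

func main() {
	// Low nibble 0xA encodes -1.0, high nibble 0x7 encodes 6.0; scale 128 means 2^1.
	out := decodeBlock(128, [16]byte{0x7A})
	fmt.Println(out[0], out[1])
}
```

With two elements per byte, a 4-byte read covers 8 elements, which matches the access width mentioned in the "cuda: optimize memory access" item.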

17 Jul, 2025 1 commit
- openai: allow openai endpoint to accept webp images (#11412) · 5e67f4f9
  frob authored Jul 17, 2025
```
Co-authored-by: Richard Lyons <frob@cloudstaff.com>
```
  5e67f4f9
02 Apr, 2025 1 commit

chore(all): replace instances of interface with any (#10067) · 9876c9fa

Bruce MacDonald authored Apr 02, 2025

Both interface{} and any (which is just an alias for interface{} introduced in Go 1.18) represent the empty interface that all types satisfy.
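
A small Go sketch of why the replacement does not change behavior: the two spellings name the same type, so values and slices move between them without conversion. The function names here are made up for illustration.

```go
package main

import "fmt"

// describe takes the empty interface spelled the pre-1.18 way.
func describe(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

// describeAny has the same parameter type, spelled with the alias.
func describeAny(v any) string {
	return describe(v)
}

func main() {
	var a any = 42
	var b interface{} = a // plain assignment: the types are identical
	fmt.Println(describe(a), describeAny(b))

	// Because any is an alias, []any and []interface{} are the same type too.
	xs := []any{1, "two"}
	var ys []interface{} = xs
	fmt.Println(len(ys))
}
```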

9876c9fa

13 Feb, 2025 1 commit
- openai: finish_reason as tool_calls for streaming with tools (#7963) · 10d59d5f
  Anuraag (Rag) Agrawal authored Feb 14, 2025
  
  10d59d5f
13 Dec, 2024 1 commit

openai: return usage as final chunk for streams (#6784) · e28f2d49

Anuraag (Rag) Agrawal authored Dec 13, 2024



* openai: return usage as final chunk for streams

---------
Co-authored-by: ParthSareen <parth.sareen@ollama.com>

e28f2d49

11 Dec, 2024 1 commit

llama: preserve field order in user-defined JSON schemas (#8002) · 9039c821

Blake Mizerany authored Dec 11, 2024

Previously we decoded and re-encoded JSON schemas during validation,
which served no purpose since json.RawMessage already validates JSON
syntax. Worse, the re-encoding lost field ordering from the original
schema, which affects inference quality during step-by-step reasoning.

While fixing this ordering issue by using json.RawMessage directly,
testing revealed that schema_to_grammar (from llama.cpp) also fails to
preserve field order during grammar generation. This appears to be the
root cause of inference degradation.

This change prevents us from mangling the user's original schema order,
but we still need to address the ordering issue in schema_to_grammar.
That will be a separate change.
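
As an illustration of how a decode and re-encode step can lose field order, here is a minimal Go sketch. It assumes the round trip goes through a map, which is one common way this happens (encoding/json writes map keys in sorted order); the actual code removed by this commit may have differed. The function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip decodes a schema into a map and re-encodes it.
// encoding/json emits map keys in sorted order, so the original order is lost.
func roundTrip(schema []byte) (string, error) {
	var m map[string]any
	if err := json.Unmarshal(schema, &m); err != nil {
		return "", err
	}
	out, err := json.Marshal(m)
	return string(out), err
}

// passThrough checks syntax while keeping the bytes untouched via json.RawMessage.
func passThrough(schema []byte) (string, error) {
	var raw json.RawMessage
	if err := json.Unmarshal(schema, &raw); err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	schema := []byte(`{"zebra":1,"apple":2}`)

	reordered, err := roundTrip(schema)
	if err != nil {
		panic(err)
	}
	preserved, err := passThrough(schema)
	if err != nil {
		panic(err)
	}
	fmt.Println(reordered) // keys come back sorted
	fmt.Println(preserved) // keys stay in the user's order
}
```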

Updates #7978

9039c821

05 Dec, 2024 1 commit

api: structured outputs - chat endpoint (#7900) · 630e7dc6

Parth Sareen authored Dec 04, 2024



Adds structured outputs to chat endpoint
---------
Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Hieu Nguyen <hieunguyen1053@outlook.com>

630e7dc6

30 Nov, 2024 1 commit
- Enable index tracking for tools - openai api support (#7888) · 5f805118
  Parth Sareen authored Nov 29, 2024
  
  5f805118
27 Nov, 2024 2 commits
- api: enable tool streaming (#7836) · ce7455a8
  Parth Sareen authored Nov 27, 2024
  
  ce7455a8
- openai: remove unused error code (#7850) · 940e6277
  Bruce MacDonald authored Nov 26, 2024
```
writeError takes a code argument that is no longer used. Remove it for clarity.
```
  940e6277
07 Sep, 2024 2 commits
- openai: align chat temperature and frequency_penalty options with completion (#6688) · 06d4fba8
  frob authored Sep 07, 2024
  
  06d4fba8
- openai: don't scale temperature or frequency_penalty (#6514) · da915345
  Yaroslav authored Sep 07, 2024
  
  da915345
06 Sep, 2024 1 commit
- openai: fix "presence_penalty" typo and add test (#6665) · fe91d7ff
  frob authored Sep 06, 2024
  
  fe91d7ff
02 Aug, 2024 1 commit
- lint · b732beba
  Michael Yang authored Aug 01, 2024
  
  b732beba
01 Aug, 2024 1 commit

OpenAI: Add Usage to `v1/embeddings` (#5886) · 6f133a0b

royjhan authored Aug 01, 2024

* add prompt tokens to embed response

* rm slog

* metrics

* types

* prompt n

* clean up

* reset submodule

* add tokens to v1/embeddings

* separate usage

6f133a0b

29 Jul, 2024 1 commit
- return tool calls finish reason for openai (#5995) · 365431d4
  royjhan authored Jul 29, 2024
```
* hot fix

* backend stream support

* clean up

* finish reason

* move to openai
```
  365431d4
19 Jul, 2024 2 commits
- OpenAI: Function Based Testing (#5752) · c57317cb
  royjhan authored Jul 19, 2024
```
* distinguish error forwarding

* more coverage

* rm comment
```
  c57317cb
- adjust openai chat msg processing (#5729) · 51b2fd29
  royjhan authored Jul 19, 2024
  
  51b2fd29
17 Jul, 2024 2 commits

OpenAI: Support Tools (#5614) · 154f6f45

royjhan authored Jul 16, 2024



* reopen pr

* tools

* remove tc from stream for now

* ID and Function

* openai expects arguments to be a string (#5739)

* mutually exclusive content and tool calls

* clean up

---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

154f6f45
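
The "openai expects arguments to be a string" item above refers to the OpenAI chat format, where a tool call's `function.arguments` field is a JSON-encoded string and not a nested object. A minimal Go sketch of that shape follows; the type and function names are hypothetical and are not taken from the ollama source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// function and toolCall are hypothetical types mirroring the OpenAI wire shape.
type function struct {
	Name      string `json:"name"`
	Arguments string `json:"arguments"` // JSON text carried as a string
}

type toolCall struct {
	ID       string   `json:"id"`
	Type     string   `json:"type"`
	Function function `json:"function"`
}

// newToolCall serializes structured arguments into the string form.
func newToolCall(id, name string, args map[string]any) (toolCall, error) {
	b, err := json.Marshal(args)
	if err != nil {
		return toolCall{}, err
	}
	return toolCall{
		ID:       id,
		Type:     "function",
		Function: function{Name: name, Arguments: string(b)},
	}, nil
}

func main() {
	tc, err := newToolCall("call_0", "get_weather", map[string]any{"city": "Paris"})
	if err != nil {
		panic(err)
	}
	out, err := json.Marshal(tc)
	if err != nil {
		panic(err)
	}
	// The arguments appear as an escaped string inside the outer JSON.
	fmt.Println(string(out))
}
```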

OpenAI: Add Suffix to `v1/completions` (#5611) · 0d41623b

royjhan authored Jul 16, 2024

* add suffix

* remove todo

* remove TODO

* add to test

* rm outdated prompt tokens info md

* fix test

* fix test

0d41623b

16 Jul, 2024 1 commit

OpenAI: /v1/embeddings compatibility (#5285) · 987dbab0

royjhan authored Jul 16, 2024



* OpenAI v1 models

* Empty List Testing

* Add back envconfig

* v1/models docs

* Remove Docs

* OpenAI batch embed compatibility

* merge conflicts

* integrate with api/embed

* ep

* merge conflicts

* request tests

* rm resp test

* merge conflict

* merge conflict

* test fixes

* test fn renaming

* input validation for empty string

---------
Co-authored-by: jmorganca <jmorganca@gmail.com>

987dbab0

14 Jul, 2024 1 commit

Support image input for OpenAI chat compatibility (#5208) · e9f7f360

royjhan authored Jul 13, 2024



* OpenAI v1 models

* Refactor Writers

* Add Test

Co-Authored-By: Attila Kerekes

* Credit Co-Author
Co-Authored-By: Attila Kerekes <439392+keriati@users.noreply.github.com>

* Empty List Testing

* Use Namespace for Ownedby

* Update Test

* Add back envconfig

* v1/models docs

* Use ModelName Parser

* Test Names

* Remove Docs

* Clean Up

* Test name
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Add Middleware for Chat and List

* Testing Cleanup

* Test with Fatal

* Add functionality to chat test

* Support image input for OpenAI chat

* Decoding

* Fix message processing logic

* openai vision test

* type errors

* clean up

* redundant check

* merge conflicts

* merge conflicts

* merge conflicts

* flattening and smaller image

* add test

* support python and js SDKs and mandate prefixing

* clean up

---------
Co-authored-by: Attila Kerekes <439392+keriati@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

e9f7f360

09 Jul, 2024 1 commit
- OpenAI v1/completions: allow stop token list (#5551) · 4918fae5
  royjhan authored Jul 09, 2024
```
* stop token parsing fix

* add stop test
```
  4918fae5
02 Jul, 2024 2 commits

OpenAI: v1/completions compatibility (#5209) · d626b99b

royjhan authored Jul 02, 2024



* OpenAI v1 models

* Refactor Writers

* Add Test

Co-Authored-By: Attila Kerekes

* Credit Co-Author
Co-Authored-By: Attila Kerekes <439392+keriati@users.noreply.github.com>

* Empty List Testing

* Use Namespace for Ownedby

* Update Test

* Add back envconfig

* v1/models docs

* Use ModelName Parser

* Test Names

* Remove Docs

* Clean Up

* Test name
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Add Middleware for Chat and List

* Completions Endpoint

* Testing Cleanup

* Test with Fatal

* Add functionality to chat test

* Rename function

* float types

* type cleanup

* cleaning

* more cleaning

* Extra test cases

* merge conflicts

* merge conflicts

* merge conflicts

* merge conflicts

* cleaning

* cleaning

---------
Co-authored-by: Attila Kerekes <439392+keriati@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

d626b99b

OpenAI: /v1/models and /v1/models/{model} compatibility (#5007) · 996bb1b8

royjhan authored Jul 02, 2024



* OpenAI v1 models

* Refactor Writers

* Add Test

Co-Authored-By: Attila Kerekes

* Credit Co-Author
Co-Authored-By: Attila Kerekes <439392+keriati@users.noreply.github.com>

* Empty List Testing

* Use Namespace for Ownedby

* Update Test

* Add back envconfig

* v1/models docs

* Use ModelName Parser

* Test Names

* Remove Docs

* Clean Up

* Test name
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Add Middleware for Chat and List

* Testing Cleanup

* Test with Fatal

* Add functionality to chat test

* OpenAI: /v1/models/{model} compatibility (#5028)

* Retrieve Model

* OpenAI Delete Model

* Retrieve Middleware

* Remove Delete from Branch

* Update Test

* Middleware Test File

* Function name

* Cleanup

* Test Update

* Test Update

---------
Co-authored-by: Attila Kerekes <439392+keriati@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

996bb1b8

14 Jun, 2024 1 commit
- openai: do not set temperature to 0 when setting seed (#5045) · 6b800aa7
  Jeffrey Morgan authored Jun 14, 2024
  
  6b800aa7
04 Jun, 2024 1 commit
- lint · e40145a3
  Michael Yang authored May 21, 2024
  
  e40145a3
11 May, 2024 1 commit
- Fix OpenAI `finish_reason` values when empty (#4368) · 41ba3017
  Jeffrey Morgan authored May 11, 2024
  
  41ba3017
09 May, 2024 1 commit
- add done_reason to the api (#4235) · cfa84b84
  Bruce MacDonald authored May 09, 2024
  
  cfa84b84
26 Mar, 2024 1 commit
- change `github.com/jmorganca/ollama` to `github.com/ollama/ollama` (#3347) · 1b272d5b
  Patrick Devine authored Mar 26, 2024
  
  1b272d5b
07 Feb, 2024 1 commit
- Initial OpenAI `/v1/chat/completions` API compatibility (#2376) · 453f572f
  Jeffrey Morgan authored Feb 07, 2024
  
  453f572f