Commits · 20c5fd39c8b275c0c7d7e7be8ce03d48aa32c64e · OpenDAS / ollama

08 May, 2025 1 commit
- lint: enable usetesting, disable tenv (#10594) · 6e9a7a25
  Michael Yang authored May 08, 2025
  
  6e9a7a25
07 May, 2025 1 commit
- api: remove unused RetrieveModelResponse type (#10603) · 392de840
  Jeffrey Morgan authored May 06, 2025
  
  392de840
06 May, 2025 1 commit
- server: send 405 instead of 404 for unallowed methods (#10275) · 4090aca9
  Devon Rifkin authored May 06, 2025
```
Fixes: #5483
```
  4090aca9
30 Apr, 2025 1 commit

strip out thinking tags in message history for qwen3 & r1 (#10490) · ad3c7c9b

Devon Rifkin authored Apr 30, 2025

* strip out thinking tags in message history for qwen3 & r1

This is in advance of "proper" support where we'll make reasoning
configurable and we'll parse out thinking/reasoning tags and provide
them to the caller. These models expect there to be no thinking tags in
the message history, so this should improve quality.

* parse model names instead of hacky prefix check

ad3c7c9b
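
As a rough illustration of the idea, the sketch below removes thinking blocks from earlier assistant turns before the history is sent back to the model. The `<think>...</think>` tag name, the `Message` type, and the choice to touch only assistant messages are assumptions made for the example, not details taken from the commit.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Message is a hypothetical chat message type used only in this sketch.
type Message struct {
	Role    string
	Content string
}

// thinkRe matches one thinking block, including newlines inside it.
// The <think> tag name is an assumption for this example.
var thinkRe = regexp.MustCompile(`(?s)<think>.*?</think>`)

// stripThinking returns a copy of the history in which thinking blocks
// have been removed from assistant messages. Other roles are left as-is.
func stripThinking(history []Message) []Message {
	out := make([]Message, len(history))
	for i, m := range history {
		if m.Role == "assistant" {
			m.Content = strings.TrimSpace(thinkRe.ReplaceAllString(m.Content, ""))
		}
		out[i] = m
	}
	return out
}

func main() {
	history := []Message{
		{Role: "user", Content: "Say hello."},
		{Role: "assistant", Content: "<think>plan the answer</think>\nHello!"},
	}
	for _, m := range stripThinking(history) {
		fmt.Printf("%s: %q\n", m.Role, m.Content)
	}
}
```
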

03 Mar, 2025 1 commit

server/internal/client/ollama: hold DiskCache on Registry (#9463) · 3519dd1c

Blake Mizerany authored Mar 02, 2025

Previously, using a Registry required a DiskCache to be passed in for
use in various methods. This was a bit cumbersome, as the DiskCache is
required for most operations, and the DefaultCache is used in most of
those cases. This change makes the DiskCache an optional field on the
Registry struct.

This also changes DefaultCache to initialize on first use. This avoids
burdening clients with the cost of creating a new cache per use, or
with having to hold onto a cache for the lifetime of the Registry.

Also, slip in some minor docs updates for Trace.

3519dd1c
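
The pattern described here, an optional cache field that falls back to a lazily created default, might look roughly like the sketch below. The type and function names echo the commit message, but the fields, signatures, the `cache` helper, and the `defaultInits` counter are simplified assumptions rather than the real `server/internal/client/ollama` API.

```go
package main

import (
	"fmt"
	"sync"
)

// DiskCache is a stand-in for the real cache type; only a directory is kept.
type DiskCache struct {
	Dir string
}

var (
	defaultOnce  sync.Once
	defaultCache *DiskCache
	defaultInits int // counts initializations, for demonstration only
)

// DefaultCache creates the shared cache on first use and reuses it afterwards.
// The directory is a made-up placeholder.
func DefaultCache() *DiskCache {
	defaultOnce.Do(func() {
		defaultInits++
		defaultCache = &DiskCache{Dir: "/tmp/example-cache"}
	})
	return defaultCache
}

// Registry holds an optional cache. A nil Cache means "use DefaultCache".
type Registry struct {
	Cache *DiskCache
}

// cache resolves which cache a Registry method should use.
func (r *Registry) cache() *DiskCache {
	if r.Cache != nil {
		return r.Cache
	}
	return DefaultCache()
}

func main() {
	var r Registry // zero value: no cache configured
	fmt.Println(r.cache().Dir)

	custom := Registry{Cache: &DiskCache{Dir: "/tmp/custom"}}
	fmt.Println(custom.cache().Dir)
	fmt.Println("default initializations:", defaultInits)
}
```

With this shape, callers that are happy with the default never construct or pass a cache, and the default is built at most once per process.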

27 Feb, 2025 1 commit

server/internal: replace model delete API with new registry handler. (#9347) · 2412adf4

Blake Mizerany authored Feb 27, 2025

This commit introduces a new API implementation for handling
interactions with the registry and the local model cache. The new API is
located in server/internal/registry. The package name is "registry" and
should be considered temporary; it is hidden and not bleeding outside of
the server package. As the commits roll in, we'll start consuming more
of the API and then let reverse osmosis take effect, at which point it
will surface closer to the root level packages as much as needed.

2412adf4

14 Feb, 2025 1 commit

next ollama runner (#7913) · 58245413

Michael Yang authored Feb 14, 2025



feat: add new Ollama engine using ggml through cgo

This change introduces a new way to run pretrained models. It introduces three high-level interfaces and a number of smaller helper interfaces to facilitate this.

- `model.Model` defines the interface for a model architecture. Models such as `llama` and `mllama`, which are provided as examples, can implement the model's forward propagation in the `Forward` method. This method will be called to generate completions. This interface can be found in `model/model.go`.
- `ml.Backend` defines the interface for a backend tensor library, in this case `ggml`. Among other things, a Backend is responsible for loading a pretrained model into hardware (GPU, CPU, etc.) and providing an interface for Models to access loaded tensors. This interface can be found in `ml/backend.go`.
- `ml.Tensor` defines the interface for a tensor and tensor operations.

This is the first implementation of the new engine. Follow-up PRs will implement more features:

- non-greedy sampling (#8410)
- integration with Ollama and KV caching (#8301)
- more model support (#9080) with more coming soon

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>

58245413
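
To make the division of responsibilities concrete, here is a toy sketch of how three such interfaces could fit together. The method sets and signatures below are simplified stand-ins invented for this example (the real definitions live in `model/model.go` and `ml/backend.go` and are not reproduced here), and `vec`, `mapBackend`, and `biasModel` are hypothetical implementations. The sketch assumes tensors of equal length.

```go
package main

import "fmt"

// Tensor is a stand-in for a tensor interface: data plus operations.
type Tensor interface {
	Add(other Tensor) Tensor
	Floats() []float32
}

// Backend is a stand-in for a tensor-library backend that exposes
// loaded weights by name.
type Backend interface {
	Get(name string) Tensor
}

// Model is a stand-in for a model architecture whose Forward method
// computes outputs from inputs using tensors obtained from a Backend.
type Model interface {
	Forward(b Backend, input Tensor) Tensor
}

// vec is a trivial in-memory tensor.
type vec []float32

func (v vec) Floats() []float32 { return v }

// Add returns the element-wise sum; both tensors must have equal length.
func (v vec) Add(other Tensor) Tensor {
	o := other.Floats()
	out := make(vec, len(v))
	for i := range v {
		out[i] = v[i] + o[i]
	}
	return out
}

// mapBackend serves "loaded" tensors from a map.
type mapBackend map[string]Tensor

func (m mapBackend) Get(name string) Tensor { return m[name] }

// biasModel is a one-layer toy model: output = input + bias.
type biasModel struct{}

func (biasModel) Forward(b Backend, input Tensor) Tensor {
	return input.Add(b.Get("bias"))
}

func main() {
	var m Model = biasModel{}
	backend := mapBackend{"bias": vec{1, 2}}
	fmt.Println(m.Forward(backend, vec{10, 20}).Floats())
}
```
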

01 Jan, 2025 1 commit
- Update the /api/create endpoint to use JSON (#7935) · 86a622cb
  Patrick Devine authored Dec 31, 2024
```
Changes `POST /api/create` to accept JSON instead of a Modelfile.

This is a breaking change.
```
  86a622cb
11 Dec, 2024 1 commit
- server: more support for mixed-case model names (#8017) · b1fd7fef
  Blake Mizerany authored Dec 11, 2024
```
Fixes #7944
```
  b1fd7fef
19 Nov, 2024 1 commit

server: allow mixed-case model names on push, pull, cp, and create (#7676) · 4b8a2e34

Blake Mizerany authored Nov 19, 2024

This change allows mixed-case model names to be pushed, pulled, copied,
and created. This was previously disallowed because the Ollama registry
was backed by a Docker registry that enforced a naming convention
without mixed-case names; that is no longer the case.

This does not break existing, intended behaviors.

Also, make TestCase test a story of creating, updating, pulling, and
copying a model with case variations, ensuring the model's manifest is
updated correctly, and not duplicated across different files with
different case variations.

4b8a2e34
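
One way to keep a single manifest per model regardless of letter case is to key storage on a case-folded name while remembering the spelling used at creation. The sketch below shows that idea only: `manifestStore`, `Put`, and the use of `strings.ToLower` as the folding rule are assumptions for illustration, not the server's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// manifestStore is a hypothetical in-memory index of model manifests.
// Keys are case-folded names; values keep the spelling used at creation.
type manifestStore struct {
	byKey map[string]string
}

func newManifestStore() *manifestStore {
	return &manifestStore{byKey: make(map[string]string)}
}

// Put registers a model name. If a name differing only by case already
// exists, the existing entry is reused instead of creating a duplicate.
func (s *manifestStore) Put(name string) (stored string, created bool) {
	key := strings.ToLower(name)
	if existing, ok := s.byKey[key]; ok {
		return existing, false
	}
	s.byKey[key] = name
	return name, true
}

// Len reports how many distinct manifests are tracked.
func (s *manifestStore) Len() int { return len(s.byKey) }

func main() {
	s := newManifestStore()
	fmt.Println(s.Put("Library/Foo:latest"))
	fmt.Println(s.Put("library/foo:LATEST")) // resolves to the first entry
	fmt.Println(s.Len())
}
```
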

18 Oct, 2024 1 commit

image processing for llama3.2 (#6963) · c7cb0f06

Patrick Devine authored Oct 18, 2024


Co-authored-by: jmorganca <jmorganca@gmail.com>
Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Jesse Gross <jesse@ollama.com>

c7cb0f06

01 Oct, 2024 1 commit
- Stop model before deletion if loaded (fixed #6957) (#7050) · f40bb398
  Alex Mavrogiannis authored Oct 01, 2024
  
  f40bb398
27 Aug, 2024 1 commit
- server: clean up route names for consistency (#6524) · 47fa0839
  Jeffrey Morgan authored Aug 26, 2024
  
  47fa0839
13 Aug, 2024 1 commit
- Load Embedding Model on Empty Input (#6325) · 8b00a415
  royjhan authored Aug 13, 2024
```
* load on empty input

* no load on invalid input
```
  8b00a415
02 Aug, 2024 1 commit
- lint · b732beba
  Michael Yang authored Aug 01, 2024
  
  b732beba
22 Jul, 2024 1 commit
- uint64 · 1954ec59
  Michael Yang authored Jul 03, 2024
  
  1954ec59
16 Jul, 2024 1 commit
- server: return empty slice on empty `/api/embed` request (#5713) · 7ac6d462
  Jeffrey Morgan authored Jul 15, 2024
```
* server: return empty slice on empty `/api/embed` request

* fix tests
```
  7ac6d462
15 Jul, 2024 1 commit

Introduce `/api/embed` endpoint supporting batch embedding (#5127) · b9f5e16c

royjhan authored Jul 15, 2024

* Initial Batch Embedding

* Revert "Initial Batch Embedding"

This reverts commit c22d54895a280b54c727279d85a5fc94defb5a29.

* Initial Draft

* mock up notes

* api/embed draft

* add server function

* check normalization

* clean up

* normalization

* playing around with truncate stuff

* Truncation

* Truncation

* move normalization to go

* Integration Test Template

* Truncation Integration Tests

* Clean up

* use float32

* move normalize

* move normalize test

* refactoring

* integration float32

* input handling and handler testing

* Refactoring of legacy and new

* clear comments

* merge conflicts

* touches

* embedding type 64

* merge conflicts

* fix hanging on single string

* refactoring

* test values

* set context length

* clean up

* testing clean up

* testing clean up

* remove function closure

* Revert "remove function closure"

This reverts commit 55d48c6ed17abe42e7a122e69d603ef0c1506787.

* remove function closure

* remove redundant error check

* clean up

* more clean up

* clean up

b9f5e16c

02 Jul, 2024 1 commit

OpenAI: /v1/models and /v1/models/{model} compatibility (#5007) · 996bb1b8

royjhan authored Jul 02, 2024



* OpenAI v1 models

* Refactor Writers

* Add Test

Co-Authored-By: Attila Kerekes

* Credit Co-Author
Co-Authored-By: Attila Kerekes <439392+keriati@users.noreply.github.com>

* Empty List Testing

* Use Namespace for Ownedby

* Update Test

* Add back envconfig

* v1/models docs

* Use ModelName Parser

* Test Names

* Remove Docs

* Clean Up

* Test name
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Add Middleware for Chat and List

* Testing Cleanup

* Test with Fatal

* Add functionality to chat test

* OpenAI: /v1/models/{model} compatibility (#5028)

* Retrieve Model

* OpenAI Delete Model

* Retrieve Middleware

* Remove Delete from Branch

* Update Test

* Middleware Test File

* Function name

* Cleanup

* Test Update

* Test Update

---------
Co-authored-by: Attila Kerekes <439392+keriati@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

996bb1b8

19 Jun, 2024 1 commit

Extend api/show and ollama show to return more model info (#4881) · fedf7163

royjhan authored Jun 19, 2024



* API Show Extended

* Initial Draft of Information
Co-Authored-By: Patrick Devine <pdevine@sonic.net>

* Clean Up

* Descriptive arg error messages and other fixes

* Second Draft of Show with Projectors Included

* Remove Chat Template

* Touches

* Prevent wrapping from files

* Verbose functionality

* Docs

* Address Feedback

* Lint

* Resolve Conflicts

* Function Name

* Tests for api/show model info

* Show Test File

* Add Projector Test

* Clean routes

* Projector Check

* Move Show Test

* Touches

* Doc update

---------
Co-authored-by: Patrick Devine <pdevine@sonic.net>

fedf7163

13 Jun, 2024 1 commit
- add OLLAMA_MODELS to envconfig (#5029) · 94618b23
  Patrick Devine authored Jun 13, 2024
  
  94618b23
07 Jun, 2024 1 commit
- fix create model when template detection errors · 030e765e
  Michael Yang authored Jun 07, 2024
  
  030e765e
06 Jun, 2024 1 commit
- Separate ListResponse and ModelResponse for api/tags vs api/ps (#4842) · 4bf1da49
  royjhan authored Jun 06, 2024
```
* Remove false time fields

* Struct Separation for List and Process

* Remove Marshaler
```
  4bf1da49
04 Jun, 2024 3 commits
- update create handler to use model.Name · d61ef8b9
  Michael Yang authored May 08, 2024
  
  d61ef8b9
- more lint · 8ce4032e
  Michael Yang authored May 29, 2024
  
  8ce4032e
- lint · e40145a3
  Michael Yang authored May 21, 2024
  
  e40145a3
24 May, 2024 1 commit
- Move envconfig and consolidate env vars (#4608) · 4cc3be30
  Patrick Devine authored May 24, 2024
  
  4cc3be30
20 May, 2024 1 commit
- Move the parser back + handle utf16 files (#4533) · ccdf0b2a
  Patrick Devine authored May 20, 2024
  
  ccdf0b2a
14 May, 2024 2 commits
- check if name exists before create/pull/copy · 85a57006
  Michael Yang authored May 13, 2024
  
  85a57006
- Fixed the API endpoint /api/tags when the model list is empty. (#4424) · 798b107f
  Ryo Machida authored May 15, 2024
```
* Fixed the API endpoint /api/tags to return {models: []} instead of {models: null} when the model list is empty.

* Update server/routes.go

---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
```
  798b107f
06 May, 2024 1 commit
- update tests · a7248f6e
  Michael Yang authored Apr 16, 2024
  
  a7248f6e
01 May, 2024 2 commits
- rename parser to model/file · 119589fc
  Michael Yang authored Apr 30, 2024
  
  119589fc
- refactor modelfile parser · c0a00f68
  Michael Yang authored Apr 22, 2024
  
  c0a00f68
08 Apr, 2024 1 commit
- cgo quantize · 9502e566
  Michael Yang authored Apr 05, 2024
  
  9502e566
01 Apr, 2024 1 commit

Switch back to subprocessing for llama.cpp · 58d95cc9

Daniel Hiltgen authored Mar 14, 2024

This should resolve a number of memory leak and stability defects by allowing
us to isolate llama.cpp in a separate process, shut it down when idle, and
gracefully restart it if it has problems. This also serves as a first step
toward running multiple copies to support multiple models concurrently.

58d95cc9

29 Mar, 2024 1 commit
- Add gemma safetensors conversion (#3250) · 5a5efee4
  Patrick Devine authored Mar 28, 2024
```
Co-authored-by: Michael Yang <mxyng@pm.me>
```
  5a5efee4
26 Mar, 2024 1 commit
- change `github.com/jmorganca/ollama` to `github.com/ollama/ollama` (#3347) · 1b272d5b
  Patrick Devine authored Mar 26, 2024
  
  1b272d5b
09 Mar, 2024 1 commit
- add allowed host middleware and remove `workDir` middleware (#3018) · fc8c0445
  Jeffrey Morgan authored Mar 08, 2024
  
  fc8c0445
12 Feb, 2024 1 commit
- Fix issues with templating prompt in chat mode (#2460) · 48a273f8
  Jeffrey Morgan authored Feb 12, 2024
  
  48a273f8
01 Feb, 2024 1 commit
- fix tests · e49dc9f3
  Michael Yang authored Feb 01, 2024
  
  e49dc9f3