Commits · f8c3dbe5b5ee342d97e4c71d684b85b00273c33d · OpenDAS / ollama

20 Mar, 2025 1 commit
- templates: add autotemplate for gemma3 (#9880) · f8c3dbe5
  Patrick Devine authored Mar 20, 2025
```
This change allows the gemma3 template to be autodetected during `ollama
create`.
```
14 Feb, 2025 1 commit

Michael Yang authored Feb 14, 2025

feat: add new Ollama engine using ggml through cgo

This change introduces a new way to run pretrained models. It adds three high-level interfaces and several smaller helper interfaces to support them.

- `model.Model` defines the interface for a model architecture. Models such as `llama` and `mllama`, which are provided as examples, can implement the model's forward propagation in the `Forward` method, which is called to generate completions. This interface can be found in `model/model.go`.
- `ml.Backend` defines the interface for a backend tensor library, in this case `ggml`. Among other things, a Backend is responsible for loading a pretrained model into hardware (GPU, CPU, etc.) and providing an interface for Models to access loaded tensors. This interface can be found in `ml/backend.go`.
- `ml.Tensor` defines the interface for a tensor and tensor operations
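
The relationship between the three interfaces can be sketched roughly as follows. This is a toy illustration only: the method sets, the `sliceTensor`, `mapBackend`, and `biasModel` types, and the `"bias"` tensor name are all hypothetical and far smaller than whatever `model/model.go` and `ml/backend.go` actually define.

```go
package main

import "fmt"

// Tensor is a hypothetical, heavily reduced stand-in for ml.Tensor.
type Tensor interface {
	Add(other Tensor) Tensor
	Values() []float32
}

// Backend is a hypothetical stand-in for ml.Backend: it owns the loaded
// weights and hands them out by name.
type Backend interface {
	Get(name string) Tensor
}

// Model is a hypothetical stand-in for model.Model: an architecture
// expresses its forward pass in terms of Backend and Tensor.
type Model interface {
	Forward(b Backend, input Tensor) Tensor
}

// sliceTensor is a toy CPU tensor backed by a Go slice.
type sliceTensor []float32

func (t sliceTensor) Values() []float32 { return t }

// Add returns the element-wise sum; both tensors are assumed to have the
// same length.
func (t sliceTensor) Add(other Tensor) Tensor {
	o := other.Values()
	out := make(sliceTensor, len(t))
	for i := range t {
		out[i] = t[i] + o[i]
	}
	return out
}

// mapBackend is a toy backend that keeps "loaded" tensors in a map.
type mapBackend map[string]Tensor

func (b mapBackend) Get(name string) Tensor { return b[name] }

// biasModel is a toy architecture whose forward pass adds one bias tensor.
type biasModel struct{}

func (biasModel) Forward(b Backend, input Tensor) Tensor {
	return input.Add(b.Get("bias"))
}

func main() {
	backend := mapBackend{"bias": sliceTensor{1, 1, 1}}
	var m Model = biasModel{}
	out := m.Forward(backend, sliceTensor{1, 2, 3})
	fmt.Println(out.Values()) // prints [2 3 4]
}
```

The point of the split is that `biasModel` never touches storage or hardware directly; swapping `mapBackend` for another `Backend` implementation would not change the model code.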

This is the first implementation of the new engine. Follow-up PRs will implement more features:

- non-greedy sampling (#8410)
- integration with Ollama and KV caching (#8301)
- more model support (#9080), with more coming soon

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>

58245413

01 Jan, 2025 1 commit
- Update the /api/create endpoint to use JSON (#7935) · 86a622cb
  Patrick Devine authored Dec 31, 2024
```
Changes `POST /api/create` to accept JSON instead of a Modelfile.

This is a breaking change.
```
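
As a rough illustration of what a JSON create request could look like from a Go client, the sketch below builds a request body without sending it. The `createRequest` type and its field names (`model`, `from`, `system`) are assumptions made for this example; the commit message does not list the fields, so check the API documentation for the actual schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// createRequest is a hypothetical request shape; the field names are
// assumptions for illustration, not taken from the commit.
type createRequest struct {
	Model  string `json:"model"`
	From   string `json:"from,omitempty"`
	System string `json:"system,omitempty"`
}

// buildCreateBody serializes the request into the JSON body that would be
// sent to POST /api/create.
func buildCreateBody(req createRequest) (string, error) {
	b, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	body, err := buildCreateBody(createRequest{
		Model:  "mario",
		From:   "llama3.2",
		System: "You are Mario.",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

Clients that previously sent Modelfile text in the request would need to map each Modelfile instruction to a structured field instead.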
18 Oct, 2024 1 commit
- image processing for llama3.2 (#6963) · c7cb0f06
  Patrick Devine authored Oct 18, 2024

Co-authored-by: jmorganca <jmorganca@gmail.com>
Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Jesse Gross <jesse@ollama.com>
18 Sep, 2024 1 commit
- server: add tool parsing support for nemotron-mini (#6849) · d05da299
  Jeffrey Morgan authored Sep 17, 2024
23 Aug, 2024 1 commit
- convert safetensor adapters into GGUF (#6327) · 0c819e16
  Patrick Devine authored Aug 23, 2024
12 Aug, 2024 3 commits
- cmd: speed up gguf creates (#6324) · 980dd15f
  Josh authored Aug 12, 2024
- Revert "server: speed up single gguf creates (#5898)" (#6323) · 1dc3ef3a
  Josh authored Aug 12, 2024
```
This reverts commit 8aac2243.
```
- server: speed up single gguf creates (#5898) · 8aac2243
  Josh authored Aug 12, 2024
08 Aug, 2024 1 commit
- manifest: Store layers inside manifests consistently as values. · 7edaf6e7
  Jesse Gross authored Aug 07, 2024

Commit 1829fb61 ("manifest: Fix crash on startup when trying to clean up
unused files (#5840)") changed the config layer stored in manifests
from a pointer to a value. This was done in order to avoid potential
nil pointer dereferences after it is deserialized from JSON in the
event that the field is missing.

This changes the Layers slice to also be stored by value. This enables
consistency in handling across the two objects.
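
The difference between the two storage styles can be sketched as follows. The `layer`, `pointerManifest`, and `valueManifest` types are hypothetical, minimal stand-ins rather than the project's real manifest types; the example only shows how `encoding/json` treats a missing field and a `null` element in each case.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// layer is a hypothetical, minimal stand-in for a manifest layer.
type layer struct {
	Digest string `json:"digest"`
}

// pointerManifest stores the config and layers as pointers.
type pointerManifest struct {
	Config *layer   `json:"config"`
	Layers []*layer `json:"layers"`
}

// valueManifest stores the config and layers by value.
type valueManifest struct {
	Config layer   `json:"config"`
	Layers []layer `json:"layers"`
}

// decodeBoth decodes the same JSON document into both representations.
func decodeBoth(data []byte) (pointerManifest, valueManifest, error) {
	var p pointerManifest
	var v valueManifest
	if err := json.Unmarshal(data, &p); err != nil {
		return p, v, err
	}
	if err := json.Unmarshal(data, &v); err != nil {
		return p, v, err
	}
	return p, v, nil
}

func main() {
	// "config" is missing and the first layer entry is null.
	p, v, err := decodeBoth([]byte(`{"layers":[null,{"digest":"sha256:abc"}]}`))
	if err != nil {
		panic(err)
	}
	// Pointer form: both are nil, so p.Config.Digest would panic.
	fmt.Println(p.Config == nil, p.Layers[0] == nil)
	// Value form: missing or null entries become zero values and are safe to read.
	fmt.Println(v.Config.Digest == "", v.Layers[0].Digest == "", v.Layers[1].Digest)
}
```

With value storage, callers can read fields without a nil check, at the cost of not being able to tell a missing entry apart from an empty one.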


31 Jul, 2024 2 commits
- convert: only extract large files · eafc607a
  Michael Yang authored Jun 29, 2024
- refactor convert · 5e9db9fb
  Michael Yang authored May 31, 2024
22 Jul, 2024 1 commit
- server: collect nested tool call objects when parsing (#5824) · b3e5491e
  Jeffrey Morgan authored Jul 22, 2024
18 Jul, 2024 1 commit
- fix parsing tool calls · 43606d6d
  Michael Yang authored Jul 18, 2024
17 Jul, 2024 2 commits
- marshal json automatically for some template values (#5758) · b2554455
  Michael Yang authored Jul 17, 2024
- parse tool call as individual objects · 5fd69881
  Michael Yang authored Jul 17, 2024
16 Jul, 2024 2 commits
- remove unneeded tool calls · 5a83f79a
  Michael Yang authored Jul 16, 2024
- fix unmarshal type errors · 5afbb60f
  Michael Yang authored Jul 16, 2024
15 Jul, 2024 1 commit
- tools · d02bbebb
  Michael Yang authored Jun 20, 2024
12 Jul, 2024 1 commit
- autodetect stop parameters from template · ebc529cb
  Michael Yang authored Jul 05, 2024
01 Jul, 2024 2 commits
- err on insecure path · 88bcd79b
  Michael Yang authored Jun 30, 2024
- rename templates to template · 58e3fff3
  Michael Yang authored Jun 10, 2024
27 Jun, 2024 1 commit
- zip: prevent extracting files into parent dirs (#5314) · 123a722a
  Michael Yang authored Jun 26, 2024
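
A common way to keep archive entries from escaping the extraction directory is to resolve each entry name against the destination and reject anything that ends up outside it. The sketch below shows that general technique; `safeJoin` and `errInsecurePath` are hypothetical names, and this is not necessarily how the commit implements the check.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// errInsecurePath is a hypothetical sentinel error for this example.
var errInsecurePath = errors.New("insecure path")

// safeJoin joins an archive entry name onto dest and rejects any result
// that would land outside dest (for example via "../" components).
func safeJoin(dest, name string) (string, error) {
	dest = filepath.Clean(dest)
	target := filepath.Join(dest, name)

	// Express target relative to dest; a leading ".." means it escaped.
	rel, err := filepath.Rel(dest, target)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errInsecurePath
	}
	return target, nil
}

func main() {
	for _, name := range []string{"model/config.json", "../../etc/passwd"} {
		p, err := safeJoin("/tmp/extract", name)
		fmt.Printf("%q -> %q, err=%v\n", name, p, err)
	}
}
```

The check would run on every entry name before any file is created, so a single malicious entry aborts extraction rather than writing outside the target directory.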
  
25 Jun, 2024 1 commit
- llm: speed up gguf decoding by a lot (#5246) · cb42e607
  Blake Mizerany authored Jun 24, 2024

Previously, loading GGUF files, including their metadata and tensor
information, was very slow for two reasons:

  * Too many allocations when decoding strings
  * A separate disk read for each key and value, resulting in an
    excessive number of syscalls and disk I/O

The show API now takes 33ms, down from 800ms+, for llama3 on an M3
MacBook Pro.

This commit also prevents collecting large arrays of values when
decoding GGUFs (if desired). When such keys are encountered, their
values are null, and are encoded as such in JSON.

Also, this fixes a broken test that was not encoding valid GGUF.

12 Jun, 2024 1 commit
- fix: multiple templates when creating from model · c16f8af9
  Michael Yang authored Jun 12, 2024

Multiple templates may appear in a model if a model is created from
another model that 1) has an autodetected template and 2) defines a
custom template.

04 Jun, 2024 2 commits
- update create handler to use model.Name · d61ef8b9
  Michael Yang authored May 08, 2024
- lint · e40145a3
  Michael Yang authored May 21, 2024
20 May, 2024 2 commits
- tidy intermediate blobs · f36f1d6b
  Michael Yang authored May 20, 2024
- cache and reuse intermediate blobs · 3520c0e4
  Michael Yang authored May 10, 2024
```
particularly useful for zipfiles and f16s
```
06 May, 2024 5 commits
- close zip files · b2f00aa9
  Michael Yang authored May 06, 2024
- s/DisplayLongest/String/ · f5e8b207
  Michael Yang authored May 01, 2024
- no iterator · 4d0d0fa3
  Michael Yang authored Apr 25, 2024
- comments · 01811c17
  Michael Yang authored Apr 23, 2024
- quantize any fp16/fp32 model · 9685c345
  Michael Yang authored Apr 12, 2024
```
- FROM /path/to/{safetensors,pytorch}
- FROM /path/to/fp{16,32}.bin
- FROM model:fp{16,32}
```