Commits · 86a622cbdc69e9fd501764ff7565e977fc98f00a · OpenDAS / ollama

01 Jan, 2025 1 commit
- Update the /api/create endpoint to use JSON (#7935) · 86a622cb
  Patrick Devine authored Dec 31, 2024
```
Replaces `POST /api/create` to use JSON instead of a Modelfile.

This is a breaking change.
```
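As a rough illustration of what a JSON request to `POST /api/create` could look like, the sketch below builds a request body and an unsent HTTP request. The `createRequest` type, its field names (`model`, `from`, `system`) and the base URL are assumptions made for this example; the commit message itself does not list them.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// createRequest is a hypothetical, trimmed-down request body.
// The field names are assumptions, not taken from the commit message.
type createRequest struct {
	Model  string `json:"model"`
	From   string `json:"from,omitempty"`
	System string `json:"system,omitempty"`
}

// buildCreateBody serializes the request as JSON.
func buildCreateBody(r createRequest) ([]byte, error) {
	return json.Marshal(r)
}

// newCreateRequest prepares (but does not send) a POST to /api/create.
func newCreateRequest(baseURL string, r createRequest) (*http.Request, error) {
	body, err := buildCreateBody(r)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/create", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// The base URL is an assumed local server address.
	req, err := newCreateRequest("http://localhost:11434", createRequest{
		Model:  "mario",
		From:   "llama3.2",
		System: "You are Mario.",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.Path)
}
```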
23 Dec, 2024 1 commit
- server: reuse InvalidModelNameErrMsg type (#8163) · 928de905
  湛露先生 authored Dec 23, 2024
  
15 Dec, 2024 1 commit
- imageproc mllama refactor (#7537) · 8c9fb8eb
  Patrick Devine authored Dec 14, 2024
```
Refactor mllama image processing code, and add pixtral and qwen2vl
```
11 Dec, 2024 1 commit
- server: more support for mixed-case model names (#8017) · b1fd7fef
  Blake Mizerany authored Dec 11, 2024
```
Fixes #7944
```
10 Dec, 2024 2 commits

server: lowercase hostname for Host header check (#5851) · 757eeacc
frob authored Dec 10, 2024


build: Make target improvements (#7499) · 4879a234

Daniel Hiltgen authored Dec 10, 2024

* llama: wire up builtin runner

This adds a new entrypoint into the ollama CLI to run the cgo built runner.
On Mac arm64, this will have GPU support, but on all other platforms it will
be the lowest common denominator CPU build. After we fully transition
to the new Go runners, more tech debt can be removed, and we can stop building
the "default" runner via make and always rely on the builtin.

* build: Make target improvements

Add a few new targets and help for building locally.
This also adjusts the runner lookup to favor local builds, then
runners relative to the executable, and finally payloads.

* Support customized CPU flags for runners

This implements a simplified custom CPU flags pattern for the runners.
When built without overrides, the runner name contains the vector flag
we check for (AVX) to ensure we don't try to run on unsupported systems
and crash.  If the user builds a customized set, we omit the naming
scheme and don't check for compatibility.  This avoids checking
requirements at runtime, so that logic has been removed as well.  This
can be used to build GPU runners with no vector flags, or CPU/GPU
runners with additional flags (e.g. AVX512) enabled.

* Use relative paths

If the user checks out the repo in a path that contains spaces, make gets
confused, so use relative paths for everything in-repo to avoid breakage.

* Remove payloads from main binary

* install: clean up prior libraries

This removes support for v0.3.6 and older versions (before the tar bundle)
and ensures we clean up prior libraries before extracting the bundle(s).
Without this change, runners and dependent libraries could leak when we
update and lead to subtle runtime errors.
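The runner lookup order mentioned in this commit message (local builds first, then runners relative to the executable, then payloads) can be sketched as a first-match search over an ordered list of candidates. The function name, the directory paths and the injected `exists` check are hypothetical; this is not the project's actual lookup code.

```go
package main

import "fmt"

// findRunnerDir returns the first candidate directory for which exists
// reports true. Candidates must be passed in priority order.
func findRunnerDir(candidates []string, exists func(string) bool) (string, bool) {
	for _, dir := range candidates {
		if exists(dir) {
			return dir, true
		}
	}
	return "", false
}

func main() {
	// Hypothetical paths, listed in the priority order the message describes.
	candidates := []string{
		"./build/runners",           // local build output
		"/opt/app/lib/runners",      // relative to the executable
		"/tmp/app-payloads/runners", // extracted payloads
	}

	// A fake filesystem keeps the example self-contained; a real
	// implementation would more likely call os.Stat.
	onDisk := map[string]bool{"/tmp/app-payloads/runners": true}

	dir, ok := findRunnerDir(candidates, func(p string) bool { return onDisk[p] })
	fmt.Println(dir, ok)
}
```

Passing the existence check as a function keeps the priority logic testable without touching the filesystem.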


05 Dec, 2024 2 commits
- api: add generate endpoint for structured outputs (#7939) · c6c52627
  Parth Sareen authored Dec 04, 2024
  
- api: structured outputs - chat endpoint (#7900) · 630e7dc6
  Parth Sareen authored Dec 04, 2024
```
Adds structured outputs to chat endpoint
---------
Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Hieu Nguyen <hieunguyen1053@outlook.com>
```
30 Nov, 2024 2 commits
- server: add warning message for deprecated context field (#7878) · d543b282
  Jeffrey Morgan authored Nov 30, 2024
  
- Enable index tracking for tools - openai api support (#7888) · 5f805118
  Parth Sareen authored Nov 29, 2024
  
27 Nov, 2024 1 commit
- api: enable tool streaming (#7836) · ce7455a8
  Parth Sareen authored Nov 27, 2024
  
23 Nov, 2024 1 commit
- openai: accept X-Stainless-Retry-Count header (#6910) · 31cb1ca9
  oza6ut0ne authored Nov 24, 2024
  
20 Nov, 2024 1 commit
- expose underlying error on embedding failure (#7743) · f602ab4d
  Daniel Hiltgen authored Nov 19, 2024
```
Avoid a round-trip asking users for logs to see what went wrong.
```
19 Nov, 2024 1 commit

server: allow mixed-case model names on push, pull, cp, and create (#7676) · 4b8a2e34

Blake Mizerany authored Nov 19, 2024

This change allows mixed-case model names to be pushed, pulled,
copied, and created. This was previously disallowed because the Ollama
registry was backed by a Docker registry that enforced a naming
convention disallowing mixed-case names. That is no longer the case.

This does not break existing, intended behaviors.

Also, make TestCase test a story of creating, updating, pulling, and
copying a model with case variations, ensuring the model's manifest is
updated correctly, and not duplicated across different files with
different case variations.
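One way to avoid duplicate manifests that differ only in case, as described above, is to resolve a requested name against the names already stored using a case-insensitive comparison. The sketch below is a simplified illustration; `resolveName` and the sample names are hypothetical and not taken from the server code.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveName returns the already-stored spelling of a model name when one
// matches the requested name case-insensitively. Otherwise it returns the
// requested name unchanged, so a new entry keeps the caller's casing.
func resolveName(existing []string, requested string) string {
	for _, name := range existing {
		if strings.EqualFold(name, requested) {
			return name
		}
	}
	return requested
}

func main() {
	stored := []string{"library/MyModel:latest"}

	// Same model, different casing: reuse the stored spelling.
	fmt.Println(resolveName(stored, "library/mymodel:latest"))

	// Unknown model: keep the requested spelling.
	fmt.Println(resolveName(stored, "library/Other:latest"))
}
```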


05 Nov, 2024 1 commit

One corrupt manifest should not wedge model operations (#7515) · a4c70fe1

Daniel Hiltgen authored Nov 05, 2024

One potential failure mode is an empty file, which bubbles up as an EOF error
and causes all pull and list operations to fail. Instead, continue and
warn about the corrupt manifest. This also allows re-pulling the corrupt
manifest to repair the system.
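The "continue and warn" behavior can be sketched as a loop that skips manifests that fail to decode instead of returning the first error. The `manifest` type, the `loadManifests` helper and the in-memory file map are assumptions made to keep the example self-contained; they are not the project's actual types.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"sort"
)

// manifest is a hypothetical, minimal manifest shape.
type manifest struct {
	SchemaVersion int `json:"schemaVersion"`
}

// loadManifests decodes every entry and returns the ones that parsed,
// plus the sorted names of the entries that were skipped.
func loadManifests(files map[string][]byte) (map[string]manifest, []string) {
	good := make(map[string]manifest)
	var skipped []string
	for name, data := range files {
		var m manifest
		if err := json.NewDecoder(bytes.NewReader(data)).Decode(&m); err != nil {
			// An empty file surfaces here as io.EOF. Record it and keep
			// going so one bad manifest does not block the others.
			skipped = append(skipped, name)
			continue
		}
		good[name] = m
	}
	sort.Strings(skipped)
	return good, skipped
}

func main() {
	files := map[string][]byte{
		"good":  []byte(`{"schemaVersion": 2}`),
		"empty": []byte{},
	}
	good, skipped := loadManifests(files)
	for _, name := range skipped {
		fmt.Println("warning: skipping corrupt manifest:", name)
	}
	fmt.Println("usable manifests:", len(good))
}
```

Because the corrupt entry is only skipped, a later re-pull can overwrite it and repair the store.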


04 Nov, 2024 1 commit
- Quiet down debug log of image payload (#7454) · 4ebfa2cb
  Daniel Hiltgen authored Nov 04, 2024
```
Avoid excessive log spew and make consistent with chat logging
```
30 Oct, 2024 1 commit

runner.go: Better abstract vision model integration · c826e574

Jesse Gross authored Oct 11, 2024



* Update mllama to take the cross attention state as embeddings in
  a batch, more similar to how Llava handles it. This improves
  integration with the input cache.
* Pass locations in a prompt for embeddings using tags similar to Llava.
* Abstract interface to vision models so the main runner accesses Clip
  and Mllama similarly.

Co-authored-by: Michael Yang <mxyng@pm.me>


28 Oct, 2024 1 commit
- add mllama image processing to the generate handler (#7384) · 084929c2
  Patrick Devine authored Oct 28, 2024
  
18 Oct, 2024 1 commit

image processing for llama3.2 (#6963) · c7cb0f06

Patrick Devine authored Oct 18, 2024


Co-authored-by: jmorganca <jmorganca@gmail.com>
Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Jesse Gross <jesse@ollama.com>


17 Oct, 2024 1 commit
- Rename gpu package discover (#7143) · 05cd82ef
  Daniel Hiltgen authored Oct 16, 2024
```
Cleaning up go package naming
```
01 Oct, 2024 1 commit
- Stop model before deletion if loaded (fixed #6957) (#7050) · f40bb398
  Alex Mavrogiannis authored Oct 01, 2024
  
12 Sep, 2024 1 commit

Optimize container images for startup (#6547) · cd5c8f64

Daniel Hiltgen authored Sep 12, 2024

* Optimize container images for startup

This change adjusts how to handle runner payloads to support
container builds where we keep them extracted in the filesystem.
This makes it easier to optimize the cpu/cuda vs cpu/rocm images for
size, and should result in faster startup times for container images.

* Refactor payload logic and add buildx support for faster builds

* Move payloads around

* Review comments

* Converge to buildx based helper scripts

* Use docker buildx action for release


11 Sep, 2024 1 commit
- add "stop" command (#6739) · abed273d
  Patrick Devine authored Sep 11, 2024
  
27 Aug, 2024 1 commit
- server: clean up route names for consistency (#6524) · 47fa0839
  Jeffrey Morgan authored Aug 26, 2024
  
13 Aug, 2024 1 commit
- Load Embedding Model on Empty Input (#6325) · 8b00a415
  royjhan authored Aug 13, 2024
```
* load on empty input

* no load on invalid input
```
11 Aug, 2024 1 commit

server: parallelize embeddings in API web handler instead of in subprocess runner (#6220) · 15c2d8fe

Jeffrey Morgan authored Aug 11, 2024

For simplicity, perform parallelization of embedding requests in the API handler instead of offloading this to the subprocess runner. This keeps the scheduling story simpler as it builds on existing parallel requests, similar to existing text completion functionality.
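A minimal sketch of handler-side parallelization follows: each input is embedded in its own goroutine and the results are collected in input order. It uses only the standard library, and `embedAll`, `toyEmbed` and `errEmptyInput` are hypothetical names; the actual handler may use a different mechanism.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errEmptyInput = errors.New("empty input")

// toyEmbed stands in for a real model call. It returns a one-element
// vector holding the input length.
func toyEmbed(s string) ([]float32, error) {
	if s == "" {
		return nil, errEmptyInput
	}
	return []float32{float32(len(s))}, nil
}

// embedAll runs embed for every input concurrently and returns the
// embeddings in the same order as the inputs. If any call fails, the
// first error (by input position) is returned.
func embedAll(inputs []string, embed func(string) ([]float32, error)) ([][]float32, error) {
	results := make([][]float32, len(inputs))
	errs := make([]error, len(inputs))

	var wg sync.WaitGroup
	for i, in := range inputs {
		wg.Add(1)
		go func(i int, in string) {
			defer wg.Done()
			// Each goroutine writes only to its own index, so no lock is needed.
			results[i], errs[i] = embed(in)
		}(i, in)
	}
	wg.Wait()

	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return results, nil
}

func main() {
	out, err := embedAll([]string{"a", "bbb", "cc"}, toyEmbed)
	fmt.Println(out, err)
}
```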


07 Aug, 2024 1 commit

manifest: Fix crash on startup when trying to clean up unused files (#5840) · 1829fb61

Jesse Gross authored Aug 05, 2024

Currently, if the config field is missing from the manifest file (or is
corrupted), Ollama will crash when it tries to read it. This can
happen at startup or when pulling new models.

This data is mostly just used for showing model information so we
can be tolerant of it not being present - it is not required to
run the models. Besides avoiding crashing, this also gives us the
ability to restructure the config in the future by pulling it
into the main manifest file.
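Tolerating a missing config can be sketched by making the field optional and checking it before use. The `manifest` and `layer` types and both helpers below are hypothetical simplifications, not the project's actual definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// layer and manifest are hypothetical, simplified shapes.
type layer struct {
	Digest string `json:"digest"`
}

type manifest struct {
	Config *layer  `json:"config"` // nil when the field is absent
	Layers []layer `json:"layers"`
}

// parseManifest decodes a manifest from JSON.
func parseManifest(data []byte) (manifest, error) {
	var m manifest
	err := json.Unmarshal(data, &m)
	return m, err
}

// configDigest reports the config digest if a config is present.
// Callers treat a missing config as "no extra info" rather than an error.
func configDigest(m manifest) (string, bool) {
	if m.Config == nil {
		return "", false
	}
	return m.Config.Digest, true
}

func main() {
	// A manifest with layers but no config field.
	m, err := parseManifest([]byte(`{"layers":[{"digest":"sha256:abc"}]}`))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	digest, ok := configDigest(m)
	fmt.Println(digest, ok, len(m.Layers))
}
```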


02 Aug, 2024 1 commit
- lint · b732beba
  Michael Yang authored Aug 01, 2024
  
01 Aug, 2024 5 commits
- Refactor and format code. · 8a9f946c
  Vyacheslav Moskalev authored Aug 02, 2024
  
- Refactor code. Remove extra variable. · 3b521054
  Vyacheslav Moskalev authored Aug 01, 2024
  
- Better types and naming closer to style. · b0c21658
  Vyacheslav Moskalev authored Aug 01, 2024
  
- Change the order of context and prompt. · 49a54831
  Vyacheslav Moskalev authored Aug 01, 2024
  
- Fix extra context concatenation in generate handler (#5980). · 6bc5c137
  Vyacheslav Moskalev authored Aug 01, 2024
  
30 Jul, 2024 1 commit

Add Metrics to `api/embed` response (#5709) · 1b44d873

royjhan authored Jul 30, 2024

* add prompt tokens to embed response

* rm slog

* metrics

* types

* prompt n

* clean up

* reset submodule

* update tests

* test name

* list metrics


26 Jul, 2024 1 commit
- include modelfile messages · 15af5584
  Michael Yang authored Jun 19, 2024
  
22 Jul, 2024 5 commits
- fix dupe err message (#5857) · db0968f3
  Josh authored Jul 22, 2024
  
- bool · 55cd3ddc
  Michael Yang authored Jul 03, 2024
  
- origins · d1a5227c
  Michael Yang authored Jul 03, 2024
  
- rfc: dynamic environ lookup · 35b89b2e
  Michael Yang authored Jul 03, 2024
  
- server: collect nested tool call objects when parsing (#5824) · b3e5491e
  Jeffrey Morgan authored Jul 22, 2024
  