Commits · c6bcdc4223c50071b59a19c42cc54ec9932f696f · OpenDAS / ollama

13 May, 2025 4 commits
- Revert "remove cuda v11 (#10569)" (#10692) · c6bcdc42
  Daniel Hiltgen authored May 13, 2025
```
Bring back v11 until we can better warn users that their driver
is too old.

This reverts commit fa393554.
```
  c6bcdc42
- llama: fix crash on snowflake embedding model (#10690) · 4b903f08
  Jeffrey Morgan authored May 13, 2025
  
  4b903f08
- server: add webp image input support (#10653) · c7f4ae7b
  Jeffrey Morgan authored May 12, 2025
  
  c7f4ae7b
- fix vocabulary (#10679) · 526b2ed1
  Michael Yang authored May 12, 2025
  
  526b2ed1
12 May, 2025 5 commits
- models: remove unused qwen2vl processing (#10677) · a7240c6d
  Bruce MacDonald authored May 12, 2025
  
  a7240c6d
- Follow up to #10363 (#10647) · 9d6df908
  Daniel Hiltgen authored May 12, 2025
```
The quantization PR didn't block all unsupported file types,
which this PR fixes.  It also updates the API docs to reflect
the now reduced set of supported types.
```
  9d6df908
- llama: update to commit de4c07f93 (#10655) · 0cefd46f
  Jeffrey Morgan authored May 12, 2025
  
  0cefd46f
- convert: quantize from safetensors needs kv (#10675) · ad035ad5
  Bruce MacDonald authored May 12, 2025
```
When creating a quantized model from safetensors we
need the array KV values to be loaded. Changing this
value to -1 loads the KV values on the returned
layer to be used and saved during quantization.
```
  ad035ad5
- feat: add trace log level (#10650) · f95a1f2b
  Michael Yang authored May 12, 2025
```
reduce prompt log to trace level
```
  f95a1f2b
11 May, 2025 2 commits
- readme: add UnityCodeLama to community integrations (#10665) · 82a9e946
  HardCodeDev authored May 12, 2025
  
  82a9e946
- readme: add OllamaPlusPlus C++ library to community integrations (#10664) · 76724e2f
  HardCodeDev authored May 12, 2025
  
  76724e2f
10 May, 2025 5 commits
- llama: allocate grammar buffer based on schema length (#10649) · ecf14a22
  frob authored May 10, 2025
  
  ecf14a22
- envconfig: Remove no longer supported max vram var (#10623) · 69ce44b3
  frob authored May 10, 2025
```
Co-authored-by: Richard Lyons <frob@cloudstaff.com>
```
  69ce44b3
- feat: add threshold to dump options (#10639) · 5969674c
  Michael Yang authored May 10, 2025
```
ml.Dump will preserve default values if not specified
```
  5969674c
- readme: add ojira to community integrations (#10648) · 867d75b2
  AliAhmedNada authored May 10, 2025
  
  867d75b2
- cmd: strip single quotes from image page (#10636) · 3fa78598
  Bruce MacDonald authored May 09, 2025
  
  3fa78598
08 May, 2025 5 commits
- fix: stream accumulator exits early (#10593) · 0d6e35d3
  Michael Yang authored May 08, 2025

  The stream accumulator exits as soon as it sees `api.ProgressResponse(status="success")`, which isn't strictly correct
  since some requests may have multiple successes, e.g. `/api/create` when the source model needs to be pulled.

  0d6e35d3
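
The behaviour described in 0d6e35d3 can be illustrated with a minimal sketch. It is not the code from the commit: `ProgressResponse` here is a hypothetical stand-in for `api.ProgressResponse` that models only a `Status` field, and every status string other than `"success"` is invented for the example. The point is that the accumulator drains the stream until it is closed instead of returning on the first `"success"`.

```go
package main

import "fmt"

// ProgressResponse is a hypothetical stand-in for api.ProgressResponse;
// only the Status field is modeled here.
type ProgressResponse struct {
	Status string
}

// accumulate drains the whole stream and returns every status it saw.
// It deliberately does not return on the first "success", because one
// request may report success more than once.
func accumulate(stream <-chan ProgressResponse) []string {
	var statuses []string
	for resp := range stream {
		statuses = append(statuses, resp.Status)
	}
	return statuses
}

func main() {
	stream := make(chan ProgressResponse, 4)
	// Simulated create request that first has to pull its source model.
	stream <- ProgressResponse{Status: "pulling source model"} // invented status
	stream <- ProgressResponse{Status: "success"}              // pull finished
	stream <- ProgressResponse{Status: "creating model"}       // invented status
	stream <- ProgressResponse{Status: "success"}              // create finished
	close(stream)

	fmt.Println(len(accumulate(stream))) // 4
}
```

An accumulator that stopped at the first `"success"` would have seen only two of the four messages in this simulated stream.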

- lint: enable usetesting, disable tenv (#10594) · 6e9a7a25
  Michael Yang authored May 08, 2025
  
  6e9a7a25
- chore: remove unused ZipReader type (#10621) · b585a581
  Michael Yang authored May 08, 2025
  
  b585a581
- api: remove unused sampling parameters (#10581) · fa9973cd
  Jeffrey Morgan authored May 08, 2025
  
  fa9973cd
- ollamarunner: Use correct constant to remove cache entries · 3d9498a4
  Jesse Gross authored May 07, 2025

  The correct constant to remove all entries to the end of the sequence
  for the Ollama engine is math.MaxInt32. -1 is used by the old engine.

  The impact of this is currently minimal because it would only occur
  in situations that are not supported by the implemented models or
  rarely used options.

  3d9498a4
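
To make the sentinel in 3d9498a4 concrete, here is a small sketch. `removeRange` is a hypothetical helper, not the ollamarunner or cache API; the only detail taken from the commit message is that `math.MaxInt32` means "through the end of the sequence", while `-1` is the old engine's convention. How the real cache reacts to `-1` is not stated in the source, so the second call below only shows what happens in this sketch.

```go
package main

import (
	"fmt"
	"math"
)

// removeRange is a hypothetical helper, not the real cache API. It drops
// the entries in [begin, end) and treats end == math.MaxInt32 as
// "through the end of the sequence".
func removeRange(seq []int, begin, end int32) []int {
	n := int32(len(seq))
	if end == math.MaxInt32 || end > n {
		end = n
	}
	if begin < 0 {
		begin = 0
	}
	if begin >= end {
		return seq // empty range: nothing to remove
	}
	// The three-index slice caps capacity so append never overwrites seq.
	return append(seq[:begin:begin], seq[end:]...)
}

func main() {
	tokens := []int{10, 11, 12, 13, 14}

	// New-engine sentinel: remove everything from index 2 onward.
	fmt.Println(removeRange(tokens, 2, math.MaxInt32)) // [10 11]

	// Old-engine sentinel: in this sketch -1 is an empty range,
	// so nothing is removed.
	fmt.Println(removeRange(tokens, 2, -1)) // [10 11 12 13 14]
}
```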

07 May, 2025 5 commits
- CI: trigger downstream release process (#10508) · 3098c8b2
  Daniel Hiltgen authored May 07, 2025
  
  3098c8b2
- sched: fix race leading to orphaned runners (#10599) · 5e380c3b
  Daniel Hiltgen authored May 07, 2025

  If a model is loading, and the request context is canceled during the load
  by a client closing the connection, and another request is inbound for the
  same model with a different configuration (context size, etc.) thus requiring
  a reload, two unload events can be in flight. The first shuts down the
  original model load, but the second one caused the loss of the new
  reloading runner reference, thus triggering the leak.

  The primary fix is detecting the duplicate unload and ignoring the second
  instance. The load routine is also hardened to ensure we detect
  clobbering an already present runner and unload it with a warning.

  5e380c3b
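
The two safeguards described in 5e380c3b can be sketched as follows. This is an illustration under assumptions, not the scheduler's actual code: the `scheduler` and `runner` types, their fields, and the identity check in `unload` are all hypothetical. The sketch shows a duplicate unload being ignored so that it cannot erase the reference to a newer runner, and a load that stops an already present runner with a warning.

```go
package main

import (
	"fmt"
	"sync"
)

// runner and scheduler are hypothetical types for this sketch only.
type runner struct {
	id      int
	stopped bool
}

type scheduler struct {
	mu     sync.Mutex
	loaded map[string]*runner
}

// load registers r for model. An already present runner is stopped with
// a warning instead of being silently overwritten.
func (s *scheduler) load(model string, r *runner) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if old, ok := s.loaded[model]; ok && old != r {
		fmt.Printf("warning: unloading clobbered runner %d for %s\n", old.id, model)
		old.stopped = true
	}
	s.loaded[model] = r
}

// unload stops r and reports whether it did any work. A second unload of
// the same runner is ignored, and the map entry is removed only if it
// still points at r, so a newer runner's reference is never lost.
func (s *scheduler) unload(model string, r *runner) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if r.stopped {
		return false // duplicate unload
	}
	r.stopped = true
	if s.loaded[model] == r {
		delete(s.loaded, model)
	}
	return true
}

func main() {
	s := &scheduler{loaded: map[string]*runner{}}

	first := &runner{id: 1}
	s.load("m", first)
	s.unload("m", first) // client disconnected during the load

	second := &runner{id: 2}
	s.load("m", second) // reload with a different configuration

	dup := s.unload("m", first) // second unload event for the old runner
	fmt.Println(dup, s.loaded["m"] == second) // false true
}
```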

- api: remove unused RetrieveModelResponse type (#10603) · 392de840
  Jeffrey Morgan authored May 06, 2025
  
  392de840
- fix data race in WriteGGUF (#10598) · af31ccef
  Daniel Hiltgen authored May 06, 2025
```
err in the go routine should not be shared with the outer scope
```
  af31ccef
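
The general pattern behind af31ccef can be shown with a short sketch. It is not the `WriteGGUF` code: `writeAll`, `errWrite`, and the item names are hypothetical. Each goroutine declares its own `err` with `:=` rather than assigning to a variable captured from the enclosing function, and writes its result into a slot that no other goroutine touches.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errWrite is a hypothetical sentinel error used by the demo below.
var errWrite = errors.New("write failed")

// writeAll calls write for every item concurrently. Each goroutine keeps
// err local (declared with :=) instead of assigning to a variable shared
// with the outer scope, so no two goroutines write the same memory.
func writeAll(items []string, write func(string) error) error {
	errs := make([]error, len(items)) // one slot per goroutine
	var wg sync.WaitGroup
	for i, item := range items {
		wg.Add(1)
		go func(i int, item string) {
			defer wg.Done()
			if err := write(item); err != nil { // local to this goroutine
				errs[i] = err
			}
		}(i, item)
	}
	wg.Wait()
	return errors.Join(errs...) // nil when every slot is nil
}

func main() {
	err := writeAll([]string{"a", "bad", "c"}, func(s string) error {
		if s == "bad" {
			return errWrite
		}
		return nil
	})
	fmt.Println(err) // write failed
}
```

Running a program like this with `go run -race` is one way to check that no shared write remains.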

- remove cuda v11 (#10569) · fa393554
  Daniel Hiltgen authored May 06, 2025

  This reduces the size of our Windows installer payloads by ~256M by dropping
  support for nvidia drivers older than Feb 2023. Hardware support is unchanged.

  Linux default bundle sizes are reduced by ~600M to 1G.

  fa393554

06 May, 2025 5 commits
- readme: add Flufy to community integrations (#9719) · 307e3b3e
  Aharon Bensadoun authored May 07, 2025
  
  307e3b3e
- server: send 405 instead of 404 for unallowed methods (#10275) · 4090aca9
  Devon Rifkin authored May 06, 2025
```
Fixes: #5483
```
  4090aca9
- server: remove internal cmd (#10595) · 92ce438d
  Michael Yang authored May 06, 2025
  
  92ce438d
- Move quantization to new backend (#10363) · 42481045
  Daniel Hiltgen authored May 06, 2025
```
* Move quantization logic to GGML via new backend

This moves the model-aware logic to Go code and calls GGML's quantization code for model creation.

* Remove "add model quantizations"

This is no longer needed now that quantization is implemented in Go+GGML code directly.
```
  42481045
- discover: fix compiler warnings (#10572) · 95e744be
  Michael Yang authored May 06, 2025
  
  95e744be
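
For the `server: send 405 instead of 404 for unallowed methods` commit above (4090aca9), the difference between the two status codes can be sketched with the standard library alone. This is an assumption-laden illustration: it uses `net/http` rather than the server's actual router, and the route, handler, and response body are placeholders chosen for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMux registers one illustrative GET-only route.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/version", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			// The path exists, so the method is the problem: 405, not 404.
			w.Header().Set("Allow", http.MethodGet)
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		fmt.Fprintln(w, `{"version":"0.0.0"}`) // placeholder body
	})
	return mux
}

// status sends one in-memory request and returns the response code.
func status(method, path string) int {
	rec := httptest.NewRecorder()
	newMux().ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code
}

func main() {
	fmt.Println(status(http.MethodGet, "/api/version"))  // 200
	fmt.Println(status(http.MethodPost, "/api/version")) // 405
	fmt.Println(status(http.MethodGet, "/api/missing"))  // 404
}
```

A 405 tells the client that the path is valid and only the method needs to change, which a 404 cannot convey.
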
05 May, 2025 7 commits
- api: remove unused or unsupported api options (#10574) · 3b2d2c83
  Jeffrey Morgan authored May 05, 2025
```
Some options listed in api/types.go are not supported in
newer models, or have been deprecated in the past. This is
the first of a series of PRs to clean up the API options
```
  3b2d2c83
- create blobs in parallel (#10135) · d931ee8f
  Michael Yang authored May 05, 2025
```
* default max term height
* error on out of tree files
```
  d931ee8f
- ggml: Reduce log level of "key not found" · 70736007
  Jesse Gross authored May 05, 2025
```
Most of the time this is not an error.
```
  70736007
- win: lint fix (#10571) · b1c40138
  Daniel Hiltgen authored May 05, 2025
  
  b1c40138
- Hide empty terminal window (#8668) · 17466217
  Ashok Gelal authored May 05, 2025
```
This hides the LlamaServer blank window when chatting outside of the terminal (say like with an app like Msty). This has no other side effects when invoking it the regular way.
```
  17466217
- server: fix panic when runner.Options is nil (#10566) · 1703d147
  Jeffrey Morgan authored May 05, 2025
  
  1703d147
- all: fix cgo compiler warnings on windows (#10563) · 91390502
  Jeffrey Morgan authored May 05, 2025
  
  91390502
04 May, 2025 1 commit
- file close check and close. (#10554) · 7e5c8eee
  湛露先生 authored May 05, 2025
```
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
```
  7e5c8eee
03 May, 2025 1 commit
- win: ensure ollama paths come first (#10549) · 6a74bba7
  Daniel Hiltgen authored May 03, 2025

  For all search path env vars make sure our dirs are first
  to avoid potentially finding other incompatible libraries
  on the user's system.

  Also fixes a minor build script glitch for windows rocm

  6a74bba7
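
The idea in 6a74bba7 of putting the application's own directories first in a search-path variable can be sketched as below. The helper names and directory paths are hypothetical, and the comparison is a plain string match, which is a simplification: real Windows paths are case-insensitive and may differ in trailing separators.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// joinList joins directories with the platform's list separator
// (":" on Unix, ";" on Windows).
func joinList(dirs ...string) string {
	return strings.Join(dirs, string(os.PathListSeparator))
}

// prependDir returns list with dir moved (or added) to the front, so that
// libraries in dir are found before any others. Later duplicates of dir
// and empty entries are dropped.
func prependDir(list, dir string) string {
	parts := []string{dir}
	for _, p := range filepath.SplitList(list) {
		if p != "" && p != dir {
			parts = append(parts, p)
		}
	}
	return joinList(parts...)
}

func main() {
	// Hypothetical directories; a real caller would start from os.Getenv("PATH").
	current := joinList("/opt/other/lib", "/opt/app/lib")
	fmt.Println(prependDir(current, "/opt/app/lib"))
}
```

The same helper could be applied to each search-path variable in turn before the environment is handed to a child process.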