- 08 Dec, 2025 1 commit
nicole pardal authored
This PR consolidates all embedding prompt-length checking, truncation, and prompt token counting into the runner to ensure a single source of truth.
- 28 Oct, 2025 2 commits
Patrick Devine authored
Patrick Devine authored
This reverts commit 5d347f6d.
- 27 Oct, 2025 1 commit
nicole pardal authored
Currently, checking the length of embedding prompts to ensure they fit in the context window (and truncating them if necessary) happens in two places: the Ollama server and the runner. This can lead to inconsistencies in both the checks and the reported number of tokens processed. Since we have to do this processing in the runner anyway, this change consolidates all of the logic there.
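In sketch form, the consolidated runner-side check looks roughly like this; the function and parameter names are illustrative, not the actual Ollama runner API:

```go
package runner

import "fmt"

// truncatePrompt is an illustrative sketch of a single-source-of-truth
// check: the runner validates the tokenized prompt against the context
// window, optionally truncates it, and the resulting length is what
// gets reported as the prompt token count.
func truncatePrompt(tokens []int32, numCtx int, truncate bool) ([]int32, error) {
	if len(tokens) <= numCtx {
		return tokens, nil
	}
	if !truncate {
		return nil, fmt.Errorf("input length %d exceeds context length %d", len(tokens), numCtx)
	}
	return tokens[:numCtx], nil // reported token count is now numCtx
}
```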
- 20 Oct, 2025 1 commit
Jeffrey Morgan authored
- 18 Sep, 2025 1 commit
Michael Yang authored
- 09 Sep, 2025 1 commit
Daniel Hiltgen authored
* tests: reduce stress on CPU to 2 models. This should avoid flakes due to systems getting overloaded with 3 (or more) models running concurrently.
* tests: allow slow systems to pass on timeout. If a slow system is still streaming a response, and the response will pass validation, don't fail just because the system is slow.
* test: unload embedding models more quickly
- 29 Apr, 2025 1 commit
Daniel Hiltgen authored
The cleanup routine from InitServerConnection should run in the defer of the test case so that failures are properly detected and the server logs are reported.
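The pattern being fixed, sketched with a hypothetical startServer helper standing in for the real InitServerConnection:

```go
package integration

import (
	"context"
	"testing"
)

// startServer is a hypothetical stand-in for the real test helper; it
// returns a cleanup func that can observe the test outcome and dump
// server logs.
func startServer(ctx context.Context, t *testing.T) func() {
	return func() {
		if t.Failed() {
			t.Log("server logs would be reported here")
		}
	}
}

func TestServerCleanup(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	// Deferring the cleanup inside the test case (not inside a helper)
	// means it runs after the test body, sees any failure, and can
	// report the logs.
	cleanup := startServer(ctx, t)
	defer cleanup()

	// ... test body ...
}
```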
- 22 Oct, 2024 1 commit
Daniel Hiltgen authored
Use cosine similarity to make the embeddings tests more robust
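Cosine similarity compares the direction of two vectors rather than their exact components, so small numeric drift between runs or backends doesn't fail the test. A minimal sketch (names are illustrative):

```go
package integration

import "math"

// cosineSimilarity returns dot(a, b) / (|a| * |b|), which is 1.0 for
// vectors pointing in exactly the same direction. Tests can assert the
// result stays above a threshold (e.g. 0.99) instead of comparing
// components exactly.
func cosineSimilarity(a, b []float32) float64 {
	var dot, normA, normB float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		normA += float64(a[i]) * float64(a[i])
		normB += float64(b[i]) * float64(b[i])
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}
```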
- 22 Aug, 2024 1 commit
Daniel Hiltgen authored
* Fix embeddings memory corruption. The patch was leading to a buffer overrun. Once removed, though, parallelism in server.cpp led to hitting an assert due to slot/seq IDs being >= token count. To work around this, only use slot 0 for embeddings.
* Fix embed integration test assumption. The token eval count has changed with recent llama.cpp bumps (0.3.5+).
- 30 Jul, 2024 1 commit
royjhan authored
* add prompt tokens to embed response
* rm slog
* metrics
* types
* prompt n
* clean up
* reset submodule
* update tests
* test name
* list metrics
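In effect, the /api/embed response gained a prompt token count alongside the embeddings. A sketch of the response shape (the field layout here is an approximation of Ollama's public API, not a verbatim copy):

```go
package api

import "time"

// EmbedResponse sketches the response after this change: in addition
// to the embeddings, the server reports how many prompt tokens were
// processed, plus basic duration metrics.
type EmbedResponse struct {
	Model           string        `json:"model"`
	Embeddings      [][]float32   `json:"embeddings"`
	TotalDuration   time.Duration `json:"total_duration,omitempty"`
	LoadDuration    time.Duration `json:"load_duration,omitempty"`
	PromptEvalCount int           `json:"prompt_eval_count,omitempty"`
}
```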
- 24 Jul, 2024 1 commit
royjhan authored
* float cmp
* increase tolerance
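The usual shape of such a fix is an approximate comparison instead of exact equality; a minimal sketch with an illustrative helper name:

```go
package integration

import "math"

// floatsMatch reports whether two embedding components agree within a
// tolerance, avoiding flaky exact comparisons of float32 results.
func floatsMatch(a, b float32, tolerance float64) bool {
	return math.Abs(float64(a)-float64(b)) <= tolerance
}
```

Increasing the tolerance then just means passing a larger value at the call site.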
- 15 Jul, 2024 1 commit
royjhan authored
* Initial Batch Embedding
* Revert "Initial Batch Embedding". This reverts commit c22d54895a280b54c727279d85a5fc94defb5a29.
* Initial Draft
* mock up notes
* api/embed draft
* add server function
* check normalization
* clean up
* normalization
* playing around with truncate stuff
* Truncation
* Truncation
* move normalization to go
* Integration Test Template
* Truncation Integration Tests
* Clean up
* use float32
* move normalize
* move normalize test
* refactoring
* integration float32
* input handling and handler testing
* Refactoring of legacy and new
* clear comments
* merge conflicts
* touches
* embedding type 64
* merge conflicts
* fix hanging on single string
* refactoring
* test values
* set context length
* clean up
* testing clean up
* testing clean up
* remove function closure
* Revert "remove function closure". This reverts commit 55d48c6ed17abe42e7a122e69d603ef0c1506787.
* remove function closure
* remove redundant error check
* clean up
* more clean up
* clean up
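Of these steps, "move normalization to go" is the most load-bearing: embeddings are L2-normalized server-side so they compare as unit vectors. A sketch of that normalization (the helper name is an assumption):

```go
package server

import "math"

// normalize scales an embedding so the sum of its squared components
// is 1 (L2 normalization); for unit vectors, a plain dot product
// equals cosine similarity.
func normalize(vec []float32) []float32 {
	var sum float64
	for _, v := range vec {
		sum += float64(v) * float64(v)
	}
	var norm float32
	if sum > 0 {
		norm = float32(1.0 / math.Sqrt(sum))
	}
	out := make([]float32, len(vec))
	for i, v := range vec {
		out[i] = v * norm
	}
	return out
}
```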