- 17 Nov, 2025 2 commits
-
Eva H authored
-
Jeffrey Morgan authored
-
- 16 Nov, 2025 6 commits
-
omahs authored
-
Joel Bryan Juliano authored
Kdeps is an AI framework for declaratively building Dockerized full-stack AI applications that use Ollama LLM models on the backend.
-
pierwill authored
Co-authored-by: pierwill <pierwill@users.noreply.github.com>
-
Vignesh Skanda authored
-
Laurențiu Nicola authored
-
Patrick Devine authored
This change adds a basic benchmarking test framework for Ollama, which can be used to determine the prefill, eval, load, and total durations for running a given model or models.
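For illustration, a rough sketch (not the benchmarking framework added here) of how these timings can be read from the HTTP API: a non-streaming /api/generate response reports load, prompt eval (prefill), eval, and total durations in nanoseconds. The model name and prompt below are placeholders.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// metrics mirrors the timing fields returned by POST /api/generate.
type metrics struct {
	TotalDuration      time.Duration `json:"total_duration"`
	LoadDuration       time.Duration `json:"load_duration"`
	PromptEvalDuration time.Duration `json:"prompt_eval_duration"`
	EvalDuration       time.Duration `json:"eval_duration"`
	EvalCount          int           `json:"eval_count"`
}

func main() {
	body, _ := json.Marshal(map[string]any{
		"model":  "llama3.2", // placeholder model name
		"prompt": "Why is the sky blue?",
		"stream": false,
	})

	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var m metrics
	if err := json.NewDecoder(resp.Body).Decode(&m); err != nil {
		panic(err)
	}
	fmt.Printf("load=%v prefill=%v eval=%v total=%v tokens=%d\n",
		m.LoadDuration, m.PromptEvalDuration, m.EvalDuration, m.TotalDuration, m.EvalCount)
}
```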
-
- 14 Nov, 2025 2 commits
-
Daniel Hiltgen authored
Many recent GPU discovery failures can be traced to incorrect override settings. This extra logging should help spot these quickly and guide users to try unsetting them first.
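As a loose sketch of the idea (not this commit's code), a process can log any GPU visibility overrides that happen to be set so misconfiguration is easy to spot; the variables below are a typical, not exhaustive, set.

```go
package main

import (
	"log/slog"
	"os"
)

func main() {
	// Common GPU visibility/override variables; an incorrect value here
	// is a frequent cause of failed GPU discovery.
	overrides := []string{
		"CUDA_VISIBLE_DEVICES",
		"HIP_VISIBLE_DEVICES",
		"ROCR_VISIBLE_DEVICES",
		"GPU_DEVICE_ORDINAL",
	}
	for _, name := range overrides {
		if v, ok := os.LookupEnv(name); ok {
			slog.Warn("GPU override is set; try unsetting it if discovery fails",
				"name", name, "value", v)
		}
	}
}
```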
-
Parth Sareen authored
-
- 13 Nov, 2025 9 commits
-
Michael Yang authored
-
Michael Yang authored
* use slice/chunks
* bert
* llama4
* gemma3n
* gptoss
* mistral3
* qwen3vl
* qwen25vl
* deepseek2
* remove unused ops
-
Parth Sareen authored
-
Michael Yang authored
* slice
* chunk, chunksections
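By analogy only: the standard library's slices.Chunk (Go 1.23) shows what slicing and chunking mean for plain Go slices; the new ml tensor ops referenced in this commit are a separate API and are not shown here.

```go
package main

import (
	"fmt"
	"slices" // slices.Chunk requires Go 1.23
)

func main() {
	data := []int{0, 1, 2, 3, 4, 5, 6, 7}

	// Slice: take a contiguous sub-range.
	fmt.Println(data[2:6]) // [2 3 4 5]

	// Chunk: iterate over consecutive pieces of at most 3 elements.
	for c := range slices.Chunk(data, 3) {
		fmt.Println(c) // [0 1 2], [3 4 5], [6 7]
	}
}
```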
-
nicole pardal authored
-
Kowyo authored
-
Jeffrey Morgan authored
The code in this directory has been replaced with the new Go version in the 'app' directory.
-
Radhi authored
-
Jeffrey Morgan authored
-
- 12 Nov, 2025 3 commits
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
* docs: vulkan information
* Revert "CI: Set up temporary opt-out Vulkan support (#12614)"
  This reverts commit 8b6e5bae.
* vulkan: temporary opt-in for Vulkan support
  Revert this once we're ready to enable by default.
* win: add vulkan CI build
-
Daniel Hiltgen authored
This should be reverted once we update ggml past b6897
-
- 11 Nov, 2025 14 commits
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
Bruce MacDonald authored
Some route endpoints return an empty response with a 200 OK. These should be documented in the OpenAPI doc. Note that the previous deletion response was not correct.
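For context, a hedged sketch (not Ollama's actual route code) of a handler that responds this way; the OpenAPI document then needs a 200 response entry with no body schema. The path below is a placeholder.

```go
package main

import "net/http"

func main() {
	// A route like this returns 200 OK with an empty body, which the
	// OpenAPI document should describe explicitly (no response schema).
	http.HandleFunc("/api/example-delete", func(w http.ResponseWriter, r *http.Request) {
		// ... perform the operation ...
		w.WriteHeader(http.StatusOK) // empty body
	})
	http.ListenAndServe(":8080", nil)
}
```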
-
Jesse Gross authored
We currently assign model layers to GPUs according to free VRAM, which assumes that GPU performance is roughly equal. This does not work well for mixed dGPU and iGPU systems, because iGPUs typically use system memory, which is large but slow. This instead assigns layers to dGPUs first and then iGPUs. In the future, this could be generalized to a more fine-grained notion of GPU performance, but the dGPU vs. iGPU gap is the most extreme one.
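A rough sketch of the idea, with made-up types and sizes rather than the scheduler's actual code: sort discovered GPUs so discrete GPUs come before integrated ones, then fill each in turn based on free VRAM.

```go
package main

import (
	"fmt"
	"sort"
)

type gpu struct {
	Name       string
	Integrated bool
	FreeVRAM   uint64 // bytes
}

// assignLayers distributes layerSize-byte layers across GPUs,
// preferring discrete GPUs over integrated ones.
func assignLayers(gpus []gpu, numLayers int, layerSize uint64) map[string]int {
	// dGPUs first; within each class, larger free VRAM first.
	sort.SliceStable(gpus, func(i, j int) bool {
		if gpus[i].Integrated != gpus[j].Integrated {
			return !gpus[i].Integrated
		}
		return gpus[i].FreeVRAM > gpus[j].FreeVRAM
	})

	out := make(map[string]int)
	for _, g := range gpus {
		if numLayers == 0 {
			break
		}
		fit := int(g.FreeVRAM / layerSize)
		if fit > numLayers {
			fit = numLayers
		}
		out[g.Name] = fit
		numLayers -= fit
	}
	return out
}

func main() {
	gpus := []gpu{
		{Name: "iGPU", Integrated: true, FreeVRAM: 32 << 30},
		{Name: "dGPU", Integrated: false, FreeVRAM: 8 << 30},
	}
	// 40 layers of ~256 MiB each: the dGPU fills up first, the iGPU gets the rest.
	fmt.Println(assignLayers(gpus, 40, 256<<20))
}
```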
-
Jesse Gross authored
Originally, llamaServer represented the old memory estimates, which could be used with either the old or new engine. ollamaServer was used only for the new estimates and the new engine. Since these implementations did not map directly to an engine, there was engine-specific code in common code paths. Now that the new estimates are always used for the new engine, there is a direct mapping between server type and engine. This separates out most of the engine-specific code into the correct implementation to make things easier to understand.
-
Jesse Gross authored
Currently, for both the old and new engines, there is code to calculate how much memory is required for a model and lay out the layers onto GPUs. This reuses the new engine's layout code for the old engine as well, bringing them closer together. The old engine continues to use its current method of estimating required memory. This reduces maintenance effort and improves consistency, as new features only need to be implemented in one place. The newer code is also more accurate, especially with multiple GPUs.
-
Jesse Gross authored
We used to control the way that llama.cpp saw devices using CUDA_VISIBLE_DEVICES or similar. This would ensure that the layers offloaded to a device were actually the ones intended. This is particularly important because we might reorder devices based on free memory or performance. When we started explicitly scheduling layers, this logic went away but the llamarunner didn't have any way to set the correct order of devices. This meant that the correct number of layers would be assigned to a device but not necessarily the layers that were expected. This change sets up the devices correctly based on the offload information.
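A simplified sketch of the general mechanism (not the runner's actual code): once the scheduler has picked devices and their order, the subprocess can be launched with a visibility variable that pins that order, since CUDA renumbers visible devices in the order they appear in CUDA_VISIBLE_DEVICES. The command and device indices below are placeholders.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strconv"
	"strings"
)

func main() {
	// Hypothetical: device indices in the order the scheduler assigned layers.
	offloadOrder := []int{1, 0}

	ids := make([]string, len(offloadOrder))
	for i, d := range offloadOrder {
		ids[i] = strconv.Itoa(d)
	}

	// Placeholder runner command; the point is the environment it inherits.
	cmd := exec.Command("./runner", "--model", "model.gguf")
	cmd.Env = append(os.Environ(),
		// CUDA renumbers visible devices in the order listed here, so the
		// offloaded layers land on the devices the scheduler intended.
		"CUDA_VISIBLE_DEVICES="+strings.Join(ids, ","),
	)
	fmt.Println(cmd.Args, "CUDA_VISIBLE_DEVICES="+strings.Join(ids, ","))
}
```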
-
Eva H authored
-
Baptiste Jamin authored
Adds logprobs support to Ollama's API, including Ollama's OpenAI-compatible API. By specifying the new 'logprobs' boolean parameter in the API, Ollama will return the log probabilities for each generated token. An integer 'top_logprobs' value of up to 20 can also be specified; when set, the API also returns that many of the most likely tokens at each token position. Co-authored-by: Baptiste Jamin <baptiste@crisp.chat>
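A hedged sketch of how a client might exercise the new parameters through the OpenAI-compatible endpoint; the model name is a placeholder, and the raw response is printed rather than decoded into a specific schema.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Request log probabilities for each generated token, plus the
	// 5 most likely alternatives at each position (up to 20 allowed).
	body, _ := json.Marshal(map[string]any{
		"model": "llama3.2", // placeholder model name
		"messages": []map[string]string{
			{"role": "user", "content": "Say hello"},
		},
		"logprobs":     true,
		"top_logprobs": 5,
	})

	resp, err := http.Post("http://localhost:11434/v1/chat/completions", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	raw, _ := io.ReadAll(resp.Body)
	fmt.Println(string(raw)) // response includes per-token log probabilities
}
```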
-
Eva Ho authored
-
Sheikh authored
-
Eva Ho authored
-
Eva Ho authored
-
Eva Ho authored
-
- 10 Nov, 2025 1 commit
-
Eva H authored
-
- 08 Nov, 2025 3 commits
-
Bruce MacDonald authored
-
Patrick Devine authored
-
Parth Sareen authored
-