Commits · 34088dbcfb47546fc0f375276173467bc8bbed29 · OpenDAS / ollama

08 Jul, 2025 1 commit

API/CLI context enhancements (#11331) · 34088dbc

Daniel Hiltgen authored Jul 08, 2025

* API: expose context size of loaded models

* CLI: add context UX

This adds a column in the ps output to show the model's context size.

34088dbc
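
As a rough illustration of the context column described in 34088dbc, the sketch below renders a process listing from a JSON payload. The payload shape and the `context_length` field name are assumptions made for this example and are not taken from this log; the actual schema should be checked against the API documentation.

```python
import json


def format_ps(payload: str) -> list[str]:
    """Render one line per loaded model, with a CONTEXT column."""
    models = json.loads(payload).get("models", [])
    rows = [("NAME", "CONTEXT")]
    for m in models:
        # "context_length" is an assumed field name, used here for illustration only.
        rows.append((m.get("name", "?"), str(m.get("context_length", "-"))))
    # Pad the name column so the context column lines up.
    width = max(len(name) for name, _ in rows)
    return [f"{name.ljust(width)}  {ctx}" for name, ctx in rows]


# Invented sample payload.
sample = '{"models": [{"name": "gemma3n:latest", "context_length": 4096}]}'
print("\n".join(format_ps(sample)))
```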

07 Jul, 2025 4 commits
- add `tool_name` to api.md (#11326) · 43107b15
  Parth Sareen authored Jul 07, 2025
  
  43107b15
- template: add tool result compatibility (#11294) · 1f91cb0c
  Parth Sareen authored Jul 07, 2025
  
  1f91cb0c
- ci: modularization (#11324) · 12d8ad0d
  Daniel Hiltgen authored Jul 07, 2025
```
switch a few constants to variables
```
  12d8ad0d
- Revert "ggml: Temporarily disable reporting UUIDs" · 592d21e7
  Jesse Gross authored Jun 27, 2025
```
The root cause was an unclean upgrade - this code is fine.

This reverts commit 45f216a9.
```
  592d21e7
06 Jul, 2025 1 commit
- readme: update Ollama icon size · 5a08b01f
  Jeffrey Morgan authored Jul 05, 2025
  
  5a08b01f
05 Jul, 2025 3 commits
- int: add performance integration tests (#11173) · 4f473e22
  Daniel Hiltgen authored Jul 05, 2025
```
usage example:
  go test --tags=integration,perf -count 1 ./integration -v -timeout 1h -run TestModelsPerf 2>&1 | tee int.log
  cat int.log | grep MODEL_PERF_HEADER | cut -f2- -d: > perf.csv
  cat int.log | grep MODEL_PERF_DATA | cut -f2- -d: >> perf.csv
```
  4f473e22
- doc: add NVIDIA blackwell to supported list (#11307) · 9d60bb44
  Daniel Hiltgen authored Jul 05, 2025
  
  9d60bb44
- Update base image to Ubuntu 24.04 LTS (#9681) · f371260e
  Vincent RAMPAL authored Jul 06, 2025
  
  f371260e
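
The shell pipeline in the usage example for 4f473e22 above keeps only the tagged log lines and drops everything up to the first colon. A minimal Python sketch of the same filtering follows. The sample log lines and column names are invented for illustration, since the real layout of the test output is not shown in this log.

```python
def extract_perf_csv(log_lines):
    """Mimic `grep TAG | cut -f2- -d:` for the header tag, then the data tag."""

    def cut_after_first_colon(line):
        # `cut -f2- -d:` drops the first colon-delimited field;
        # a line without any colon is passed through unchanged.
        return line.split(":", 1)[1] if ":" in line else line

    lines = [line.rstrip("\n") for line in log_lines]
    header = [cut_after_first_colon(l) for l in lines if "MODEL_PERF_HEADER" in l]
    data = [cut_after_first_colon(l) for l in lines if "MODEL_PERF_DATA" in l]
    # Header rows are written first, data rows are appended after them.
    return header + data


# Invented sample log; real column names are not shown in the commit message.
sample_log = [
    "=== RUN   TestModelsPerf\n",
    "MODEL_PERF_HEADER:model,tokens_per_second\n",
    "MODEL_PERF_DATA:example-model,12.5\n",
    "--- PASS: TestModelsPerf\n",
]
print("\n".join(extract_perf_csv(sample_log)))
```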
03 Jul, 2025 1 commit
- doc: Update link for mac install (#11288) · c9e6d771
  Daniel Hiltgen authored Jul 03, 2025
```
Favor the dmg now.
```
  c9e6d771
02 Jul, 2025 1 commit
- mimic logs for layers on new engine (#11278) · 2c4ce403
  Daniel Hiltgen authored Jul 02, 2025
```
This adds some extra logs to make the new engine a bit more consistent
with the llama engine.
```
  2c4ce403
01 Jul, 2025 1 commit
- readme: add NativeMind to community integrations (#11242) · 5d8c1735
  XuKecheng authored Jul 02, 2025
  
  5d8c1735
30 Jun, 2025 1 commit
- tools: fix parsing tool calls with empty arguments, missing required fields (#11233) · 44b17d2b
  Jeffrey Morgan authored Jun 30, 2025
  
  44b17d2b
29 Jun, 2025 1 commit
- readme: add ollama-bash-toolshed to community integrations (#11224) · 3b8b6922
  Attogram Project authored Jun 29, 2025
  
  3b8b6922
27 Jun, 2025 3 commits
- chore: cleanup comments + unused vars (#11225) · 4129af92
  Michael Yang authored Jun 27, 2025
  
  4129af92
- ggml: Temporarily disable reporting UUIDs · 45f216a9
  Jesse Gross authored Jun 27, 2025
```
This is causing segfaults, so disable it. Currently UUIDs are only
used for debugging purposes, although they are planned to be used in
additional ways in the future.

Bug #11211
```
  45f216a9
- skip quantizing per_layer_token_embd (#11207) · d0b32def
  Michael Yang authored Jun 26, 2025
```
this tensor isn't compatible with cuda when quantized to q4_K so skip it
```
  d0b32def
26 Jun, 2025 4 commits
- ci: multi-stage release process (#11001) · 11ffc361
  Daniel Hiltgen authored Jun 26, 2025
  
  11ffc361
- fs/ggml: add multiplier in graph estimates (#11208) · ba049026
  Jeffrey Morgan authored Jun 26, 2025
  
  ba049026
- fs/ggml: add missing architecture to OllamaEngineRequired() (#11206) · 3944602f
  Jeffrey Morgan authored Jun 26, 2025
  
  3944602f
- add new gemma model (#11204) · 73b642e6
  Michael Yang authored Jun 25, 2025
```
* update patches

* cherry pick metal mean kernel

* cherry pick cuda mean kernel

* gemma3n
```
  73b642e6
25 Jun, 2025 5 commits
- ci: arm sbsa fixes (#11194) · ad118d8b
  Daniel Hiltgen authored Jun 24, 2025
  
  ad118d8b
- ci: include dependencies · f0853413
  Daniel Hiltgen authored Jun 24, 2025
  
  f0853413
- ci: pick up arm sbsa cuda libs (#11192) · 4b4a90f2
  Daniel Hiltgen authored Jun 24, 2025
  
  4b4a90f2
- ci: recombine linux amd64 binaries (#11188) · 03274a6b
  Daniel Hiltgen authored Jun 24, 2025
```
Glue the rocm and archive builds back together.
```
  03274a6b
- Merge pull request #10238 from ollama/drifkin/array-head-count-simple · cc6463eb
  Devon Rifkin authored Jun 24, 2025
```
ggml: fix crash for array head counts
```
  cc6463eb
24 Jun, 2025 3 commits
- ci: rocm parallel builds on windows (#11187) · 405d2f62
  Daniel Hiltgen authored Jun 24, 2025
```
The preset CMAKE_HIP_FLAGS isn't getting used on Windows.
This passes the parallel flag in through the C/CXX flags, along
with suppression for some log spew warnings to quiet down the build.
```
  405d2f62
- Merge branch 'main' into drifkin/array-head-count-simple · a3f7dd3e
  Devon Rifkin authored Jun 24, 2025
  
  a3f7dd3e
- CI: switch windows to vs 2022 (#11184) · c85c0ebf
  Daniel Hiltgen authored Jun 24, 2025
```
* CI: switch windows to vs 2022

* ci: fix regex match
```
  c85c0ebf
23 Jun, 2025 4 commits

avoid context overflow (#11175) · 10a8e04a
Daniel Hiltgen authored Jun 23, 2025
```
For smaller context models, make sure we do not exceed the training size.
```
10a8e04a
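
The idea in 10a8e04a, not exceeding the training size for smaller context models, can be sketched as a simple clamp. The function and parameter names below are hypothetical, and the actual change in the commit is not shown in this log.

```python
def effective_context(requested: int, training: int) -> int:
    """Clamp a requested context length to the model's training context length."""
    if requested <= 0 or training <= 0:
        raise ValueError("context lengths must be positive")
    # Never hand the model more context than it was trained with.
    return min(requested, training)


print(effective_context(8192, 2048))
print(effective_context(1024, 2048))
```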

Re-remove cuda v11 (#10694) · 1c6669e6

Daniel Hiltgen authored Jun 23, 2025

* Re-remove cuda v11

Revert the revert - drop v11 support requiring drivers newer than Feb 23

This reverts commit c6bcdc42.

* Simplify layout

With only one version of the GPU libraries, we can simplify things down somewhat. (Jetsons still require special handling.)

* distinct sbsa variant for linux arm64

This avoids accidentally trying to load the sbsa cuda libraries on
a jetson system which results in crashes.

* temporarily prevent rocm+cuda mixed loading

1c6669e6

Merge branch 'main' into drifkin/array-head-count-simple · b2b270ad
Devon Rifkin authored Jun 23, 2025

b2b270ad
readme: add ai-hub to community integrations (#11169) · 2bb69b40
AJ authored Jun 23, 2025

2bb69b40

20 Jun, 2025 4 commits

build speedups (#11142) · 65bff664
Daniel Hiltgen authored Jun 20, 2025
```
Enable parallel building of the GPU architectures.
```
65bff664
convert: utility for merging tensors (#11069) · c088ac0e
Michael Yang authored Jun 20, 2025

c088ac0e
Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119) · 0a066cfd
Michael Yang authored Jun 20, 2025
```
* Reapply "feat: incremental gguf parser (#10822)" (#11114)

This reverts commit a6e64fbd.

* fix older ggufs
```
0a066cfd

ggml: Check return status for computation. · 87b7af6c

Jesse Gross authored Jun 19, 2025

We don't check the return status after computing the graph, which
can silently lead to bad outputs if we try to keep going and future
computation succeeds. This appears to happen in certain cases on
Apple M2 devices.

Fixes #11070

87b7af6c
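
The pattern described in 87b7af6c, checking the status of a computation instead of continuing with possibly bad outputs, might look like the sketch below. The status constant, the `compute` callable, and the exception type are all hypothetical stand-ins and do not reflect the ggml API.

```python
STATUS_SUCCESS = 0  # hypothetical status code meaning "computation succeeded"


class ComputeError(RuntimeError):
    """Raised when a graph computation reports a non-success status."""


def run_graph(compute, graph):
    """Run `compute(graph)` and stop immediately if it reports a failure."""
    status = compute(graph)
    if status != STATUS_SUCCESS:
        # Failing here avoids silently carrying bad outputs into later steps.
        raise ComputeError(f"graph computation failed with status {status}")


run_graph(lambda g: STATUS_SUCCESS, graph="example")
```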

19 Jun, 2025 1 commit
- int: add coverage for older models (#11137) · f2527b08
  Daniel Hiltgen authored Jun 19, 2025
```
Verified these fail on 0.9.1 and pass on HEAD.
```
  f2527b08
18 Jun, 2025 2 commits
- benchmark: remove unused benchmark test (#11120) · 8bcb3125
  Jeffrey Morgan authored Jun 18, 2025
```
Removes a test under benchmark/ that is unused
```
  8bcb3125
- Revert "Revert "ggml: Export GPU UUIDs" (#11115)" (#11117) · 6baf1e31
  Jeffrey Morgan authored Jun 18, 2025
```
Reverts PR #11115. The original change was mistakenly reverted instead of #10822
```
  6baf1e31