- 17 Mar, 2025 1 commit
-
-
zeo authored
-
- 15 Mar, 2025 3 commits
-
-
Patrick Devine authored
This fixes the case where a FROM line in a previous Modelfile points to a file which may or may not be present in a different ollama instance. We shouldn't be relying on the filename, though; instead, check whether the FROM line is a valid model name and, if so, point to that.
-
Blake Mizerany authored
This sets the agent header in DefaultRegistry to include the version of the client, OS, and architecture in the previous format, with a minor twist. Note: the version is obtained from the build info instead of from version.Version, which should no longer be necessary, but we can remove it in a future commit. Using the build info is more accurate and also provides extra build information when the build is not tagged, and when it is "dirty". Previously, the version was just "0.0.0" with no other helpful information. The ollama.com registry and others handle this swimmingly.
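A minimal sketch of deriving such a version string from the Go build info (the header format, prefix, and fallback below are illustrative assumptions, not Ollama's literal strings):

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// userAgent builds a User-Agent-style string such as
// "ollama/<version> (<arch> <os>) Go/<goversion>". The version comes from
// the module build info when available, falling back to "0.0.0" otherwise.
func userAgent() string {
	version := "0.0.0" // fallback when build info is unavailable or untagged
	if info, ok := debug.ReadBuildInfo(); ok &&
		info.Main.Version != "" && info.Main.Version != "(devel)" {
		version = info.Main.Version
	}
	return fmt.Sprintf("ollama/%s (%s %s) Go/%s",
		version, runtime.GOARCH, runtime.GOOS, runtime.Version())
}

func main() {
	fmt.Println(userAgent())
}
```

When the binary is built from a tagged module, `debug.ReadBuildInfo` also carries VCS revision and "dirty" flags in `info.Settings`, which is what makes it richer than a hard-coded version constant.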
-
Patrick Devine authored
-
- 14 Mar, 2025 7 commits
-
-
Daniel Hiltgen authored
Darwin was using a different pattern for the version string than Linux or Windows.
-
Jesse Gross authored
Previously, processing multiple images in a batch would trigger segfaults, so sending images together was disabled as a way to mitigate this. The trigger was processing one image on the CPU and one on the GPU. This can no longer happen:
- The vision encoder is now on the GPU, so both images would be processed on the GPU.
- We require images to be fully contained in a batch, and each image, including its special tokens, is over half the batch size. As a result, we will never get two images in the same batch.

Fixes #9731
-
Jesse Gross authored
Currently there is a single context per sequence, shared by all multimodal inputs. Since we build a vision encoder graph per image, with a large number of inputs we can eventually hit the maximum number of graph nodes per context. This changes to using a separate context for each image, ensuring that available resource limits are consistent.
-
Jesse Gross authored
Models may require that a set of inputs all be processed as part of the same batch. For example, if an image has multiple patches with fully connected attention between them, we should not split the batch in the middle of an image. Fixes #9697
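The no-split constraint above can be sketched as a packing rule: inputs belonging to the same image form a group, and a group is never divided across two batches. The types and field names here are hypothetical, not the actual runner's scheduler:

```go
package main

import "fmt"

// Input is one token or embedding destined for a batch. Inputs that belong
// to the same image share a nonzero groupID; such a group must land whole
// in a single batch.
type Input struct {
	groupID int
}

// splitBatches packs inputs into batches of at most batchSize, flushing the
// current batch early rather than splitting a group (e.g. an image and its
// special tokens) across a batch boundary.
func splitBatches(inputs []Input, batchSize int) [][]Input {
	var batches [][]Input
	var cur []Input
	for i := 0; i < len(inputs); {
		// collect the whole group starting at i
		j := i + 1
		if inputs[i].groupID != 0 {
			for j < len(inputs) && inputs[j].groupID == inputs[i].groupID {
				j++
			}
		}
		group := inputs[i:j]
		if len(cur)+len(group) > batchSize && len(cur) > 0 {
			batches = append(batches, cur) // flush instead of splitting
			cur = nil
		}
		cur = append(cur, group...)
		i = j
	}
	if len(cur) > 0 {
		batches = append(batches, cur)
	}
	return batches
}

func main() {
	// two text tokens, a three-input image (group 1), one more text token
	inputs := []Input{{0}, {0}, {1}, {1}, {1}, {0}}
	for _, b := range splitBatches(inputs, 4) {
		fmt.Println(len(b))
	}
}
```

With a batch size of 4, the three-input image does not fit after the two text tokens, so the first batch is flushed at two inputs and the image starts the next batch intact.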
-
Bruce MacDonald authored
This commit refactors the LLM subsystem by removing internal subprocess request and response types. It consolidates duplicate type definitions across the codebase, moving them to centralized locations. The change also standardizes interfaces between components, simplifies the ServerStatusResp struct, and moves the ParseDurationMs function to a common package. This cleanup reduces code duplication between different runner implementations (llamarunner and ollamarunner).
-
Blake Mizerany authored
-
Blake Mizerany authored
Replace large-chunk blob downloads with parallel small-chunk verification to solve timeout and performance issues. Registry users experienced progressively slowing download speeds as large-chunk transfers aged, often timing out completely. The previous approach downloaded blobs in a few large chunks but required a separate, single-threaded pass to read the entire blob back from disk for verification after download completion. This change uses the new chunksums API to fetch many smaller chunk+digest pairs, allowing concurrent downloads and immediate verification as each chunk arrives. Chunks are written directly to their final positions, eliminating the entire separate verification pass. The result is more reliable downloads that maintain speed throughout the transfer process and significantly faster overall completion, especially over unstable connections or with large blobs.
-
- 13 Mar, 2025 17 commits
-
-
Michael Yang authored
count gemma3 vision tensors
-
Michael Yang authored
-
Bradley Erickson authored
-
Michael Yang authored
-
Michael Yang authored
the largest operation by far is (q @ k), so just count that for simplicity
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
fix: error if image requested without vision model
-
Patrick Devine authored
Add metadata and tensor information to the show command to be able to see more information about a model. This outputs the same data as shown on the model details page on ollama.com
-
Patrick Devine authored
-
Michael Yang authored
fix: error on models that don't support embeddings
-
Michael Yang authored
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
ollama-debug.c: correct typo
-
Parth Sareen authored
-
shane.xb.qian authored
* macOS has a different definition, per info from @mxyng
-
- 12 Mar, 2025 8 commits
-
-
ParthSareen authored
-
ParthSareen authored
-
ParthSareen authored
-
Bruce MacDonald authored
Softcap isn't in the whitepaper/implementation for the language model, so we should remove it. There is no discernible difference in output with it removed.
-
Shane-XB-Qian authored
Signed-off-by: shane.xb.qian <shane.qian@foxmail.com>
-
shane.xb.qian authored
Signed-off-by: shane.xb.qian <shane.qian@foxmail.com>
-
frob authored
Co-authored-by: Richard Lyons <frob@cloudstaff.com>
-
Michael authored
-
- 11 Mar, 2025 4 commits
-
-
Michael Yang authored
engine: add gemma support
-
jmorganca authored
-
jmorganca authored
-
jmorganca authored
-