- 14 Mar, 2025 2 commits
-
-
Blake Mizerany authored
-
Blake Mizerany authored
Replace large-chunk blob downloads with parallel small-chunk verification to solve timeout and performance issues.

Registry users experienced progressively slowing download speeds as large-chunk transfers aged, often timing out completely. The previous approach downloaded blobs in a few large chunks but required a separate, single-threaded pass to read the entire blob back from disk for verification after download completion.

This change uses the new chunksums API to fetch many smaller chunk+digest pairs, allowing concurrent downloads and immediate verification as each chunk arrives. Chunks are written directly to their final positions, eliminating the separate verification pass. The result is more reliable downloads that maintain speed throughout the transfer and complete significantly faster overall, especially over unstable connections or with large blobs.
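For illustration, a minimal Go sketch of that flow, assuming a hypothetical `Chunk` type (offset, size, expected SHA-256) like the chunksums API might return and plain HTTP Range requests per chunk; this is a sketch of the idea, not the actual ollama implementation:

```go
package download

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"net/http"
	"os"

	"golang.org/x/sync/errgroup"
)

// Chunk is an illustrative stand-in for one chunk+digest pair.
type Chunk struct {
	Offset int64  // final position of the chunk within the blob
	Size   int64  // chunk length in bytes
	Digest string // expected hex-encoded SHA-256 of the chunk
}

// downloadBlob fetches chunks concurrently, hashes each one as it arrives,
// and writes it directly at its final offset in the destination file.
func downloadBlob(dst *os.File, blobURL string, chunks []Chunk) error {
	g := new(errgroup.Group)
	g.SetLimit(8) // bound concurrency

	for _, c := range chunks {
		c := c
		g.Go(func() error {
			req, err := http.NewRequest(http.MethodGet, blobURL, nil)
			if err != nil {
				return err
			}
			req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", c.Offset, c.Offset+c.Size-1))

			resp, err := http.DefaultClient.Do(req)
			if err != nil {
				return err
			}
			defer resp.Body.Close()

			// Hash while copying the chunk into place.
			h := sha256.New()
			w := io.NewOffsetWriter(dst, c.Offset)
			if _, err := io.Copy(io.MultiWriter(w, h), resp.Body); err != nil {
				return err
			}
			if got := hex.EncodeToString(h.Sum(nil)); got != c.Digest {
				return fmt.Errorf("chunk at offset %d: digest mismatch", c.Offset)
			}
			return nil
		})
	}
	return g.Wait()
}
```

Because each chunk is hashed while it is copied to its final offset, the blob never needs to be re-read from disk for a separate verification pass.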
-
- 13 Mar, 2025 17 commits
-
-
Michael Yang authored
count gemma3 vision tensors
-
Michael Yang authored
-
Bradley Erickson authored
-
Michael Yang authored
-
Michael Yang authored
the largest operation by far is (q @ k), so just count that for simplicity
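As a rough illustration of that simplification (the function and parameter names below are assumptions, not the engine's API): per layer and per head, q @ kᵀ multiplies a (seqLen × headDim) matrix by a (headDim × seqLen) matrix, which costs on the order of 2 · seqLen² · headDim FLOPs, so counting only this term gives a usable estimate.

```go
package estimate

// attnFLOPs approximates the attention cost of one layer by counting only
// the dominant q @ kᵀ matmul: 2 * seqLen * seqLen * headDim per head.
func attnFLOPs(numHeads, seqLen, headDim int64) int64 {
	return 2 * numHeads * seqLen * seqLen * headDim
}
```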
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
fix: error if image requested without vision model
-
Patrick Devine authored
Add metadata and tensor information to the show command to expose more information about a model. This outputs the same data shown on the model details page on ollama.com.
-
Patrick Devine authored
-
Michael Yang authored
fix: error on models that don't support embeddings
-
Michael Yang authored
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
ollama-debug.c: fix typo
-
Parth Sareen authored
-
shane.xb.qian authored
* macOS has a different definition, per info from @mxyng
-
- 12 Mar, 2025 8 commits
-
-
ParthSareen authored
-
ParthSareen authored
-
ParthSareen authored
-
Bruce MacDonald authored
Softcap isn't in the whitepaper/implementation for the language model, so we should remove it. There is no discernible difference in output with it removed.
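For context, soft-capping squashes scores with a tanh before the softmax. A minimal sketch of what is being removed (names and placement are illustrative, not the engine's actual API):

```go
package sketch

import "math"

// softcap applies capValue * tanh(x / capValue) element-wise, bounding the
// scores to (-capValue, capValue). Removing it leaves the scores unchanged.
func softcap(x []float32, capValue float64) {
	for i, v := range x {
		x[i] = float32(capValue * math.Tanh(float64(v)/capValue))
	}
}
```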
-
Shane-XB-Qian authored
Signed-off-by: shane.xb.qian <shane.qian@foxmail.com>
-
shane.xb.qian authored
Signed-off-by: shane.xb.qian <shane.qian@foxmail.com>
-
frob authored
Co-authored-by: Richard Lyons <frob@cloudstaff.com>
-
Michael authored
-
- 11 Mar, 2025 13 commits
-
-
Michael Yang authored
engine: add gemma support
-
jmorganca authored
-
jmorganca authored
-
jmorganca authored
-
jmorganca authored
-
Michael Yang authored
-
Daniel Hiltgen authored
-
jmorganca authored
-
jmorganca authored
-
jmorganca authored
This reverts commit c7eae586b899083acebcd9b3847b89ea78c2850c.
-
Jesse Gross authored
This is useful for a few things:
- Work around bugs, such as having 2 images in one batch
- Keep the image in a single batch for fully connected attention
- Improve performance by not evaluating embeddings multiple times
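A hedged sketch of the grouping idea described in the list above, assuming illustrative `Input`/`SameBatch` names rather than the runner's real types: inputs that must share a batch (e.g. an image's embedding inputs) are kept together, and a new batch starts if the group would not fit in the current one.

```go
package batching

// Input is an illustrative stand-in for one runner input.
type Input struct {
	Token     int32
	SameBatch int // number of following inputs that must stay in this batch
}

// splitBatches groups inputs into batches of at most batchSize, never
// splitting a SameBatch group across two batches.
func splitBatches(inputs []Input, batchSize int) [][]Input {
	var batches [][]Input
	var cur []Input
	for i := 0; i < len(inputs); {
		// Size of the group that must land in a single batch.
		group := 1 + inputs[i].SameBatch
		if group > batchSize {
			group = batchSize // can't honor the constraint; fall back
		}
		if i+group > len(inputs) {
			group = len(inputs) - i
		}
		// Flush the current batch if the group would overflow it.
		if len(cur) > 0 && len(cur)+group > batchSize {
			batches = append(batches, cur)
			cur = nil
		}
		cur = append(cur, inputs[i:i+group]...)
		i += group
	}
	if len(cur) > 0 {
		batches = append(batches, cur)
	}
	return batches
}
```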
-
Jesse Gross authored
Currently we are using positions, which are relative to a sequence and may not be unique.
-
Jesse Gross authored
-