- 23 Jun, 2025 2 commits
-
-
Daniel Hiltgen authored
* Re-remove cuda v11: revert the revert, dropping v11 support and requiring drivers newer than Feb 23. This reverts commit c6bcdc42. * Simplify layout: with only one version of the GPU libraries, we can simplify things down somewhat. (Jetsons still require special handling.) * Distinct sbsa variant for linux arm64: this avoids accidentally trying to load the sbsa cuda libraries on a Jetson system, which results in crashes. * Temporarily prevent rocm+cuda mixed loading
-
AJ authored
-
- 20 Jun, 2025 4 commits
-
-
Daniel Hiltgen authored
Enable parallel building of the GPU architectures.
-
Michael Yang authored
-
Michael Yang authored
* Reapply "feat: incremental gguf parser (#10822)" (#11114) This reverts commit a6e64fbd. * fix older ggufs
-
Jesse Gross authored
We don't check the return status after computing the graph, which can silently lead to bad outputs if we try to keep going and future computation succeeds. This appears to happen in certain cases on Apple M2 devices. Fixes #11070
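The failure mode described above can be illustrated with a minimal Go sketch. Everything here is hypothetical: `status`, `computeGraph`, and `forward` are illustrative names, not the actual backend API touched by the commit. The point is only that a non-success status is turned into an error instead of being ignored.

```go
package main

import "fmt"

// status is a hypothetical result code returned by a compute backend.
type status int

const (
	statusSuccess status = iota
	statusFailed
)

// computeGraph is a hypothetical stand-in for a backend call that
// reports success or failure through a status code.
func computeGraph(fail bool) status {
	if fail {
		return statusFailed
	}
	return statusSuccess
}

// forward checks the status before returning any outputs, so a failed
// computation surfaces as an error rather than as silently bad data.
func forward(fail bool) ([]float32, error) {
	if s := computeGraph(fail); s != statusSuccess {
		return nil, fmt.Errorf("graph compute failed with status %d", s)
	}
	return []float32{1, 2, 3}, nil
}

func main() {
	if _, err := forward(true); err != nil {
		fmt.Println("error:", err)
	}
}
```

Returning the error at the first failed step stops later steps from running on outputs that were never computed correctly.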
-
- 19 Jun, 2025 1 commit
-
-
Daniel Hiltgen authored
Verified these fail on 0.9.1 and pass on HEAD.
-
- 18 Jun, 2025 6 commits
-
-
Jeffrey Morgan authored
Removes a test under benchmark/ that is unused
-
Jeffrey Morgan authored
Reverts PR #11115. The original change was mistakenly reverted instead of #10822
-
Jeffrey Morgan authored
This reverts commit aaa78180.
-
Jeffrey Morgan authored
This reverts commit 6b04cad7.
-
曹家巧 authored
-
Jeffrey Morgan authored
-
- 17 Jun, 2025 1 commit
-
-
Jeffrey Morgan authored
Fixes an issue where tool calls that don't expect any parameters were not being parsed. This also fixes two additional issues: one where 2+ tool calls would not be correctly parsed, and one where tool calls with invalid parameters would still get parsed
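The three cases named above (no parameters, multiple calls, invalid parameters) can be sketched as follows. This is not the parser from the commit: the JSON array input format, the `toolCall` type, and the `parseToolCalls` and `argsAllowed` helpers are all assumptions made for illustration, and "invalid" is assumed here to mean an argument name the tool does not declare.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCall is a hypothetical representation of one parsed tool call.
type toolCall struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// argsAllowed reports whether every argument name is declared for the tool.
// A call with no arguments is always allowed.
func argsAllowed(args map[string]any, params []string) bool {
	declared := make(map[string]bool, len(params))
	for _, p := range params {
		declared[p] = true
	}
	for name := range args {
		if !declared[name] {
			return false
		}
	}
	return true
}

// parseToolCalls decodes a JSON array of calls and keeps only those that
// name a known tool and use only that tool's declared parameters.
func parseToolCalls(raw string, tools map[string][]string) ([]toolCall, error) {
	var calls []toolCall
	if err := json.Unmarshal([]byte(raw), &calls); err != nil {
		return nil, err
	}
	var valid []toolCall
	for _, c := range calls {
		params, known := tools[c.Name]
		if known && argsAllowed(c.Arguments, params) {
			valid = append(valid, c)
		}
	}
	return valid, nil
}

func main() {
	tools := map[string][]string{"get_time": {}, "get_weather": {"city"}}
	raw := `[{"name":"get_time"},
	         {"name":"get_weather","arguments":{"city":"Paris"}},
	         {"name":"get_weather","arguments":{"zip":"75001"}}]`
	calls, err := parseToolCalls(raw, tools)
	if err != nil {
		panic(err)
	}
	for _, c := range calls {
		fmt.Println(c.Name, c.Arguments)
	}
}
```

In this sketch the zero-parameter `get_time` call and the first `get_weather` call are kept, while the call with the undeclared `zip` argument is dropped.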
-
- 16 Jun, 2025 3 commits
-
-
Jeffrey Morgan authored
-
Michael Yang authored
* ggml: test write gguf order * ggml: fix write tensor order
-
NGC13009 authored
-
- 14 Jun, 2025 1 commit
-
-
Phil authored
-
- 12 Jun, 2025 2 commits
-
-
Jeffrey Morgan authored
-
Michael Yang authored
* incremental gguf parser * gguf: update test to not rely on gguf on disc * re-use existing create gguf * read capabilities from gguf kv * kv exists * update tests * s/doneFunc/successFunc/g * new buffered reader --------- Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
-
- 11 Jun, 2025 3 commits
-
-
Michael Yang authored
The current splitDim function only operates on tensors that are split evenly, which isn't always the case, e.g. a QKV tensor. This change allows the function to be used for arbitrary splits.
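To make the idea of an uneven split concrete, here is a minimal Go sketch. It is not the `splitDim` function from the commit: `splitRows` is a hypothetical helper that splits a row-major matrix into consecutive row blocks of caller-supplied heights, as one might for a fused QKV tensor whose query block is larger than its key and value blocks.

```go
package main

import (
	"errors"
	"fmt"
)

// splitRows is a hypothetical helper (not the real splitDim). It splits a
// row-major matrix with the given number of columns into consecutive row
// blocks whose heights are listed in sizes. The blocks share memory with data.
func splitRows(data []float32, cols int, sizes ...int) ([][]float32, error) {
	if cols <= 0 || len(data)%cols != 0 {
		return nil, errors.New("data length is not a multiple of cols")
	}
	rows := len(data) / cols
	total := 0
	for _, s := range sizes {
		if s < 0 {
			return nil, errors.New("negative split size")
		}
		total += s
	}
	if total != rows {
		return nil, fmt.Errorf("split sizes sum to %d, want %d rows", total, rows)
	}
	parts := make([][]float32, 0, len(sizes))
	offset := 0
	for _, s := range sizes {
		parts = append(parts, data[offset*cols:(offset+s)*cols])
		offset += s
	}
	return parts, nil
}

func main() {
	// A fused 6x2 matrix: 4 query rows, 1 key row, 1 value row.
	fused := make([]float32, 12)
	for i := range fused {
		fused[i] = float32(i)
	}
	parts, err := splitRows(fused, 2, 4, 1, 1)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(parts[0]), len(parts[1]), len(parts[2]))
}
```

An even split is the special case where all sizes are equal, so a function that accepts explicit sizes covers both uses.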
-
Michael Yang authored
if tokenizer.json is already copied, skip tokenizer.model
-
Michael Yang authored
while nn.Linear.Forward isn't applicable for sparse MLP, it's still a nice container for the tensors
-
- 10 Jun, 2025 3 commits
-
-
Attogram Project authored
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
- 09 Jun, 2025 1 commit
-
-
Daniel Hiltgen authored
When a user elects to keep the existing app, the new Ollama is named `Ollama 2.app`. This fixes the app startup flow to handle this naming pattern.
-
- 08 Jun, 2025 1 commit
-
-
Daniel Hiltgen authored
Give the desktop app a hint to start fast.
-
- 07 Jun, 2025 2 commits
-
-
Krzysztof Jeziorny authored
-
Jeffrey Morgan authored
This reverts commit 09430011.
-
- 06 Jun, 2025 4 commits
-
-
Daniel Hiltgen authored
When starting the app in the background, start it hidden.
-
Daniel Hiltgen authored
Fix an array out of bounds crash
-
Devon Rifkin authored
move thinking logic into its own package
-
Hunter Wittenborn authored
-
- 05 Jun, 2025 2 commits
-
-
Devon Rifkin authored
export ThinkingParser
-
Devon Rifkin authored
-
- 04 Jun, 2025 1 commit
-
-
JasonHonKL authored
-
- 31 May, 2025 1 commit
-
-
HardCodeDev authored
-
- 30 May, 2025 1 commit
-
-
Parth Sareen authored
-
- 29 May, 2025 1 commit
-
-
Jesse Gross authored
This enables matching up devices and information reported by the backend with system management libraries such as nvml to get accurate free memory reporting.
-