- 23 Dec, 2025 1 commit
Vallabh Mahajan authored
- 12 Nov, 2025 1 commit
Daniel Hiltgen authored
* docs: vulkan information
* Revert "CI: Set up temporary opt-out Vulkan support (#12614)"
  This reverts commit 8b6e5bae.
* vulkan: temporary opt-in for Vulkan support
  Revert this once we're ready to enable by default.
* win: add Vulkan CI build
- 11 Nov, 2025 1 commit
Sheikh authored
- 07 Nov, 2025 1 commit
Daniel Hiltgen authored
* doc: re-add login autostart FAQ
  This appears to have been accidentally dropped during the doc migration.
* docs: restore GPU updates lost during the doc update
* review comments: improve Windows login disable instructions
- 28 Oct, 2025 2 commits
Parth Sareen authored
Parth Sareen authored
- 16 Oct, 2025 1 commit
Daniel Hiltgen authored
8.7 is JetPack only, so there is no need for it on x86 builds; 10.3 covers [G]B300.
- 11 Oct, 2025 1 commit
Daniel Hiltgen authored
- 02 Oct, 2025 1 commit
Daniel Hiltgen authored
Notable EOLs with this change:
- macOS v12 and v13 are no longer supported (v14+ required)
- AMD gfx900 and gfx906 are no longer supported
- 01 Oct, 2025 1 commit
Daniel Hiltgen authored
This revamps how we discover GPUs in the system by leveraging the Ollama runner. This should eliminate inconsistency between our GPU discovery and the runner's capabilities at runtime, particularly for cases where we try to filter out unsupported GPUs. Now the runner does that implicitly based on the actual device list.

In some cases free VRAM reporting can be unreliable, which can lead to scheduling mistakes, so this also includes a patch to leverage more reliable VRAM reporting libraries if available.

Automatic workarounds have been removed, as only one GPU leveraged this, which is now documented. This GPU will soon fall off the support matrix with the next ROCm bump. Additional cleanup of the scheduler and discovery packages can be done in the future, once we have switched on the new memory management code and removed support for the llama runner.
- 05 Jul, 2025 1 commit
Daniel Hiltgen authored
- 23 Jun, 2025 1 commit
Daniel Hiltgen authored
* Re-remove CUDA v11
  Revert the revert: drop v11 support, requiring drivers newer than Feb 2023.
  This reverts commit c6bcdc42.
* Simplify layout
  With only one version of the GPU libraries, we can simplify things down somewhat. (Jetsons still require special handling.)
* distinct sbsa variant for linux arm64
  This avoids accidentally trying to load the sbsa CUDA libraries on a Jetson system, which results in crashes.
* temporarily prevent rocm+cuda mixed loading
- 13 May, 2025 1 commit
Daniel Hiltgen authored
Bring back v11 until we can better warn users that their driver is too old. This reverts commit fa393554.
- 07 May, 2025 1 commit
Daniel Hiltgen authored
This reduces the size of our Windows installer payloads by ~256M by dropping support for NVIDIA drivers older than Feb 2023. Hardware support is unchanged. Linux default bundle sizes are reduced by ~600M to 1G.
- 13 Feb, 2025 1 commit
frob authored
Co-authored-by: Richard Lyons <frob@cloudstaff.com>
- 20 Jan, 2025 1 commit
EndoTheDev authored
- 10 Dec, 2024 1 commit
Daniel Hiltgen authored
* llama: wire up builtin runner
  This adds a new entrypoint into the ollama CLI to run the cgo-built runner. On Mac arm64, this will have GPU support, but on all other platforms it will be the lowest common denominator CPU build. After we fully transition to the new Go runners, more tech debt can be removed and we can stop building the "default" runner via make and rely on the builtin always.
* build: Make target improvements
  Add a few new targets and help for building locally. This also adjusts the runner lookup to favor local builds, then runners relative to the executable, and finally payloads.
* Support customized CPU flags for runners
  This implements a simplified custom CPU flags pattern for the runners. When built without overrides, the runner name contains the vector flag we check for (AVX) to ensure we don't try to run on unsupported systems and crash. If the user builds a customized set, we omit the naming scheme and don't check for compatibility. This avoids checking requirements at runtime, so that logic has been removed as well. This can be used to build GPU runners with no vector flags, or CPU/GPU runners with additional flags (e.g. AVX512) enabled.
* Use relative paths
  If the user checks out the repo in a path that contains spaces, make gets really confused, so use relative paths for everything in-repo to avoid breakage.
* Remove payloads from main binary
* install: clean up prior libraries
  This removes support for v0.3.6 and older versions (before the tar bundle) and ensures we clean up prior libraries before extracting the bundle(s). Without this change, runners and dependent libraries could leak when we update and lead to subtle runtime errors.
- 26 Oct, 2024 1 commit
Daniel Hiltgen authored
* Better support for AMD multi-GPU
  This resolves a number of problems related to AMD multi-GPU setups on Linux.
  The numeric IDs used by ROCm are not the same as the numeric IDs exposed in sysfs, although the ordering is consistent. We have to count up from the first valid gfx (major/minor/patch with non-zero values) we find, starting at zero.
  There are 3 different env vars for selecting GPUs, and only ROCR_VISIBLE_DEVICES supports UUID-based identification, so we should favor that one, and try to use UUIDs if detected to avoid potential ordering bugs with numeric IDs.
* ROCR_VISIBLE_DEVICES only works on Linux
  Use the numeric-ID-only HIP_VISIBLE_DEVICES on Windows.
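The UUID-based selection favored above can be sketched as follows; this is a minimal example under stated assumptions, not taken from the commit itself, and the UUID shown is a hypothetical placeholder (real values can be listed with ROCm tools such as `rocminfo`):

```shell
# Linux: prefer ROCR_VISIBLE_DEVICES with a UUID, which is stable even if
# numeric device ordering changes between the runtime and sysfs.
# "GPU-0123456789abcdef" is a placeholder, not a real device UUID.
ROCR_VISIBLE_DEVICES="GPU-0123456789abcdef" ollama serve

# Windows: HIP_VISIBLE_DEVICES accepts numeric IDs only, e.g.
#   set HIP_VISIBLE_DEVICES=0,1
```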
- 05 Sep, 2024 1 commit
Michael authored
* Update gpu.md
  Seems strange that the laptop versions of the 3050 and 3050 Ti would be supported but not the non-notebook versions, but this is what the page (https://developer.nvidia.com/cuda-gpus) says.
  Signed-off-by: bean5 <2052646+bean5@users.noreply.github.com>
* Update gpu.md
  Remove notebook reference
  Signed-off-by: bean5 <2052646+bean5@users.noreply.github.com>
- 20 Jul, 2024 1 commit
Daniel Hiltgen authored
The v5 HIP library returns unsupported GPUs which won't enumerate at inference time in the runner, so this makes sure we align discovery. The gfx906 cards are no longer supported, so we shouldn't compile with that GPU type, as it won't enumerate at runtime.
- 01 Jul, 2024 1 commit
Eduard authored
Runs fine on an NVIDIA GeForce GTX 1050 Ti.
- 14 Jun, 2024 1 commit
Patrick Devine authored
- 21 Mar, 2024 2 commits
Daniel Hiltgen authored
Bruce MacDonald authored