- 13 Feb, 2025 1 commit
-
-
frob authored
Co-authored-by:Richard Lyons <frob@cloudstaff.com>
-
- 20 Jan, 2025 1 commit
-
-
EndoTheDev authored
-
- 10 Dec, 2024 1 commit
-
-
Daniel Hiltgen authored
* llama: wire up builtin runner This adds a new entrypoint into the ollama CLI to run the cgo built runner. On Mac arm64, this will have GPU support, but on all other platforms it will be the lowest common denominator CPU build. After we fully transition to the new Go runners more tech-debt can be removed and we can stop building the "default" runner via make and rely on the builtin always. * build: Make target improvements Add a few new targets and help for building locally. This also adjusts the runner lookup to favor local builds, then runners relative to the executable, and finally payloads. * Support customized CPU flags for runners This implements a simplified custom CPU flags pattern for the runners. When built without overrides, the runner name contains the vector flag we check for (AVX) to ensure we don't try to run on unsupported systems and crash. If the user builds a customized set, we omit the naming scheme and don't check for compatibility. This avoids checking requirements at runtime, so that logic has been removed as well. This can be used to build GPU runners with no vector flags, or CPU/GPU runners with additional flags (e.g. AVX512) enabled. * Use relative paths If the user checks out the repo in a path that contains spaces, make gets really confused so use relative paths for everything in-repo to avoid breakage. * Remove payloads from main binary * install: clean up prior libraries This removes support for v0.3.6 and older versions (before the tar bundle) and ensures we clean up prior libraries before extracting the bundle(s). Without this change, runners and dependent libraries could leak when we update and lead to subtle runtime errors.
-
- 26 Oct, 2024 1 commit
-
-
Daniel Hiltgen authored
* Better support for AMD multi-GPU This resolves a number of problems related to AMD multi-GPU setups on linux. The numeric IDs used by rocm are not the same as the numeric IDs exposed in sysfs although the ordering is consistent. We have to count up from the first valid gfx (major/minor/patch with non-zero values) we find starting at zero. There are 3 different env vars for selecting GPUs, and only ROCR_VISIBLE_DEVICES supports UUID based identification, so we should favor that one, and try to use UUIDs if detected to avoid potential ordering bugs with numeric IDs * ROCR_VISIBLE_DEVICES only works on linux Use the numeric ID only HIP_VISIBLE_DEVICES on windows
-
- 05 Sep, 2024 1 commit
-
-
Michael authored
* Update gpu.md Seems strange that the laptop versions of 3050 and 3050 Ti would be supported but not the non-notebook, but this is what the page (https://developer.nvidia.com/cuda-gpus ) says. Signed-off-by:bean5 <2052646+bean5@users.noreply.github.com> * Update gpu.md Remove notebook reference --------- Signed-off-by:
bean5 <2052646+bean5@users.noreply.github.com>
-
- 20 Jul, 2024 1 commit
-
-
Daniel Hiltgen authored
The v5 hip library returns unsupported GPUs which wont enumerate at inference time in the runner so this makes sure we align discovery. The gfx906 cards are no longer supported so we shouldn't compile with that GPU type as it wont enumerate at runtime.
-
- 01 Jul, 2024 1 commit
-
-
Eduard authored
Runs fine on a NVIDIA GeForce GTX 1050 Ti
-
- 14 Jun, 2024 1 commit
-
-
Patrick Devine authored
-
- 21 Mar, 2024 2 commits
-
-
Daniel Hiltgen authored
-
Bruce MacDonald authored
-