- 26 Mar, 2024 2 commits
  - Patrick Devine authored
  - Daniel Hiltgen authored
    This should hopefully only be a temporary workaround until Rocky 8 picks up GCC 10.4, which fixes the NVCC bug.
- 23 Mar, 2024 1 commit
  - Daniel Hiltgen authored
    This uplevels the integration tests to run against the server, which allows testing an existing server or a remote server.
- 15 Mar, 2024 2 commits
  - Daniel Hiltgen authored
    Flesh out our GitHub Actions CI so we can build official releases.
  - Daniel Hiltgen authored
- 11 Mar, 2024 1 commit
  - Jeffrey Morgan authored
- 10 Mar, 2024 1 commit
  - Jeffrey Morgan authored
- 07 Mar, 2024 1 commit
  - Daniel Hiltgen authored
    This refines where we extract the LLM libraries to by adding a new OLLAMA_HOME env var, which defaults to `~/.ollama`. The logic was already idempotent, so this should speed up startups after the first time a new release is deployed. It also cleans up after itself.

    We now build only a single ROCm version (latest major) on both Windows and Linux. Given the large size of ROCm's tensor files, we split the dependency out: it's bundled into the installer on Windows, and a separate download on Linux. The Linux install script now detects the presence of AMD GPUs, checks whether ROCm v6 is already present, and if not, downloads our dependency tar file.

    For Linux discovery, we now use sysfs and check each GPU against what ROCm supports, so we can degrade to CPU gracefully instead of having llama.cpp+rocm assert/crash on us. For Windows, we now use Go's Windows dynamic library loading logic to access the amdhip64.dll APIs to query GPU information.
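The sysfs-based AMD GPU discovery described above can be sketched roughly as follows. This is a hedged illustration, not the actual implementation: the `/sys/class/drm` path layout and the check against AMD's PCI vendor ID (`0x1002`) are assumptions about a typical Linux system.

```shell
# Rough sketch of sysfs GPU discovery on Linux (illustrative only).
# 0x1002 is AMD's PCI vendor ID; /sys/class/drm is the common device layout.
found=0
for dev in /sys/class/drm/card*/device/vendor; do
  [ -e "$dev" ] || continue
  if [ "$(cat "$dev")" = "0x1002" ]; then
    echo "AMD GPU candidate: ${dev%/vendor}"
    found=1
  fi
done
[ "$found" -eq 1 ] || echo "no AMD GPU found; falling back to CPU"
```

A real implementation would additionally read the device/gfx version and compare it against the set ROCm supports before deciding to use the GPU.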
- 29 Feb, 2024 1 commit
  - Daniel Hiltgen authored
    On openSUSE, ollama needs to be a member of the video group to access the GPU.
- 27 Feb, 2024 1 commit
  - Daniel Hiltgen authored
    Allow overriding the platform, image name, and "latest" tag for the standard and ROCm images.
- 22 Feb, 2024 2 commits
  - Jeffrey Morgan authored
  - Jeffrey Morgan authored
- 21 Feb, 2024 3 commits
  - Josh authored
  - Jeffrey Morgan authored
    * remove `-w -s` linker flags on windows
    * use `zip` for windows installer compression
  - Jeffrey Morgan authored
- 16 Feb, 2024 1 commit
  - Daniel Hiltgen authored
    Also fixes a few fit-and-finish items for a better developer experience.
- 15 Feb, 2024 4 commits
  - Daniel Hiltgen authored
    This will be useful for our automated test rigging, and may be useful for advanced users who want to "roll their own" system service.
  - jmorganca authored
  - Daniel Hiltgen authored
    This focuses on Windows first, but could be used for Mac and possibly Linux in the future.
  - Daniel Hiltgen authored
- 09 Feb, 2024 1 commit
  - Jeffrey Morgan authored
- 26 Jan, 2024 1 commit
  - Daniel Hiltgen authored
    This adds ROCm support back as a discrete image.
- 23 Jan, 2024 1 commit
  - Daniel Hiltgen authored
    If a VERSION is not specified, this will generate a version string that represents the state of the repo. For example, `0.1.21-12-gffaf52e1-dirty` represents 12 commits past the 0.1.21 tag, at commit ffaf52e1 (the `g` prefix marks a git hash), with a dirty working tree.
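A version string of that shape is what `git describe` produces. A minimal sketch in a throwaway repo, assuming a tag name chosen to mirror the example (the real build script may pass different flags):

```shell
# Build a throwaway repo and show the shape of git-describe output.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "initial"
git tag 0.1.21
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "one more"
VERSION=$(git describe --tags --dirty)
echo "$VERSION"   # shape: 0.1.21-1-g<shorthash>
```

With a dirty working tree, `--dirty` appends the `-dirty` suffix seen in the example above.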
- 21 Jan, 2024 2 commits
  - Daniel Hiltgen authored
    The Linux build now supports parallel CPU builds to speed things up. This also exposes AMD GPU targets as an optional setting for advanced users who want to alter our default set.
  - Daniel Hiltgen authored
    This renames Dockerfile.build to Dockerfile, and adds some new stages to support 2 modes of building: the build_linux.sh script uses intermediate stages to extract the artifacts for ./dist, and the default build generates a container image usable by both CUDA and ROCm cards. This required transitioning the x86 base to the ROCm image to avoid layer bloat.
- 19 Jan, 2024 1 commit
  - Jeffrey Morgan authored
- 17 Jan, 2024 1 commit
  - Daniel Hiltgen authored
    This also refines the build process for the ext_server build.
- 16 Jan, 2024 1 commit
  - Michael Yang authored
    Repos for Fedora 38 and newer do not exist as of this commit:
    ```
    $ dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/fedora38/x86_64/cuda-fedora38.repo
    Adding repo from: https://developer.download.nvidia.com/compute/cuda/repos/fedora38/x86_64/cuda-fedora38.repo
    Status code: 404 for https://developer.download.nvidia.com/compute/cuda/repos/fedora38/x86_64/cuda-fedora38.repo (IP: 152.195.19.142)
    Error: Configuration of repo failed
    ```
- 11 Jan, 2024 2 commits
  - Daniel Hiltgen authored
    This reduces the built-in Linux version to not use any vector extensions, which enables the resulting builds to run under Rosetta on macOS in Docker. At runtime it then checks for the actual CPU vector extensions and loads the best CPU library available.
  - Daniel Hiltgen authored
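Runtime selection of the best CPU library, as described in the first commit above, can be sketched like this. It's a hedged illustration only: the variant names and the `/proc/cpuinfo` check are assumptions, and the real loader does not work this way.

```shell
# Pick a CPU-library variant based on detected vector extensions (Linux).
# Falls back to the plain "cpu" build when /proc/cpuinfo is unavailable.
if grep -qw avx2 /proc/cpuinfo 2>/dev/null; then
  variant=cpu_avx2
elif grep -qw avx /proc/cpuinfo 2>/dev/null; then
  variant=cpu_avx
else
  variant=cpu
fi
echo "selected variant: $variant"
```

The key design point is that the detection happens on the machine actually running the binary, so a single build artifact degrades gracefully on CPUs (or emulators like Rosetta) that lack AVX.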
- 10 Jan, 2024 1 commit
  - Daniel Hiltgen authored
    This can help speed up incremental builds when you're only testing one architecture, like amd64. E.g. `BUILD_ARCH=amd64 ./scripts/build_linux.sh && scp ./dist/ollama-linux-amd64 test-system:`
- 09 Jan, 2024 1 commit
  - Jeffrey Morgan authored
- 05 Jan, 2024 1 commit
  - Michael Yang authored
- 04 Jan, 2024 1 commit
  - Daniel Hiltgen authored
    This prevents users from accidentally installing on WSL1, with instructions guiding them to upgrade their WSL instance to version 2. Once running WSL2, if you have an NVIDIA card you can follow NVIDIA's instructions to set up GPU passthrough and run models on the GPU; this is not possible on WSL1.
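One common heuristic for telling WSL1 from WSL2 is the kernel release string. A hedged sketch follows; the substrings matched are assumptions about typical WSL kernels, not necessarily what the install script checks:

```shell
# Classify the environment by kernel release string (illustrative heuristic).
kernel=$(uname -r)
case "$kernel" in
  *microsoft-standard*) env_kind="WSL2" ;;   # WSL2 kernels usually carry this suffix
  *[Mm]icrosoft*)       env_kind="WSL1" ;;   # older WSL1 kernels mention Microsoft
  *)                    env_kind="not WSL" ;;
esac
echo "$env_kind"
```

On a non-WSL Linux box this prints `not WSL`; the `case` ordering matters because the WSL2 pattern is a subset of the broader `microsoft` match.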
- 03 Jan, 2024 2 commits
  - Daniel Hiltgen authored
    For the ROCm libraries to access the driver, we need to add the ollama user to the render group.
  - Jeffrey Morgan authored
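Group membership changes like the one above typically come down to a `usermod -aG` call. A non-destructive sketch that only prints the commands it would run (the `ollama` user and the render/video group names mirror the commits in this log; actually running `usermod` requires root):

```shell
# Print (rather than execute) the usermod calls a service installer might run.
user=ollama
for grp in render video; do
  if getent group "$grp" >/dev/null 2>&1; then
    echo "usermod -aG $grp $user"
  else
    echo "group $grp not present; skipping"
  fi
done
```

The `-a` flag is important: without it, `-G` would replace the user's supplementary groups instead of appending to them.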
- 23 Dec, 2023 1 commit
  - Daniel Hiltgen authored
    This should help CI avoid running the integration test logic in a container, where it's not currently possible.
- 22 Dec, 2023 3 commits
  - Jeffrey Morgan authored
  - Daniel Hiltgen authored
    By default, builds will now produce non-debug and non-verbose binaries. To enable verbose logs in llama.cpp and debug symbols in the native code, set `CGO_CFLAGS=-g`.
  - Daniel Hiltgen authored