- 08 Oct, 2024 1 commit
-
-
Daniel Hiltgen authored
The recent change to how patches are applied leaves the submodule dirty because "new commits" are present. This change cleans up so the tree no longer reports dirty after a `go generate ./...` run. The Makefile was also too aggressive in cleaning up and deleted the placeholder files, a deletion that someone might accidentally commit.
-
- 17 Sep, 2024 1 commit
-
-
Michael Yang authored
Raw diffs can be applied with `git apply` but not with `git am`. Git patches, e.g. those produced by `git format-patch`, work with both `git apply` and `git am`.
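The distinction above can be sketched as a small check. This is a minimal illustration, not the project's build logic: `patch_command` is a hypothetical helper, and it assumes a patch produced by `git format-patch` can be recognised by its mbox-style first line (`From <commit id> Mon Sep 17 00:00:00 2001`), with anything else treated as a raw diff.

```python
import re

# `git format-patch` writes mbox-style files whose first line is
# "From <commit id> Mon Sep 17 00:00:00 2001". The id is 40 hex digits
# for SHA-1 repositories and 64 for SHA-256 ones.
_MBOX_HEADER = re.compile(r"^From [0-9a-f]{40,64} Mon Sep 17 00:00:00 2001$")


def patch_command(patch_text: str) -> list[str]:
    """Pick a git subcommand for a patch (hypothetical helper).

    Mailbox-style patches can go through `git am`; raw diffs lack the
    mail headers `git am` needs, so they fall back to `git apply`.
    """
    lines = patch_text.splitlines()
    first_line = lines[0] if lines else ""
    if _MBOX_HEADER.match(first_line):
        return ["git", "am"]
    return ["git", "apply"]
```

Since `git apply` accepts both kinds of file, a script that only needs the working tree changed, without creating commits, could skip the check and always use `git apply`.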
-
- 13 Sep, 2024 1 commit
-
-
Daniel Hiltgen authored
scripts: fix incremental builds on linux or similar
-
- 12 Sep, 2024 1 commit
-
-
Daniel Hiltgen authored
* Optimize container images for startup
  This change adjusts how to handle runner payloads to support container builds where we keep them extracted in the filesystem. This makes it easier to optimize the cpu/cuda vs cpu/rocm images for size, and should result in faster startup times for container images.
* Refactor payload logic and add buildx support for faster builds
* Move payloads around
* Review comments
* Converge to buildx based helper scripts
* Use docker buildx action for release
-
- 29 Aug, 2024 1 commit
-
-
Michael Yang authored
-
- 19 Aug, 2024 2 commits
-
-
Daniel Hiltgen authored
This should help speed things up a little
-
Daniel Hiltgen authored
This adjusts linux to follow a similar model to windows, with a discrete archive (zip/tgz) to carry the primary executable and dependent libraries. Runners are still carried as payloads inside the main binary. Darwin retains the payload model, where the go binary is fully self-contained.
-
- 06 Jul, 2024 1 commit
-
-
Jeffrey Morgan authored
* Revert "fix cmake build (#5505)"
  This reverts commit 4fd5f352.
* llm: fix missing dylibs by restoring old build behavior
* crlf -> lf
-
- 05 Jul, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 25 Apr, 2024 1 commit
-
-
Roy Yang authored
-
- 01 Apr, 2024 1 commit
-
-
Daniel Hiltgen authored
This should resolve a number of memory leak and stability defects by allowing us to isolate llama.cpp in a separate process, shut it down when idle, and gracefully restart it if it has problems. This also serves as a first step toward running multiple copies to support multiple models concurrently.
-
- 25 Mar, 2024 1 commit
-
-
Jeremy authored
-
- 12 Mar, 2024 1 commit
-
-
Daniel Hiltgen authored
-
- 07 Mar, 2024 1 commit
-
-
John authored
Signed-off-by: hishope <csqiye@126.com>
-
- 29 Feb, 2024 1 commit
-
-
Bernhard M. Wiedemann authored
See https://reproducible-builds.org/ for why this is good. This patch was done while working on reproducible builds for openSUSE.
-
- 02 Feb, 2024 1 commit
-
-
Daniel Hiltgen authored
Only apply patches if we have any, and make sure to clean up every file we patched at the end to leave the tree clean.
-
- 25 Jan, 2024 1 commit
-
-
Jeffrey Morgan authored
* Fix clearing kv cache between requests with the same prompt
* fix powershell script
-
- 20 Jan, 2024 2 commits
-
-
Daniel Hiltgen authored
-
Jeffrey Morgan authored
-
- 19 Jan, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 17 Jan, 2024 1 commit
-
-
Daniel Hiltgen authored
This also refines the build process for the ext_server build.
-
- 13 Jan, 2024 2 commits
-
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
- 11 Jan, 2024 1 commit
-
-
Daniel Hiltgen authored
This reduces the built-in linux version to not use any vector extensions, which enables the resulting builds to run under Rosetta on macOS in Docker. At runtime it then checks for the actual CPU vector extensions and loads the best CPU library available.
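The runtime selection described above might look roughly like the following sketch. It is an illustration under assumptions, not the code from this commit: the variant names (`cpu`, `cpu_avx`, `cpu_avx2`), the set of variants, and both helper functions are hypothetical, and the flag lookup assumes an x86 Linux `/proc/cpuinfo` with a `flags` line.

```python
# Hypothetical library variants, ordered from most to least capable.
# Each entry lists the CPU feature flags that variant requires.
VARIANTS = [
    ("cpu_avx2", {"avx", "avx2"}),
    ("cpu_avx", {"avx"}),
    ("cpu", set()),  # baseline build: no vector extensions required
]


def read_linux_cpu_flags(path="/proc/cpuinfo"):
    """Return the feature flags from the first `flags` line (x86 Linux only)."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()


def pick_cpu_variant(cpu_flags):
    """Return the first (most capable) variant whose requirements are all met."""
    available = set(cpu_flags)
    for name, required in VARIANTS:
        if required <= available:
            return name
    return "cpu"  # unreachable while the baseline entry exists; kept as a guard
```

For example, `pick_cpu_variant({"sse4_2", "avx"})` selects `cpu_avx`, while an environment that reports no vector extensions falls back to the baseline `cpu` variant.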
-
- 05 Jan, 2024 1 commit
-
-
Bruce MacDonald authored
-
- 04 Jan, 2024 3 commits
-
-
Daniel Hiltgen authored
If the tree has a stale submodule, make sure we clean it up first
-
Daniel Hiltgen authored
-
Jeffrey Morgan authored
* update cmake flags for intel macOS
* remove `LLAMA_K_QUANTS`
* put back `CMAKE_OSX_DEPLOYMENT_TARGET` and disable `LLAMA_F16C`
-
- 02 Jan, 2024 3 commits
-
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Refactor where we store build outputs, and support a fully dynamic loading model on windows so the base executable has no special dependencies and thus doesn't require a special PATH.
-
Daniel Hiltgen authored
This changes the model for llama.cpp inclusion so we're not applying a patch, but instead have the C++ code directly in the ollama tree, which should make it easier to refine and update over time.
-
- 22 Dec, 2023 2 commits
-
-
Daniel Hiltgen authored
By default builds will now produce non-debug and non-verbose binaries. To enable verbose logs in llama.cpp and debug symbols in the native code, set `CGO_CFLAGS=-g`
-
Daniel Hiltgen authored
-
- 20 Dec, 2023 1 commit
-
-
Daniel Hiltgen authored
This switches the default llama.cpp to be CPU based, and builds the GPU variants as dynamically loaded libraries which we can select at runtime. This also bumps the ROCm library to version 6 given 5.7 builds don't work on the latest ROCm library that just shipped.
-
- 19 Dec, 2023 3 commits
-
-
Daniel Hiltgen authored
This changes the container-based linux build to use an older Ubuntu distro to improve our compatibility matrix for older user machines
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Run the server.cpp directly inside the Go runtime via cgo while retaining the LLM Go abstractions.
-