- 01 Apr, 2024 4 commits
-
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
This should resolve a number of memory leak and stability defects by allowing us to isolate llama.cpp in a separate process, shut it down when idle, and gracefully restart it if it has problems. This also serves as a first step toward running multiple copies to support multiple models concurrently.
-
Michael Yang authored
count each layer independently when deciding gpu offloading
-
Michael Yang authored
-
- 29 Mar, 2024 1 commit
-
-
Patrick Devine authored
Co-authored-by: Michael Yang <mxyng@pm.me>
-
- 26 Mar, 2024 3 commits
-
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
Patrick Devine authored
-
- 25 Mar, 2024 2 commits
-
-
Daniel Hiltgen authored
-
Jeremy authored
-
- 24 Mar, 2024 1 commit
-
-
Blake Mizerany authored
-
- 23 Mar, 2024 2 commits
-
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
The release just before ggml-cuda.cu refactoring
-
- 20 Mar, 2024 1 commit
-
-
Daniel Hiltgen authored
If expanding the runners fails, don't leave a corrupt or incomplete payloads dir. We now write a pid file out to the tmpdir, which allows us to scan for stale tmpdirs and remove them as long as the owning process is no longer running.
-
- 18 Mar, 2024 1 commit
-
-
Michael Yang authored
-
- 16 Mar, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 15 Mar, 2024 3 commits
-
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Flesh out our GitHub Actions CI so we can build official releases.
-
Blake Mizerany authored
This replaces some brittle, simple equality checks with errors.Is. Since Go 1.13, errors.Is is the idiomatic way to check for errors.
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
-
- 14 Mar, 2024 1 commit
-
-
Michael Yang authored
-
- 13 Mar, 2024 2 commits
-
-
Jeffrey Morgan authored
-
Bruce MacDonald authored
-
- 12 Mar, 2024 5 commits
-
-
Bruce MacDonald authored
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
-
Michael Yang authored
-
racerole authored
Signed-off-by: racerole <jiangyifeng@outlook.com>
-
- 11 Mar, 2024 3 commits
-
-
Bruce MacDonald authored
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
Putting the rocm symlink next to the runners is risky. This moves the payloads into a subdir to avoid potential clashes.
-
- 10 Mar, 2024 4 commits
-
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
- 09 Mar, 2024 4 commits
-
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
The recent ROCm change partially removed idempotent payloads, but the ggml-metal.metal file for Mac was still idempotent. This finishes switching to always extracting the payloads, and now that idempotency is gone, the version directory is no longer useful.
-
Jeffrey Morgan authored
-
- 08 Mar, 2024 2 commits
-
-
Michael Yang authored
-
Jeffrey Morgan authored
-