- 12 Sep, 2024 1 commit
-
-
Daniel Hiltgen authored
* Optimize container images for startup. This change adjusts how runner payloads are handled to support container builds where they are kept extracted in the filesystem. This makes it easier to optimize the cpu/cuda vs cpu/rocm images for size, and should result in faster startup times for container images.
* Refactor payload logic and add buildx support for faster builds
* Move payloads around
* Review comments
* Converge to buildx based helper scripts
* Use docker buildx action for release
-
- 01 May, 2024 1 commit
-
-
Mark Ward authored
-
- 01 Apr, 2024 1 commit
-
-
Daniel Hiltgen authored
This should resolve a number of memory leak and stability defects by allowing us to isolate llama.cpp in a separate process, shut it down when idle, and gracefully restart it if it has problems. This also serves as a first step toward running multiple copies to support multiple models concurrently.
-
- 15 Feb, 2024 1 commit
-
-
Daniel Hiltgen authored
This focuses on Windows first, but could be used for Mac and possibly Linux in the future.
-
- 19 Dec, 2023 1 commit
-
-
Daniel Hiltgen authored
Run server.cpp directly inside the Go runtime via cgo while retaining the LLM Go abstractions.
-
- 27 Nov, 2023 1 commit
-
-
Jason Jacobs authored
-
- 24 Nov, 2023 1 commit
-
-
Jing Zhang authored
* Support cuda build in Windows
* Enable dynamic NumGPU allocation for Windows
-
- 18 Nov, 2023 1 commit
-
-
Jeffrey Morgan authored
-
- 30 Aug, 2023 2 commits
-
-
Jeffrey Morgan authored
-
Bruce MacDonald authored
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm
-
- 28 Jul, 2023 1 commit
-
-
Jeffrey Morgan authored
-
- 22 Jul, 2023 1 commit
-
-
jk1jk authored
-
- 12 Jul, 2023 1 commit
-
-
Jeffrey Morgan authored
-
- 11 Jul, 2023 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
- 06 Jul, 2023 2 commits
-
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
- 26 Jun, 2023 1 commit
-
-
Bruce MacDonald authored
-
- 25 Jun, 2023 2 commits
-
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
-
- 23 Jun, 2023 3 commits
-
-
Bruce MacDonald authored
-
Bruce MacDonald authored
-
Bruce MacDonald authored
-
- 22 Jun, 2023 1 commit
-
-
Jeffrey Morgan authored
-