"vscode:/vscode.git/clone" did not exist on "f7aaff08b2dbe3352439724111cfd0f4a43a50b9"
- 01 Apr, 2024 (1 commit)
Daniel Hiltgen authored
This should resolve a number of memory-leak and stability defects by allowing us to isolate llama.cpp in a separate process, shut it down when idle, and gracefully restart it if it has problems. It also serves as a first step toward running multiple copies to support multiple models concurrently.
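A minimal sketch of that subprocess approach in Go, assuming a hypothetical `./llama-runner` binary and `--port` flag (not the actual Ollama runner): a supervisor starts the llama.cpp server as a child process, restarts lazily after an unexpected exit, and kills it after an idle timeout, so a crash inside llama.cpp no longer takes down the main process.

```go
package main

import (
	"log"
	"os/exec"
	"sync"
	"time"
)

// runner supervises a llama.cpp server subprocess: it clears state when the
// process exits unexpectedly and stops it after a period with no requests.
type runner struct {
	mu       sync.Mutex
	cmd      *exec.Cmd
	lastUsed time.Time
	stopping bool
}

// ensureStarted launches the subprocess if it is not already running.
func (r *runner) ensureStarted() error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.cmd != nil {
		return nil
	}
	cmd := exec.Command("./llama-runner", "--port", "11434") // hypothetical binary and flag
	if err := cmd.Start(); err != nil {
		return err
	}
	r.cmd = cmd
	r.stopping = false
	r.lastUsed = time.Now()

	// Watch for exits in the background; a crash inside llama.cpp only
	// clears the state here, and the next request starts a fresh process.
	go func() {
		err := cmd.Wait()
		r.mu.Lock()
		defer r.mu.Unlock()
		if !r.stopping {
			log.Printf("runner exited unexpectedly: %v", err)
		}
		r.cmd = nil
	}()
	return nil
}

// touch records activity so the idle reaper does not stop a busy process.
func (r *runner) touch() {
	r.mu.Lock()
	r.lastUsed = time.Now()
	r.mu.Unlock()
}

// reapIdle stops the subprocess once it has been idle for idleTimeout.
func (r *runner) reapIdle(idleTimeout time.Duration) {
	for range time.Tick(time.Second) {
		r.mu.Lock()
		if r.cmd != nil && time.Since(r.lastUsed) > idleTimeout {
			log.Println("idle; stopping runner")
			r.stopping = true
			_ = r.cmd.Process.Kill() // a real implementation would shut down gracefully
		}
		r.mu.Unlock()
	}
}

func main() {
	r := &runner{}
	if err := r.ensureStarted(); err != nil {
		log.Fatal(err)
	}
	go r.reapIdle(5 * time.Minute)
	// Before each request: r.ensureStarted(); r.touch(); then proxy to the subprocess.
	select {} // block forever in this sketch
}
```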
- 04 Jan, 2024 (1 commit)
Daniel Hiltgen authored
- 19 Dec, 2023 (1 commit)
Daniel Hiltgen authored
Run server.cpp directly inside the Go runtime via cgo while retaining the LLM Go abstractions.
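A minimal sketch of that cgo pattern, with a stand-in C function in place of the real server.cpp entry points (the `llama_predict` name and the `LLM` interface are assumptions, not the project's actual API): the Go side keeps an interface-based abstraction while the work happens in C code compiled into the same process.

```go
package main

/*
// Stand-in for the real llama.cpp / server.cpp entry points, which take
// far more parameters; this only shows the cgo call shape.
#include <stdlib.h>
#include <string.h>

static char* llama_predict(const char* prompt) {
    const char* suffix = " ... generated text";
    char* out = malloc(strlen(prompt) + strlen(suffix) + 1);
    strcpy(out, prompt);
    strcat(out, suffix);
    return out; // caller frees
}
*/
import "C"

import (
	"fmt"
	"unsafe"
)

// LLM is the Go-side abstraction retained in front of the C code.
type LLM interface {
	Predict(prompt string) (string, error)
}

// llamaCgo calls into the C code linked into this process via cgo.
type llamaCgo struct{}

func (l *llamaCgo) Predict(prompt string) (string, error) {
	cPrompt := C.CString(prompt)
	defer C.free(unsafe.Pointer(cPrompt))

	cOut := C.llama_predict(cPrompt)
	defer C.free(unsafe.Pointer(cOut))

	return C.GoString(cOut), nil
}

func main() {
	var llm LLM = &llamaCgo{}
	out, _ := llm.Predict("why is the sky blue?")
	fmt.Println(out)
}
```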
- 05 Sep, 2023 (1 commit)
Jeffrey Morgan authored
- 30 Aug, 2023 (1 commit)
Bruce MacDonald authored
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm
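A small sketch of the request-context idea from the list above, assuming a hypothetical `generate` function and `/generate` endpoint rather than the project's real handlers: the incoming request's context is threaded into the generation loop, so a disconnected client or an app shutdown stops the in-flight llama.cpp work instead of letting it run to completion.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

// generate stands in for the call into llama_cpp; the token loop is
// hypothetical. It checks ctx between tokens so a cancelled request
// stops the work promptly.
func generate(ctx context.Context, prompt string, emit func(string)) error {
	for i := 0; i < 100; i++ { // pretend each iteration decodes one token
		select {
		case <-ctx.Done():
			return ctx.Err() // client went away or the app is shutting down
		default:
		}
		emit(fmt.Sprintf("token-%d ", i))
		time.Sleep(50 * time.Millisecond)
	}
	return nil
}

func main() {
	http.HandleFunc("/generate", func(w http.ResponseWriter, r *http.Request) {
		// r.Context() is cancelled when the client disconnects, so the
		// runner does not keep generating for abandoned requests.
		err := generate(r.Context(), r.URL.Query().Get("prompt"), func(tok string) {
			fmt.Fprint(w, tok)
			if f, ok := w.(http.Flusher); ok {
				f.Flush() // stream tokens as they are produced
			}
		})
		if err != nil {
			fmt.Println("generation stopped:", err)
		}
	})
	_ = http.ListenAndServe("127.0.0.1:8080", nil)
}
```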