- 09 Apr, 2024 1 commit
Daniel Hiltgen authored
During testing, we're seeing some models take over 3 minutes.
-
- 06 Apr, 2024 1 commit
Michael Yang authored
-
- 03 Apr, 2024 1 commit
Michael Yang authored
-
- 02 Apr, 2024 2 commits
Daniel Hiltgen authored
-
Michael Yang authored
-
- 01 Apr, 2024 1 commit
Daniel Hiltgen authored
This should resolve a number of memory-leak and stability defects by allowing us to isolate llama.cpp in a separate process, shut it down when idle, and gracefully restart it if it has problems. This also serves as a first step toward running multiple copies to support multiple models concurrently.
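
The pattern described above (run the worker in its own process, stop it when idle, restart it on failure) can be sketched roughly as follows. This is an illustrative Python sketch only, not the code from this commit: `SupervisedRunner`, `ensure_running`, `reap_if_idle`, and `WORKER_CMD` are hypothetical names, and the child is a sleeping Python process standing in for a real llama.cpp server.

```python
import subprocess
import sys
import time


class SupervisedRunner:
    """Hypothetical supervisor that runs a worker in a child process on demand."""

    def __init__(self, cmd, idle_timeout=5.0, clock=time.monotonic):
        self.cmd = cmd
        self.idle_timeout = idle_timeout
        self.clock = clock          # injectable so idle logic can be tested
        self.proc = None
        self.last_used = clock()
        self.starts = 0             # how many times the child was launched

    def is_running(self):
        return self.proc is not None and self.proc.poll() is None

    def ensure_running(self):
        """Start the child if needed; restart it if it has exited."""
        if not self.is_running():
            self.proc = subprocess.Popen(self.cmd)
            self.starts += 1
        self.last_used = self.clock()
        return self.proc

    def stop(self):
        """Terminate the child, escalating to kill if it does not exit."""
        if self.proc is None:
            return
        if self.proc.poll() is None:
            self.proc.terminate()
            try:
                self.proc.wait(timeout=5)
            except subprocess.TimeoutExpired:
                self.proc.kill()
                self.proc.wait()
        self.proc = None

    def reap_if_idle(self):
        """Stop the child once it has been unused for idle_timeout seconds."""
        idle_for = self.clock() - self.last_used
        if self.is_running() and idle_for >= self.idle_timeout:
            self.stop()
            return True
        return False


# Stand-in worker: a child that only sleeps (an assumption for illustration).
WORKER_CMD = [sys.executable, "-c", "import time; time.sleep(60)"]
```

A caller would invoke `ensure_running()` before each request and `reap_if_idle()` periodically. Because all state lives in the child process, any memory it leaks is returned to the OS whenever the child is stopped or restarted.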
-