- 17 Apr, 2024 1 commit

Michael Yang authored
- 16 Apr, 2024 2 commits

Michael Yang authored

Michael Yang authored
- 15 Apr, 2024 1 commit

Jeffrey Morgan authored

* terminate subprocess if receiving `SIGINT` or `SIGTERM` signals while model is loading
* use `unload` in signal handler
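
A minimal Go sketch of the pattern described above, assuming a hypothetical `unload` helper and placeholder runner binary and flags (not the repository's actual code):

```go
package main

import (
	"log"
	"os"
	"os/exec"
	"os/signal"
	"syscall"
)

// unload kills the runner subprocess if it is still alive.
// (hypothetical helper; stands in for the real unload logic)
func unload(cmd *exec.Cmd) {
	if cmd != nil && cmd.Process != nil {
		cmd.Process.Kill()
	}
}

func main() {
	// Start the llama.cpp runner as a subprocess (binary and flags are
	// placeholders for illustration).
	cmd := exec.Command("./llama-server", "--model", "model.gguf")
	if err := cmd.Start(); err != nil {
		log.Fatal(err)
	}

	// If SIGINT or SIGTERM arrives while the model is still loading,
	// unload the subprocess instead of leaving it orphaned.
	sig := make(chan os.Signal, 1)
	signal.Notify(sig, syscall.SIGINT, syscall.SIGTERM)
	go func() {
		<-sig
		unload(cmd)
		os.Exit(1)
	}()

	if err := cmd.Wait(); err != nil {
		log.Printf("runner exited: %v", err)
	}
}
```
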
- 10 Apr, 2024 2 commits

Michael Yang authored

Michael Yang authored
- 09 Apr, 2024 1 commit

Daniel Hiltgen authored

During testing, we're seeing some models take over 3 minutes.
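
A minimal sketch of what that implies for a load timeout, assuming a hypothetical `loadTimeout` constant and `waitForReady` probe (not the actual change in this commit):

```go
package runner

import (
	"context"
	"fmt"
	"time"
)

// Hypothetical load budget: testing shows some models take over 3 minutes,
// so the timeout must comfortably exceed that.
const loadTimeout = 5 * time.Minute

// waitForReady polls a readiness probe until the model has loaded or the
// budget is exhausted.
func waitForReady(ctx context.Context, ready func() bool) error {
	ctx, cancel := context.WithTimeout(ctx, loadTimeout)
	defer cancel()

	tick := time.NewTicker(time.Second)
	defer tick.Stop()
	for {
		select {
		case <-ctx.Done():
			return fmt.Errorf("model did not load within %v", loadTimeout)
		case <-tick.C:
			if ready() {
				return nil
			}
		}
	}
}
```
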
- 06 Apr, 2024 1 commit

Michael Yang authored
- 03 Apr, 2024 1 commit

Michael Yang authored
- 02 Apr, 2024 2 commits

Daniel Hiltgen authored

Michael Yang authored
- 01 Apr, 2024 1 commit

Daniel Hiltgen authored

This should resolve a number of memory-leak and stability defects by isolating llama.cpp in a separate process that shuts down when idle and restarts gracefully if it runs into problems. It is also a first step toward running multiple copies to support multiple models concurrently.
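
A rough Go sketch of that supervision pattern, with hypothetical names, binary path, and flags (not the code from this change): a supervisor starts the runner on demand, reaps it when idle, and restarts it transparently after a crash.

```go
package runner

import (
	"log"
	"os/exec"
	"sync"
	"time"
)

// Runner supervises a llama.cpp subprocess so that leaks or crashes in the
// engine are isolated from the main server process.
type Runner struct {
	mu       sync.Mutex
	cmd      *exec.Cmd
	lastUsed time.Time
}

// EnsureStarted launches the subprocess if it is not already running; a
// crashed subprocess is restarted transparently on the next call.
func (r *Runner) EnsureStarted() error {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.lastUsed = time.Now()
	if r.cmd != nil {
		return nil // already running
	}
	cmd := exec.Command("./llama-server", "--model", "model.gguf")
	if err := cmd.Start(); err != nil {
		return err
	}
	r.cmd = cmd
	go func() {
		// When the subprocess exits (idle shutdown or crash), clear the
		// handle so the next request starts a fresh one.
		err := cmd.Wait()
		log.Printf("runner exited: %v", err)
		r.mu.Lock()
		if r.cmd == cmd {
			r.cmd = nil
		}
		r.mu.Unlock()
	}()
	return nil
}

// ReapIdle periodically stops the subprocess once it has been unused for
// idleTimeout, releasing its memory between requests.
func (r *Runner) ReapIdle(idleTimeout time.Duration) {
	for range time.Tick(idleTimeout / 2) {
		r.mu.Lock()
		if r.cmd != nil && time.Since(r.lastUsed) > idleTimeout {
			r.cmd.Process.Kill()
		}
		r.mu.Unlock()
	}
}
```

Killing from the reaper while letting the `Wait` goroutine clear the handle keeps restart logic in one place, and a crash in llama.cpp then costs only a subprocess restart instead of taking down the whole server.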