"inference_use_pkl.py" did not exist on "cc1d6094e7dbc2c815b92cadf6023665ad45a44f"
- 01 Jul, 2024 (1 commit)
  Josh Yan authored
- 01 Apr, 2024 (1 commit)
  Daniel Hiltgen authored
  This should resolve a number of memory-leak and stability defects by isolating llama.cpp in a separate process that we can shut down when idle and gracefully restart if it runs into problems. It also serves as a first step toward running multiple copies to support multiple models concurrently.
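  The message describes a supervisor pattern: spawn the llama.cpp server as a child process, restart it if it dies, and reap it after an idle period. Below is a minimal Go sketch of that lifecycle; the identifiers (runnerSupervisor, binPath, the ./llama-server path, and the flags) are hypothetical illustrations, not Ollama's actual code.

  ```go
  // Minimal sketch of the subprocess lifecycle described above.
  // All identifiers (runnerSupervisor, binPath, ./llama-server, the
  // flags) are hypothetical, not Ollama's actual implementation.
  package main

  import (
  	"log"
  	"os/exec"
  	"sync"
  	"time"
  )

  // runnerSupervisor keeps one llama.cpp server process alive while it
  // is in use and stops it after an idle period.
  type runnerSupervisor struct {
  	mu       sync.Mutex
  	cmd      *exec.Cmd
  	lastUsed time.Time
  	binPath  string   // path to the llama.cpp server binary (assumed)
  	args     []string // model/context flags (assumed)
  }

  // ensureRunning starts the child process if needed and records the
  // access time so the idle reaper keeps it alive.
  func (s *runnerSupervisor) ensureRunning() error {
  	s.mu.Lock()
  	defer s.mu.Unlock()
  	s.lastUsed = time.Now()
  	if s.cmd != nil {
  		return nil // already running
  	}
  	cmd := exec.Command(s.binPath, s.args...)
  	if err := cmd.Start(); err != nil {
  		return err
  	}
  	s.cmd = cmd
  	// Reap the process when it exits (crash or idle shutdown) so the
  	// next request can transparently restart it.
  	go func() {
  		err := cmd.Wait()
  		s.mu.Lock()
  		s.cmd = nil
  		s.mu.Unlock()
  		if err != nil {
  			log.Printf("runner exited: %v (will restart on next use)", err)
  		}
  	}()
  	return nil
  }

  // reapIdle kills the child once it has been unused for idleTimeout.
  func (s *runnerSupervisor) reapIdle(idleTimeout time.Duration) {
  	for range time.Tick(idleTimeout / 2) {
  		s.mu.Lock()
  		if s.cmd != nil && time.Since(s.lastUsed) > idleTimeout {
  			s.cmd.Process.Kill() // Wait goroutine above clears s.cmd
  		}
  		s.mu.Unlock()
  	}
  }

  func main() {
  	s := &runnerSupervisor{
  		binPath: "./llama-server",             // hypothetical path
  		args:    []string{"-m", "model.gguf"}, // hypothetical flags
  	}
  	go s.reapIdle(5 * time.Minute)
  	if err := s.ensureRunning(); err != nil {
  		log.Fatal(err)
  	}
  	time.Sleep(time.Second) // demo only: let the child start
  }
  ```

  Keeping the runner out-of-process means a crash or leak inside llama.cpp only takes down the child; the Wait goroutine clears the handle so the next request restarts it transparently, which matches the shutdown-when-idle and graceful-restart behavior the commit describes.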