- 11 May, 2024 5 commits
-
-
todashuta authored
-
Michael Yang authored
-
Daniel Hiltgen authored
Fix envconfig unit test
-
Patrick Devine authored
-
- 10 May, 2024 15 commits
-
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Fall back to CPU runner with zero layers
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Integration fixes
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Always use the sorted list of GPUs
-
Daniel Hiltgen authored
Make sure the first GPU has the most free space
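The two commit titles above ("Always use the sorted list of GPUs" and "Make sure the first GPU has the most free space") describe an ordering rule that can be illustrated with a small sketch. This is not the project's code: the project is written in Go, and Python is used here only for brevity. The `GpuInfo` type, its fields, and `sort_by_free_memory` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GpuInfo:
    # Hypothetical record; the field names are assumptions, not the project's types.
    id: str
    free_bytes: int


def sort_by_free_memory(gpus: List[GpuInfo]) -> List[GpuInfo]:
    """Return a new list ordered so the GPU with the most free memory comes first."""
    # sorted() is stable, so GPUs with equal free memory keep their input order.
    return sorted(gpus, key=lambda g: g.free_bytes, reverse=True)


gpus = [
    GpuInfo("gpu-0", 2 * 1024**3),
    GpuInfo("gpu-1", 8 * 1024**3),
    GpuInfo("gpu-2", 4 * 1024**3),
]
ordered = sort_by_free_memory(gpus)
print([g.id for g in ordered])
```

Returning a new list rather than sorting in place makes it harder for a caller to accidentally keep using the unsorted order, which is the kind of mistake the first title appears to address.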
-
Jeffrey Morgan authored
* rename `--quantization` to `--quantize`
* backwards
* Update api/types.go

Co-authored-by: Michael Yang <mxyng@pm.me>
-
Michael Yang authored
add phi2 mem
-
Michael Yang authored
-
Jeffrey Morgan authored
* dont clamp ctx size in `PredictServerFit`
* minimum 4 context
* remove context warning
-
Daniel Hiltgen authored
Bump VRAM buffer back up
-
Daniel Hiltgen authored
We're seeing OOMs under stress scenarios, so this should help stabilize allocations under heavy concurrency.
-
Michael Yang authored
-
Michael Yang authored
-
- 09 May, 2024 20 commits
-
-
Bruce MacDonald authored
-
Michael Yang authored
fix typo
-
Jeffrey Morgan authored
-
Michael Yang authored
-
Michael Yang authored
only forward some env vars
-
Michael Yang authored
log clean up
-
Daniel Hiltgen authored
Fix race in shutdown logic
-
Daniel Hiltgen authored
Ensure the runners are terminated
-
Zander Lewis authored
-
Michael Yang authored
-
Daniel Hiltgen authored
Wait for GPU free memory reporting to converge
-
Daniel Hiltgen authored
The GPU drivers take a while to update their free memory reporting, so we need to wait until the values converge with what we're expecting before proceeding to start another runner in order to get an accurate picture.
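The waiting behavior described in this commit message can be sketched as a polling loop with a tolerance and a timeout. This is a minimal illustration under stated assumptions, not the project's implementation (which is written in Go). The function name, the 5% tolerance, the timeout, and the polling interval are all hypothetical values chosen for the example.

```python
import time
from typing import Callable


def wait_for_free_memory(
    read_free: Callable[[], int],
    expected_free: int,
    tolerance: float = 0.05,   # assumed: accept readings within 5% of the expectation
    timeout: float = 5.0,      # assumed: give up after 5 seconds
    interval: float = 0.25,    # assumed: delay between polls
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll read_free() until it is close to expected_free or the timeout expires.

    Returns True if the reading converged, False if the deadline passed first.
    """
    deadline = clock() + timeout
    while True:
        free = read_free()
        if abs(free - expected_free) <= tolerance * expected_free:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Usage: simulate a driver whose reported free memory lags behind reality.
readings = iter([100, 400, 990])
converged = wait_for_free_memory(lambda: next(readings), expected_free=1000, interval=0.0)
print(converged)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays. A caller would start the next runner only after this returns, and could fall back to the last reading if it returns False.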
-
Michael Yang authored
-
Daniel Hiltgen authored
Record more GPU information
-
Daniel Hiltgen authored
This cleans up the logging for GPU discovery a bit, and can serve as a foundation to report GPU information in a future UX.
-
Daniel Hiltgen authored
Harden subprocess reaping
-
Bruce MacDonald authored
-
Michael Yang authored
routes: skip invalid filepaths
-
Michael Yang authored
-
Daniel Hiltgen authored
-