- 10 May, 2024 3 commits
-
-
Daniel Hiltgen authored
-
Michael Yang authored
-
Jeffrey Morgan authored
* don't clamp ctx size in `PredictServerFit`
* minimum 4 context
* remove context warning
-
- 09 May, 2024 5 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Bruce MacDonald authored
-
Daniel Hiltgen authored
-
- 08 May, 2024 2 commits
-
-
Daniel Hiltgen authored
This records more GPU usage information for eventual UX inclusion.
-
Michael Yang authored
-
- 07 May, 2024 2 commits
-
-
Daniel Hiltgen authored
This will bubble up a much more informative error message if noexec is preventing us from running the subprocess.
-
Michael Yang authored
-
- 06 May, 2024 5 commits
-
-
Michael Yang authored
-
Michael Yang authored
- FROM /path/to/{safetensors,pytorch}
- FROM /path/to/fp{16,32}.bin
- FROM model:fp{16,32}
-
Daniel Hiltgen authored
Trying to live off the land for CUDA libraries was not the right strategy. We need to use the version we compiled against to ensure things work properly.
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
* fix llava models not working after first request
* individual requests only for llava models
-
- 05 May, 2024 1 commit
-
-
Daniel Hiltgen authored
This moves all the env var reading into one central module and logs the loaded config once at startup, which should help in troubleshooting user server logs.
-
- 04 May, 2024 1 commit
-
-
Michael Yang authored
-
- 01 May, 2024 5 commits
-
-
Mark Ward authored
-
Mark Ward authored
-
Mark Ward authored
Log when waiting for the process to stop, to help debug cases where other tasks execute during this wait. Expire timer: clear the timer reference because it will not be reused. Close will clean up expireTimer if the calling code has not already done so.
-
Mark Ward authored
-
Jeffrey Morgan authored
-
- 30 Apr, 2024 4 commits
-
-
jmorganca authored
-
jmorganca authored
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
* Bump llama.cpp to b2761
* Adjust types for bump
-
- 29 Apr, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 27 Apr, 2024 3 commits
-
-
Hernan Martinez authored
-
Hernan Martinez authored
-
Hernan Martinez authored
-
- 26 Apr, 2024 8 commits
-
-
Daniel Hiltgen authored
This will speed up CI, which already tries to build only static for unit tests.
-
Daniel Hiltgen authored
-
Michael Yang authored
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
-