- 09 May, 2024 2 commits
-
-
Bruce MacDonald authored
-
Daniel Hiltgen authored
-
- 08 May, 2024 2 commits
-
-
Daniel Hiltgen authored
This records more GPU usage information for eventual UX inclusion.
-
Michael Yang authored
-
- 07 May, 2024 2 commits
-
-
Daniel Hiltgen authored
This will bubble up a much more informative error message if noexec is preventing us from running the subprocess.
-
Michael Yang authored
-
- 06 May, 2024 5 commits
-
-
Michael Yang authored
-
Michael Yang authored
- FROM /path/to/{safetensors,pytorch}
- FROM /path/to/fp{16,32}.bin
- FROM model:fp{16,32}
-
Daniel Hiltgen authored
Trying to live off the land for CUDA libraries was not the right strategy. We need to use the version we compiled against to ensure things work properly.
-
Jeffrey Morgan authored
-
Jeffrey Morgan authored
* fix llava models not working after first request
* individual requests only for llava models
-
- 05 May, 2024 1 commit
-
-
Daniel Hiltgen authored
This moves all the env var reading into one central module and logs the loaded config once at startup, which should help in troubleshooting user server logs.
-
- 04 May, 2024 1 commit
-
-
Michael Yang authored
-
- 01 May, 2024 5 commits
-
-
Mark Ward authored
-
Mark Ward authored
-
Mark Ward authored
Log while waiting for the process to stop, to help debug cases where other tasks execute during this wait. Expire timer: clear the timer reference because it will not be reused. Close: clean up expireTimer if the calling code has not already done so.
-
Mark Ward authored
-
Jeffrey Morgan authored
-
- 30 Apr, 2024 4 commits
-
-
jmorganca authored
-
jmorganca authored
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
* Bump llama.cpp to b2761
* Adjust types for bump
-
- 29 Apr, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 27 Apr, 2024 3 commits
-
-
Hernan Martinez authored
-
Hernan Martinez authored
-
Hernan Martinez authored
-
- 26 Apr, 2024 9 commits
-
-
Daniel Hiltgen authored
This will speed up CI which already tries to only build static for unit tests
-
Daniel Hiltgen authored
-
Michael Yang authored
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
This will make it simpler for CI to accumulate artifacts from prior steps
-
- 25 Apr, 2024 4 commits
-
-
Jeffrey Morgan authored
* llm: limit generation to 10x context size to avoid run-on generations
* add comment
* simplify condition statement
-
Michael Yang authored
-
jmorganca authored
-
Roy Yang authored
-
- 24 Apr, 2024 1 commit
-
-
Patrick Devine authored
-