- 05 Jun, 2024 1 commit
-
-
Blake Mizerany authored
-
- 04 Jun, 2024 7 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
- 24 May, 2024 2 commits
-
-
Patrick Devine authored
-
Tim Scheuermann authored
-
- 23 May, 2024 1 commit
-
-
Jeffrey Morgan authored
* put flash attention behind flag for now
* add test
* remove print
* up timeout for scheduler tests
-
- 21 May, 2024 1 commit
-
-
Sang Park authored
Corrected the spelling of "request", which was previously misspelled as "requeset" in the error log message.
-
- 20 May, 2024 4 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
particularly useful for zipfiles and f16s
-
Patrick Devine authored
-
- 16 May, 2024 1 commit
-
-
Daniel Hiltgen authored
-
- 15 May, 2024 1 commit
-
-
Patrick Devine authored
-
- 14 May, 2024 13 commits
-
-
Patrick Devine authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Ryo Machida authored
* Fixed the API endpoint /api/tags to return {models: []} instead of {models: null} when the model list is empty.
* Update server/routes.go
---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
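The `null` versus `[]` difference described in this commit follows from how Go's `encoding/json` treats slices: a nil slice marshals to `null`, while an empty non-nil slice marshals to `[]`. The following is a minimal sketch of that behavior, not the code in server/routes.go; the `listResponse` type and the `encodeModels` helper are hypothetical names introduced only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// listResponse is a hypothetical stand-in for the /api/tags response body.
type listResponse struct {
	Models []string `json:"models"`
}

// encodeModels (hypothetical helper) replaces a nil slice with an empty,
// non-nil slice before marshaling, so the JSON contains [] instead of null.
func encodeModels(models []string) string {
	if models == nil {
		models = []string{}
	}
	b, err := json.Marshal(listResponse{Models: models})
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// A nil slice marshals to null.
	var none []string
	raw, _ := json.Marshal(listResponse{Models: none})
	fmt.Println(string(raw)) // {"models":null}

	// After normalization the same input marshals to an empty array.
	fmt.Println(encodeModels(none)) // {"models":[]}
}
```

Clients that iterate over `models` without a null check are the usual reason to prefer the empty array.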
-
Daniel Hiltgen authored
The APIs we query are optimistic on free space, and Windows pages VRAM, so we don't have to wait to see reported usage recover on unload
-
Patrick Devine authored
-
Patrick Devine authored
-
- 12 May, 2024 2 commits
-
-
Michael Yang authored
- 10 May, 2024 5 commits
-
-
Daniel Hiltgen authored
-
Daniel Hiltgen authored
Make sure the first GPU has the most free space
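One way to guarantee that ordering is to sort the device list by free memory in descending order. The sketch below is an assumption about how that could look, not the commit's actual change; the `gpuInfo` type, its fields, and `sortByFreeMemory` are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// gpuInfo is a hypothetical record describing one device.
type gpuInfo struct {
	ID         string
	FreeMemory uint64 // bytes currently free on the device
}

// sortByFreeMemory (hypothetical helper) orders gpus in place so that the
// device with the most free memory comes first. The stable sort keeps the
// original relative order of devices with equal free memory.
func sortByFreeMemory(gpus []gpuInfo) {
	sort.SliceStable(gpus, func(i, j int) bool {
		return gpus[i].FreeMemory > gpus[j].FreeMemory
	})
}

func main() {
	gpus := []gpuInfo{
		{ID: "gpu0", FreeMemory: 2 << 30},
		{ID: "gpu1", FreeMemory: 8 << 30},
		{ID: "gpu2", FreeMemory: 4 << 30},
	}
	sortByFreeMemory(gpus)
	fmt.Println(gpus[0].ID) // gpu1
}
```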
-
Jeffrey Morgan authored
* rename `--quantization` to `--quantize`
* backwards
* Update api/types.go
Co-authored-by: Michael Yang <mxyng@pm.me>
---------
Co-authored-by: Michael Yang <mxyng@pm.me>
-
Jeffrey Morgan authored
* don't clamp ctx size in `PredictServerFit`
* minimum 4 context
* remove context warning
-
Michael Yang authored
-
- 09 May, 2024 2 commits
-
-
Jeffrey Morgan authored
-
Daniel Hiltgen authored
Ensure the runners are terminated
-