- 14 Jun, 2024 1 commit
-
-
Daniel Hiltgen authored
Still not complete. Our prediction needs refinement to account for the free space on each discrete GPU, so we can see how many layers fit on each one. Because a single layer can't be split across multiple GPUs, we can't treat free space as one logical block.
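A minimal sketch of why pooled free space overestimates capacity, assuming every layer has the same size. The names `layersPerGPU` and `pooledEstimate` are hypothetical and are not taken from the actual change:

```go
package main

// layersPerGPU returns how many whole layers fit on each GPU and the total
// across all GPUs. A layer is never split, so each GPU's free space is
// rounded down independently. Hypothetical helper for illustration only.
func layersPerGPU(freeBytes []uint64, layerSize uint64) ([]int, int) {
	perGPU := make([]int, len(freeBytes))
	total := 0
	if layerSize == 0 {
		return perGPU, 0
	}
	for i, free := range freeBytes {
		perGPU[i] = int(free / layerSize)
		total += perGPU[i]
	}
	return perGPU, total
}

// pooledEstimate is the naive count that treats all free space as one
// logical block. It can exceed what actually fits.
func pooledEstimate(freeBytes []uint64, layerSize uint64) int {
	if layerSize == 0 {
		return 0
	}
	var sum uint64
	for _, free := range freeBytes {
		sum += free
	}
	return int(sum / layerSize)
}
```

With two GPUs that each have 5 units free and a layer size of 3, the pooled estimate is 3 layers, but only 1 layer fits on each GPU, for 2 in total.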
-
- 13 Jun, 2024 2 commits
-
-
Patrick Devine authored
-
Jeffrey Morgan authored
-
- 12 Jun, 2024 1 commit
-
-
Michael Yang authored
Multiple templates may appear in a model if it is created from another model that (1) has an autodetected template and (2) defines a custom template.
-
- 10 Jun, 2024 2 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
- 07 Jun, 2024 1 commit
-
-
Michael Yang authored
-
- 06 Jun, 2024 3 commits
-
-
Michael Yang authored
-
royjhan authored
* API app/browser access * Add tauri (resolves #2291, #4791, #3799, #4388)
-
royjhan authored
* Remove false time fields * Struct Separation for List and Process * Remove Marshaler
-
- 05 Jun, 2024 1 commit
-
-
Blake Mizerany authored
-
- 04 Jun, 2024 7 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
- 24 May, 2024 2 commits
-
-
Patrick Devine authored
-
Tim Scheuermann authored
-
- 23 May, 2024 1 commit
-
-
Jeffrey Morgan authored
* put flash attention behind flag for now * add test * remove print * up timeout for scheduler tests
-
- 21 May, 2024 1 commit
-
-
Sang Park authored
Corrected the spelling of "request" in the error log message, where it was previously written as "requeset".
-
- 20 May, 2024 4 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
particularly useful for zipfiles and f16s
-
Patrick Devine authored
-
- 16 May, 2024 1 commit
-
-
Daniel Hiltgen authored
-
- 15 May, 2024 1 commit
-
-
Patrick Devine authored
-
- 14 May, 2024 12 commits
-
-
Patrick Devine authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Ryo Machida authored
* Fixed the API endpoint /api/tags to return {models: []} instead of {models: null} when the model list is empty. * Update server/routes.go --------- Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
-
Daniel Hiltgen authored
The APIs we query are optimistic about free space, and Windows pages VRAM, so we don't have to wait for reported usage to recover on unload.
-
Patrick Devine authored
-