- 25 May, 2024 1 commit
-
-
Daniel Hiltgen authored
If the client closes the connection before we finish loading the model, we abort, so let's make the log message clearer about why, to help users understand this failure mode
-
- 24 May, 2024 1 commit
-
-
Jeffrey Morgan authored
-
- 23 May, 2024 7 commits
-
-
Daniel Hiltgen authored
Tidy up developer guide a little
-
Daniel Hiltgen authored
-
Michael Yang authored
-
Daniel Hiltgen authored
Wire up load progress
-
Daniel Hiltgen authored
This doesn't expose a UX yet, but wires the initial server portion of progress reporting during load
-
Bruce MacDonald authored
Co-authored-by: ManniX-ITA <20623405+mann1x@users.noreply.github.com>
-
Jeffrey Morgan authored
* put flash attention behind flag for now
* add test
* remove print
* up timeout for scheduler tests
-
- 22 May, 2024 3 commits
-
-
Michael authored
-
Ikko Eltociear Ashimine authored
PreTokenziers -> PreTokenizers
-
Josh authored
add Ctrl + W shortcut
-
- 21 May, 2024 8 commits
-
-
Josh Yan authored
-
Patrick Devine authored
-
Michael Yang authored
simplify safetensors reading
-
Michael Yang authored
Convert directly from llama3
-
Sang Park authored
Corrected the spelling of "request", which was mistakenly written as "requeset" in the error log message.
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
- 20 May, 2024 18 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Patrick Devine authored
-
Patrick Devine authored
-
Patrick Devine authored
-
Patrick Devine authored
-
Patrick Devine authored
-
Michael Yang authored
fix quantize file types
-
jmorganca authored
-
Josh Yan authored
-
Michael Yang authored
-
Michael Yang authored
-
alwqx authored
-
Michael Yang authored
cache and reuse intermediate blobs
-
Sam authored
* feat: enable flash attention if supported
* feat: enable flash attention if supported
* feat: enable flash attention if supported
* feat: add flash_attn support
-
Michael Yang authored
particularly useful for zipfiles and f16s
-
Patrick Devine authored
-
jmorganca authored
-
- 18 May, 2024 1 commit
-
-
Patrick Devine authored
-
- 17 May, 2024 1 commit
-
-
Daniel Hiltgen authored
Don't return error on signal exit
-