- 24 May, 2024 2 commits
-
-
Patrick Devine authored
-
Jeffrey Morgan authored
-
- 23 May, 2024 7 commits
-
-
Daniel Hiltgen authored
Tidy up developer guide a little
-
Daniel Hiltgen authored
-
Michael Yang authored
-
Daniel Hiltgen authored
Wire up load progress
-
Daniel Hiltgen authored
This doesn't expose a UX yet, but wires the initial server portion of progress reporting during load
-
Bruce MacDonald authored
Co-authored-by: ManniX-ITA <20623405+mann1x@users.noreply.github.com>
-
Jeffrey Morgan authored
* put flash attention behind flag for now * add test * remove print * up timeout for scheduler tests
-
- 22 May, 2024 3 commits
-
-
Michael authored
-
Ikko Eltociear Ashimine authored
PreTokenziers -> PreTokenizers
-
Josh authored
add Ctrl + W shortcut
-
- 21 May, 2024 8 commits
-
-
Josh Yan authored
-
Patrick Devine authored
-
Michael Yang authored
simplify safetensors reading
-
Michael Yang authored
Convert directly from llama3
-
Sang Park authored
Corrected the spelling of "request", previously misspelled as "requeset", in the error log message.
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
- 20 May, 2024 18 commits
-
-
Michael Yang authored
-
Michael Yang authored
-
Patrick Devine authored
-
Patrick Devine authored
-
Patrick Devine authored
-
Patrick Devine authored
-
Patrick Devine authored
-
Michael Yang authored
fix quantize file types
-
jmorganca authored
-
Josh Yan authored
-
Michael Yang authored
-
Michael Yang authored
-
alwqx authored
-
Michael Yang authored
cache and reuse intermediate blobs
-
Sam authored
* feat: enable flash attention if supported * feat: add flash_attn support
-
Michael Yang authored
particularly useful for zipfiles and f16s
-
Patrick Devine authored
-
jmorganca authored
-
- 18 May, 2024 1 commit
-
-
Patrick Devine authored
-
- 17 May, 2024 1 commit
-
-
Daniel Hiltgen authored
Don't return error on signal exit
-