- 21 Jun, 2024 1 commit
-
-
Joshua Rosenkranz authored
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Davis Wertheimer <Davis.Wertheimer@ibm.com>
-
- 15 Jun, 2024 1 commit
-
-
Cyrus Leung authored
-
- 11 Jun, 2024 1 commit
-
-
Nick Hill authored
-
- 05 Jun, 2024 1 commit
-
-
Nick Hill authored
-
- 25 May, 2024 1 commit
-
-
Lily Liu authored
-
- 22 May, 2024 1 commit
-
-
Nick Hill authored
-
- 16 May, 2024 1 commit
-
-
Cody Yu authored
Co-authored-by: Cade Daniel <edacih@gmail.com>
Co-authored-by: Cade Daniel <cade@anyscale.com>
-
- 15 May, 2024 1 commit
-
-
SangBin Cho authored
[Core][2/N] Model runner refactoring part 2. Combine prepare prefill / decode to a single API (#4681) This PR combines prepare_prompt and prepare_decode into a single API. It also coalesces the attention metadata for prefill and decode into a single class that can be sliced when running the attention backend, and it refactors subquery_start_loc, which was not refactored in the previous PR.
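The idea of one metadata object that covers both phases and can be sliced per phase might look roughly like the sketch below. This is an illustration only, not the project's actual class: `BatchMetadata`, its fields, and the assumption that prefill tokens are laid out before decode tokens are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BatchMetadata:
    """Hypothetical combined metadata for a mixed prefill/decode batch.

    Assumption: prefill sequences (and their tokens) come first in the
    batch, followed by decode sequences (one token each).
    """

    seq_lens: List[int]       # one entry per sequence, prefills first
    num_prefills: int         # number of prefill sequences
    slot_mapping: List[int]   # one entry per token, prefill tokens first
    num_prefill_tokens: int   # number of tokens that belong to prefills

    def prefill_slice(self) -> "BatchMetadata":
        # Keep only the leading prefill sequences and their tokens.
        return BatchMetadata(
            seq_lens=self.seq_lens[: self.num_prefills],
            num_prefills=self.num_prefills,
            slot_mapping=self.slot_mapping[: self.num_prefill_tokens],
            num_prefill_tokens=self.num_prefill_tokens,
        )

    def decode_slice(self) -> "BatchMetadata":
        # Keep only the trailing decode sequences and their tokens.
        return BatchMetadata(
            seq_lens=self.seq_lens[self.num_prefills:],
            num_prefills=0,
            slot_mapping=self.slot_mapping[self.num_prefill_tokens:],
            num_prefill_tokens=0,
        )


# Two prefill sequences (3 + 2 tokens) followed by two decode sequences.
meta = BatchMetadata(
    seq_lens=[3, 2, 7, 9],
    num_prefills=2,
    slot_mapping=[0, 1, 2, 10, 11, 26, 38],
    num_prefill_tokens=5,
)
```

With a layout like this, a backend could call `meta.prefill_slice()` and `meta.decode_slice()` and dispatch each part to the matching kernel, while the caller only builds one object per batch.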
-
- 13 May, 2024 1 commit
-
-
Cody Yu authored
-
- 11 May, 2024 1 commit
-
-
Chang Su authored
-
- 10 May, 2024 1 commit
-
-
SangBin Cho authored
Storing an exception frame is extremely prone to circular references because the frame holds references to objects. When tensorizer is not installed, the LLM instance leaks because the error frame holds references to various modules, which creates a circular reference. I also found that spec decoding has a circular reference issue, and I solved it using weakref.proxy.
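A minimal sketch of the weakref.proxy approach mentioned above, assuming a child object that needs a back-reference to its owner. `Engine` and `Worker` are hypothetical stand-ins, not classes from the codebase, and the immediate-release check relies on CPython's reference counting.

```python
import gc
import weakref


class Worker:
    """Hypothetical child object that calls back into its owner."""

    def __init__(self, owner):
        # weakref.proxy does not increase the owner's reference count,
        # so owner -> worker -> owner is not a strong reference cycle.
        self.owner = weakref.proxy(owner)

    def owner_name(self):
        return self.owner.name


class Engine:
    """Hypothetical owner object, standing in for an LLM instance."""

    def __init__(self, name):
        self.name = name
        self.worker = Worker(self)


def engine_is_freed_without_gc():
    # Disable the cycle collector so only reference counting can free objects.
    gc.disable()
    try:
        engine = Engine("demo")
        probe = weakref.ref(engine)
        assert engine.worker.owner_name() == "demo"
        del engine
        # With no strong cycle, the engine is released as soon as the
        # last strong reference goes away (CPython behavior).
        return probe() is None
    finally:
        gc.enable()
```

If `Worker` stored `owner` directly instead of a proxy, the two objects would keep each other alive until the cycle collector ran, which is the kind of delayed release the commit message describes.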
-
- 08 May, 2024 2 commits
-
-
Cade Daniel authored
-
Cody Yu authored
Co-authored-by: Cade Daniel <edacih@gmail.com>
-
- 07 May, 2024 1 commit
-
-
leiwen83 authored
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
Co-authored-by: Cade Daniel <edacih@gmail.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
-
- 04 May, 2024 1 commit
-
-
Cody Yu authored
-
- 03 May, 2024 1 commit
-
-
Cade Daniel authored
-
- 01 May, 2024 1 commit
-
-
leiwen83 authored
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
-
- 26 Apr, 2024 1 commit
-
-
SangBin Cho authored
Co-authored-by: Danny Guinther <dguinther@neuralmagic.com>
-
- 23 Apr, 2024 1 commit
-
-
Cade Daniel authored
-
- 18 Apr, 2024 1 commit
-
-
SangBin Cho authored
Co-authored-by: SangBin Cho <sangcho@sangcho-LT93GQWG9C.local>
-
- 16 Apr, 2024 1 commit
-
-
Cade Daniel authored
-
- 09 Apr, 2024 1 commit
-
-
Cade Daniel authored
[Misc] [Core] Implement RFC "Augment BaseExecutor interfaces to enable hardware-agnostic speculative decoding" (#3837)
-
- 02 Apr, 2024 1 commit
-
-
Michael Goin authored
-
- 25 Mar, 2024 1 commit
-
-
SangBin Cho authored
-
- 22 Mar, 2024 1 commit
-
-
Zhuohan Li authored
-
- 11 Mar, 2024 1 commit
-
-
Zhuohan Li authored
-
- 09 Mar, 2024 1 commit
-
-
Cade Daniel authored
-