- 17 Jul, 2024 1 commit
-
-
shangmingc authored
Co-authored-by: caishangming.csm <caishangming.csm@alibaba-inc.com>
-
- 10 Jul, 2024 1 commit
-
-
sroy745 authored
[Speculative Decoding] Enabling bonus token in speculative decoding for KV cache based models (#5765)
-
- 15 Jun, 2024 1 commit
-
-
Cyrus Leung authored
-
- 05 Jun, 2024 1 commit
-
-
Nick Hill authored
-
- 22 May, 2024 1 commit
-
-
Nick Hill authored
-
- 13 May, 2024 1 commit
-
-
Cody Yu authored
-
- 10 May, 2024 1 commit
-
-
SangBin Cho authored
Storing an exception frame is extremely prone to circular references because the frame holds references to objects. When tensorizer is not installed, the LLM instance leaks because the error frame references various modules, which creates a circular reference. I also found that spec decoding has a circular reference issue, and I solved it using weakref.proxy.
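The weakref.proxy approach mentioned above can be illustrated with a minimal sketch. This is not the code from the commit: `Engine`, `Worker`, and `freed_without_gc` are hypothetical names chosen for illustration, and the check relies on CPython's reference counting, which is an assumption about the runtime.

```python
import gc
import weakref


class Worker:
    """Hypothetical child object that needs to call back into its owner."""

    def __init__(self, owner, use_weakref):
        # A strong back-reference creates an owner <-> worker cycle;
        # weakref.proxy lets the worker reach the owner without owning it.
        self.owner = weakref.proxy(owner) if use_weakref else owner


class Engine:
    """Hypothetical parent object standing in for a long-lived LLM instance."""

    def __init__(self, use_weakref):
        self.worker = Worker(self, use_weakref)


def freed_without_gc(use_weakref):
    """Return True if dropping the last name frees the Engine immediately."""
    gc.disable()  # rule out the cycle collector so only refcounting acts
    try:
        engine = Engine(use_weakref)
        probe = weakref.ref(engine)  # observes the Engine without keeping it alive
        del engine
        return probe() is None
    finally:
        gc.enable()
        gc.collect()  # clean up the cycle left by the strong-reference case


strong_freed = freed_without_gc(use_weakref=False)
weak_freed = freed_without_gc(use_weakref=True)
print(strong_freed, weak_freed)
```

With a strong back-reference the object survives until the cycle collector runs, while the proxy variant is released as soon as the last strong reference goes away. The trade-off is that using the proxy after its referent has been freed raises `ReferenceError`, so the owner must outlive every use of the child.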
-
- 08 May, 2024 1 commit
-
-
Cade Daniel authored
-
- 04 May, 2024 1 commit
-
-
Cody Yu authored
-
- 03 May, 2024 1 commit
-
-
Cade Daniel authored
-
- 01 May, 2024 1 commit
-
-
leiwen83 authored
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
-