- 21 Jul, 2024 1 commit
sroy745 authored
[Spec Decode] Disable Log Prob serialization to CPU for spec decoding for both draft and target models. (#6485)
- 10 Jul, 2024 1 commit
sroy745 authored
[Speculative Decoding] Enabling bonus token in speculative decoding for KV cache based models (#5765)
- 01 Jul, 2024 1 commit
sroy745 authored
- 05 Jun, 2024 1 commit
Nick Hill authored
- 25 May, 2024 1 commit
Lily Liu authored
- 08 May, 2024 1 commit
Cody Yu authored
Co-authored-by: Cade Daniel <edacih@gmail.com>