- 10 Jul, 2024 1 commit
-
-
sroy745 authored
[Speculative Decoding] Enabling bonus token in speculative decoding for KV cache based models (#5765)
-
- 25 Jun, 2024 1 commit
-
-
Woo-Yeon Lee authored
[Speculative Decoding] Support draft model on different tensor-parallel size than target model (#5414)
-
- 05 Jun, 2024 1 commit
-
-
Nick Hill authored
-