Commits · 4d3a2c284ecc47748273664c9a6ef302ff3adcbe · OpenDAS / vllm_cscc

27 Nov, 2024 1 commit
- add VLLM_OPTEST_MODELS_PATH/OPTEST_MODELS_PATH to load models from local path... · 3c9817d2
  zhuwenwen authored Nov 27, 2024
```
add VLLM_OPTEST_MODELS_PATH/OPTEST_MODELS_PATH to load models from local path instead of Hugging Face Hub
```
07 Nov, 2024 1 commit
- [Feature] [Spec decode]: Combine chunked prefill with speculative decoding (#9291) · 9d43afcc
  Nicolò Lucchesi authored Nov 07, 2024
```
Signed-off-by: NickLucche <nlucches@redhat.com>
```
10 Jul, 2024 1 commit
- [Speculative Decoding] Enabling bonus token in speculative decoding for KV... · ae151d73
  sroy745 authored Jul 10, 2024
```
[Speculative Decoding] Enabling bonus token in speculative decoding for KV cache based models (#5765)
```
05 Jun, 2024 1 commit
- [Speculative Decoding] Add `ProposerWorkerBase` abstract class (#5252) · faf71bcd
  Nick Hill authored Jun 05, 2024
  
13 May, 2024 1 commit
- [Speculative decoding] Improve n-gram efficiency (#4724) · ce532ff4
  Cody Yu authored May 13, 2024
  
04 May, 2024 1 commit
- [Misc][Refactor] Introduce ExecuteModelData (#4540) · bc8ad684
  Cody Yu authored May 03, 2024
  
03 May, 2024 1 commit
- [Core][Model runner refactoring 1/N] Refactor attn metadata term (#4518) · 3521ba4f
  SangBin Cho authored May 04, 2024
  
01 May, 2024 1 commit
- [Speculative decoding] Add ngram prompt lookup decoding (#4237) · b38e42fb
  leiwen83 authored May 02, 2024
```
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
```