Commits · 660470e5a36b8e52083615ad7c85e9b4fd4c72ce · OpenDAS / vllm_cscc

08 Jul, 2024 1 commit

[Kernel] Correctly invoke prefill & decode kernels for cross-attention... · 543aa485

afeldman-nm authored Jul 08, 2024


[Kernel] Correctly invoke prefill & decode kernels for cross-attention (towards eventual encoder/decoder model support) (#4888)
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>

543aa485

04 Jun, 2024 1 commit
- [Bugfix]: During testing, use pytest monkeypatch for safely overriding the env... · f42a006b
  afeldman-nm authored Jun 03, 2024
```
[Bugfix]: During testing, use pytest monkeypatch for safely overriding the env var that indicates the vLLM backend (#5210)
```
  f42a006b