Commits · afd0da2186c1d58fb48e138df0a2f548612b5d7d · OpenDAS / vllm_cscc

15 Jan, 2025 1 commit

[Platform] Do not raise error if _Backend is not found (#12023) · 3adf0ffd

wangxiyuan authored Jan 15, 2025


Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>

3adf0ffd

09 Jan, 2025 1 commit

[platform] Allow platform specify attention backend (#11609) · 405eb8e3

wangxiyuan authored Jan 09, 2025


Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>

405eb8e3

30 Dec, 2024 1 commit
- [platforms] enable platform plugins (#11602) · b12e87f9
  youkaichao authored Dec 30, 2024
```
Signed-off-by: youkaichao <youkaichao@gmail.com>
```
  b12e87f9
05 Dec, 2024 1 commit
- added support for kernels tests with torch 2.3 · e150cf11
  zhuwenwen authored Dec 05, 2024
  
  e150cf11
04 Dec, 2024 1 commit
- remove unsupported tests from kernels and add pytest html · a3d96521
  zhuwenwen authored Dec 04, 2024
  
  a3d96521
19 Nov, 2024 1 commit
- [Platform][Refactor] Extract func `get_default_attn_backend` to `Platform` (#10358) · 8c1fb507
  Mengqing Cao authored Nov 19, 2024
```
Signed-off-by: Mengqing Cao <cmq0113@163.com>
```
  8c1fb507
10 Nov, 2024 1 commit
- update tests of kernels and basic_correctness · c012f7f6
  zhuwenwen authored Nov 10, 2024
  
  c012f7f6
06 Nov, 2024 1 commit
- [V1] Make v1 more testable (#9888) · d58268c5
  Joe Runde authored Nov 06, 2024
```
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
```
  d58268c5
28 Oct, 2024 1 commit
- [Hardware][ROCM] using current_platform.is_rocm (#9642) · 4e2d95e3
  wangshuai09 authored Oct 28, 2024
```
Signed-off-by: wangshuai09 <391746016@qq.com>
```
  4e2d95e3
26 Oct, 2024 1 commit
- [Hardware][openvino] is_openvino --> current_platform.is_openvino (#9716) · 5cbdccd1
  Mengqing Cao authored Oct 26, 2024
  
  5cbdccd1
22 Oct, 2024 1 commit
- [Hardware][CPU] using current_platform.is_cpu (#9536) · 3ddbe255
  wangshuai09 authored Oct 22, 2024
  
  3ddbe255
20 Oct, 2024 1 commit
- [Kernel] Support sliding window in flash attention backend (#9403) · 4fa3e333
  Chen Zhang authored Oct 20, 2024
  
  4fa3e333
11 Oct, 2024 1 commit
- [Model] Support Mamba (#6484) · 7342a7d7
  Tyler Michael Smith authored Oct 11, 2024
  
  7342a7d7
18 Sep, 2024 1 commit
- [CI/Build] Avoid CUDA initialization (#8534) · 6ffa3f31
  Cyrus Leung authored Sep 18, 2024
  
  6ffa3f31
06 Aug, 2024 1 commit

[Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942) · fd95e026

afeldman-nm authored Aug 06, 2024

Co-authored-by: Andrew Feldman <afeld2012@gmail.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>

fd95e026

08 Jul, 2024 1 commit

[Kernel] Correctly invoke prefill & decode kernels for cross-attention (towards eventual encoder/decoder model support) (#4888) · 543aa485

afeldman-nm authored Jul 08, 2024

Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>

543aa485

28 Jun, 2024 1 commit
- [Hardware][Intel] OpenVINO vLLM backend (#5379) · 57f09a41
  Ilya Lavrenov authored Jun 28, 2024
  
  57f09a41
15 Jun, 2024 1 commit
- update kernel test · 21c06ecb
  zhuwenwen authored Jun 15, 2024
  
  21c06ecb
04 Jun, 2024 1 commit
- [Bugfix]: During testing, use pytest monkeypatch for safely overriding the env var that indicates the vLLM backend (#5210) · f42a006b
  afeldman-nm authored Jun 03, 2024
  
  f42a006b
22 May, 2024 1 commit
- [Misc] Take user preference in attention selector (#4960) · ee3eea0a
  Cody Yu authored May 22, 2024
  
  ee3eea0a