- 15 Jan, 2025 1 commit
-
-
wangxiyuan authored
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
-
- 09 Jan, 2025 1 commit
-
-
wangxiyuan authored
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
-
- 30 Dec, 2024 1 commit
-
-
youkaichao authored
Signed-off-by: youkaichao <youkaichao@gmail.com>
-
- 05 Dec, 2024 1 commit
-
-
zhuwenwen authored
-
- 04 Dec, 2024 1 commit
-
-
zhuwenwen authored
-
- 19 Nov, 2024 1 commit
-
-
Mengqing Cao authored
Signed-off-by: Mengqing Cao <cmq0113@163.com>
-
- 10 Nov, 2024 1 commit
-
-
zhuwenwen authored
-
- 06 Nov, 2024 1 commit
-
-
Joe Runde authored
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
-
- 28 Oct, 2024 1 commit
-
-
wangshuai09 authored
Signed-off-by: wangshuai09 <391746016@qq.com>
-
- 26 Oct, 2024 1 commit
-
-
Mengqing Cao authored
-
- 22 Oct, 2024 1 commit
-
-
wangshuai09 authored
-
- 20 Oct, 2024 1 commit
-
-
Chen Zhang authored
-
- 11 Oct, 2024 1 commit
-
-
Tyler Michael Smith authored
-
- 18 Sep, 2024 1 commit
-
-
Cyrus Leung authored
-
- 06 Aug, 2024 1 commit
-
-
afeldman-nm authored
[Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942)
Co-authored-by: Andrew Feldman <afeld2012@gmail.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
-
- 08 Jul, 2024 1 commit
-
-
afeldman-nm authored
[Kernel] Correctly invoke prefill & decode kernels for cross-attention (towards eventual encoder/decoder model support) (#4888)
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
- 28 Jun, 2024 1 commit
-
-
Ilya Lavrenov authored
-
- 15 Jun, 2024 1 commit
-
-
zhuwenwen authored
-
- 04 Jun, 2024 1 commit
-
-
afeldman-nm authored
[Bugfix] During testing, use pytest monkeypatch for safely overriding the env var that indicates the vLLM backend (#5210)
-
- 22 May, 2024 1 commit
-
-
Cody Yu authored
-