- 19 Jun, 2024 1 commit
-
-
Shukant Pal authored
-
- 17 Jun, 2024 2 commits
-
-
Kunshang Ji authored
Co-authored-by: Jiang Li <jiang1.li@intel.com>
Co-authored-by: Abhilash Majumder <abhilash.majumder@intel.com>
Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
-
Amit Garg authored
-
- 12 Jun, 2024 1 commit
-
-
Woosuk Kwon authored
-
- 05 Jun, 2024 1 commit
-
-
Woosuk Kwon authored
-
- 18 May, 2024 1 commit
-
-
SangBin Cho authored
Currently, the rotary embedding kernel must be called once per LoRA, which makes it hard to serve multiple long-context LoRAs. This commit adds a batched rotary embedding kernel and pipes it through, replacing the rotary embedding layer with one that is aware of multiple cos-sin caches, one per scaling factor. Follow-up to https://github.com/vllm-project/vllm/pull/3095/files
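The idea described above can be illustrated with a minimal pure-Python sketch. It is not the kernel from this commit: the function names (`build_batched_cache`, `rotate`, `batched_rotary`), the linear position scaling, and the NeoX-style pairing of element `i` with `i + half` are all assumptions made for illustration. The caches for every scaling factor are concatenated into one table, and each token carries an offset into that table, so a single pass handles tokens that belong to different scaling factors.

```python
import math


def build_batched_cache(rot_dim, max_pos, scaling_factors, base=10000.0):
    """Concatenate one cos-sin cache per scaling factor (hypothetical helper).

    Returns the combined cache and a dict mapping each scaling factor to the
    row where its cache starts. Linear scaling is assumed: positions are
    divided by the scaling factor and the cache length grows with it.
    """
    inv_freq = [base ** (-2 * i / rot_dim) for i in range(rot_dim // 2)]
    cache, offsets = [], {}
    for s in scaling_factors:
        offsets[s] = len(cache)
        for pos in range(int(max_pos * s)):
            angles = [(pos / s) * f for f in inv_freq]
            cache.append(([math.cos(a) for a in angles],
                          [math.sin(a) for a in angles]))
    return cache, offsets


def rotate(vec, cos, sin):
    """Rotate one vector, pairing element i with element i + half."""
    half = len(vec) // 2
    x1, x2 = vec[:half], vec[half:]
    first = [a * c - b * s for a, b, c, s in zip(x1, x2, cos, sin)]
    second = [b * c + a * s for a, b, c, s in zip(x1, x2, cos, sin)]
    return first + second


def batched_rotary(vectors, positions, cache_offsets, cache):
    """Apply rotary embedding to all tokens in one pass.

    Each token selects its row as cache_offsets[i] + positions[i], so tokens
    with different scaling factors can share a single call.
    """
    return [rotate(v, *cache[off + pos])
            for v, pos, off in zip(vectors, positions, cache_offsets)]


cache, offsets = build_batched_cache(rot_dim=4, max_pos=8,
                                     scaling_factors=[1.0, 2.0])
q = [1.0, 2.0, 3.0, 4.0]
out = batched_rotary(
    [q, q, q],
    positions=[1, 2, 0],
    cache_offsets=[offsets[1.0], offsets[2.0], offsets[2.0]],
    cache=cache,
)
```

In this sketch, position 2 under scaling factor 2.0 produces the same rotation as position 1 under scaling factor 1.0, and position 0 leaves the vector unchanged, which is the behavior expected from linear scaling.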
-
- 17 May, 2024 1 commit
-
-
Jinzhen Lin authored
-
- 08 May, 2024 1 commit
-
-
SangBin Cho authored
-
- 01 May, 2024 1 commit
-
-
Jee Li authored
-
- 29 Apr, 2024 1 commit
-
-
SangBin Cho authored
-
- 25 Apr, 2024 1 commit
-
-
Caio Mendes authored
-
- 23 Apr, 2024 1 commit
-
-
SangBin Cho authored
-
- 12 Apr, 2024 1 commit
-
-
Michael Feil authored
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
-
- 11 Apr, 2024 1 commit
-
-
Kunshang Ji authored
-
- 25 Mar, 2024 1 commit
-
-
Kunshang Ji authored
-
- 13 Mar, 2024 2 commits
-
-
Antoni Baum authored
-
Terry authored
-
- 23 Feb, 2024 1 commit
-
-
Woosuk Kwon authored
-
- 22 Feb, 2024 1 commit
-
-
44670 authored
-
- 01 Feb, 2024 1 commit
-
-
Kunshang Ji authored
Co-authored-by: Jiang Li <jiang1.li@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
-
- 03 Dec, 2023 1 commit
-
-
Woosuk Kwon authored
-
- 30 Nov, 2023 2 commits
- 29 Nov, 2023 1 commit
-
-
Woosuk Kwon authored
-
- 24 Nov, 2023 1 commit
-
-
Yanming W authored
-
- 13 Nov, 2023 1 commit
-
-
Woosuk Kwon authored
-
- 03 Nov, 2023 1 commit
-
-
Antoni Baum authored
Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
Co-authored-by: Viktor Ferenczi <viktor@ferenczi.eu>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
- 27 Sep, 2023 1 commit
-
-
Lily Liu authored
Co-authored-by: Wing Lian <wing.lian@gmail.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-