- 25 Sep, 2024 1 commit
-
-
Chen Zhang authored
Co-authored-by: simon-mo <xmo@berkeley.edu>
Co-authored-by: Chang Su <chang.s.su@oracle.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
-
- 18 Sep, 2024 1 commit
-
-
Geun, Lim authored
Co-authored-by: Michael Goin <michael@neuralmagic.com>
-
- 06 Sep, 2024 1 commit
-
-
Nick Hill authored
-
- 02 Sep, 2024 1 commit
-
-
Shawn Tan authored
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
-
- 30 Aug, 2024 1 commit
-
-
Yohan Na authored
-
- 22 Aug, 2024 1 commit
-
-
Abhinav Goyal authored
-
- 21 Aug, 2024 1 commit
-
-
Peter Salas authored
-
- 16 Aug, 2024 1 commit
-
-
Michael Goin authored
-
- 29 Jul, 2024 1 commit
-
-
Isotr0py authored
Co-authored-by: Roger Wang <ywang@roblox.com>
-
- 26 Jul, 2024 1 commit
-
-
Michael Goin authored
-
- 23 Jul, 2024 2 commits
-
-
Roger Wang authored
-
Roger Wang authored
-
- 22 Jul, 2024 1 commit
-
-
Roger Wang authored
-
- 10 Jul, 2024 1 commit
-
-
Abhinav Goyal authored
-
- 01 Jul, 2024 1 commit
-
-
Thomas Parnell authored
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Joshua Rosenkranz <jmrosenk@us.ibm.com>
-
- 27 Jun, 2024 1 commit
-
-
Nick Hill authored
-
- 21 Jun, 2024 1 commit
-
-
Joshua Rosenkranz authored
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Davis Wertheimer <Davis.Wertheimer@ibm.com>
-
- 18 May, 2024 1 commit
-
-
SangBin Cho authored
Currently, the rotary embedding kernel must be called once per LoRA, which makes it hard to serve multiple long-context LoRAs. This commit adds a batched rotary embedding kernel and pipes it through. It replaces the rotary embedding layer with one that is aware of multiple cos-sin caches, one per scaling factor. Follow-up to https://github.com/vllm-project/vllm/pull/3095/files
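The idea in this description can be illustrated with a minimal pure-Python sketch. It is not the kernel added by the commit: the function names (`build_cache`, `build_combined_cache`, `rotate`, `batched_rotary`), the use of linear position scaling, and the half-split pairing of vector elements are all assumptions made for illustration. The sketch concatenates one cos-sin cache per scaling factor into a single table and gives each token an offset into that table, so a single pass can serve tokens that use different scaling factors.

```python
import math

def build_cache(rot_dim, max_pos, base=10000.0, scaling_factor=1.0):
    """Hypothetical helper: (cos, sin) rows for one scaling factor.

    Assumes linear scaling: position p is looked up at angle p / scaling_factor.
    """
    inv_freq = [base ** (-2 * i / rot_dim) for i in range(rot_dim // 2)]
    rows = []
    for pos in range(int(max_pos * scaling_factor)):
        t = pos / scaling_factor
        rows.append(([math.cos(t * f) for f in inv_freq],
                     [math.sin(t * f) for f in inv_freq]))
    return rows

def build_combined_cache(rot_dim, max_pos, scaling_factors):
    """Concatenate the per-factor caches; record where each one starts."""
    combined, offsets = [], {}
    for s in scaling_factors:
        offsets[s] = len(combined)
        combined.extend(build_cache(rot_dim, max_pos, scaling_factor=s))
    return combined, offsets

def rotate(vec, cos, sin):
    """Rotate pairs (vec[i], vec[i + half]) by the cached angles (assumed layout)."""
    half = len(vec) // 2
    x1, x2 = vec[:half], vec[half:]
    return ([a * c - b * s for a, b, c, s in zip(x1, x2, cos, sin)] +
            [b * c + a * s for a, b, c, s in zip(x1, x2, cos, sin)])

def batched_rotary(vectors, positions, token_offsets, combined):
    """One pass over all tokens; each token indexes the combined cache
    at (its cache offset + its position)."""
    out = []
    for vec, pos, off in zip(vectors, positions, token_offsets):
        cos, sin = combined[off + pos]
        out.append(rotate(vec, cos, sin))
    return out

# Two tokens from requests with different (assumed) scaling factors.
combined, offsets = build_combined_cache(rot_dim=4, max_pos=8,
                                         scaling_factors=[1.0, 4.0])
vectors = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
rotated = batched_rotary(vectors, positions=[3, 20],
                         token_offsets=[offsets[1.0], offsets[4.0]],
                         combined=combined)
```

In this sketch the per-token offset is the only thing that distinguishes one scaling factor from another, which is what lets a single call replace one call per LoRA.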
-
- 09 May, 2024 1 commit
-
-
Hao Zhang authored
Co-authored-by: Dash Desai <1723932+iamontheinet@users.noreply.github.com>
Co-authored-by: Aurick Qiao <qiao@aurick.net>
Co-authored-by: Aurick Qiao <aurick.qiao@snowflake.com>
Co-authored-by: Aurick Qiao <aurickq@users.noreply.github.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
-
- 26 Apr, 2024 1 commit
-
-
SangBin Cho authored
Co-authored-by: Danny Guinther <dguinther@neuralmagic.com>
-
- 23 Apr, 2024 1 commit
-
-
SangBin Cho authored
-
- 12 Apr, 2024 1 commit
-
-
Michael Feil authored
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
-
- 27 Mar, 2024 1 commit
-
-
Megha Agarwal authored
-
- 25 Mar, 2024 1 commit
-
-
SangBin Cho authored
-
- 21 Mar, 2024 2 commits
-
-
Woosuk Kwon authored
Co-authored-by: Roy <jasonailu87@gmail.com>
Co-authored-by: Roger Meier <r.meier@siemens.com>
-
Lalit Pradhan authored
-
- 11 Mar, 2024 1 commit
-
-
Zhuohan Li authored
-
- 29 Feb, 2024 1 commit
-
-
Seonghyeon authored
-
- 27 Feb, 2024 1 commit
-
-
Roy authored
-
- 19 Feb, 2024 1 commit
-
-
Isotr0py authored
-
- 14 Feb, 2024 1 commit
-
-
Roy authored
-
- 13 Feb, 2024 3 commits
-
-
Philipp Moritz authored
Co-authored-by: Roy <jasonailu87@gmail.com>
-
Philipp Moritz authored
This reverts commit 5c976a7e.
-
Roy authored
-
- 20 Nov, 2023 1 commit
-
-
Simon Mo authored
-
- 16 Nov, 2023 1 commit
-
-
Megha Agarwal authored
-
- 07 Nov, 2023 1 commit
-
-
GoHomeToMacDonal authored
-
- 06 Nov, 2023 1 commit
-
-
Roy authored
-
- 01 Nov, 2023 1 commit
-
-
Woosuk Kwon authored
-
- 13 Oct, 2023 1 commit
-
-
Lu Wang authored
-