- 27 Aug, 2024 1 commit
-
-
Patrick von Platen authored
-
- 22 Aug, 2024 2 commits
-
-
Abhinav Goyal authored
-
zifeitong authored
-
- 21 Aug, 2024 1 commit
-
-
Peter Salas authored
-
- 16 Aug, 2024 1 commit
-
-
Michael Goin authored
-
- 13 Aug, 2024 1 commit
-
-
Cyrus Leung authored
-
- 07 Aug, 2024 1 commit
-
-
Cyrus Leung authored
Co-authored-by: Roger Wang <ywang@roblox.com>
-
- 05 Aug, 2024 1 commit
-
-
Isotr0py authored
Co-authored-by: Michael Goin <michael@neuralmagic.com>
-
- 03 Aug, 2024 1 commit
-
-
Robert Shaw authored
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
Co-authored-by: Joe Runde <Joseph.Runde@ibm.com>
Co-authored-by: Joe Runde <joe@joerun.de>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
-
- 02 Aug, 2024 2 commits
-
-
youkaichao authored
-
Woosuk Kwon authored
-
- 31 Jul, 2024 1 commit
-
-
Cyrus Leung authored
-
- 29 Jul, 2024 1 commit
-
-
Isotr0py authored
Co-authored-by: Roger Wang <ywang@roblox.com>
-
- 26 Jul, 2024 1 commit
-
-
Michael Goin authored
-
- 23 Jul, 2024 2 commits
-
-
Roger Wang authored
-
Roger Wang authored
-
- 22 Jul, 2024 2 commits
-
-
Jiaxin Shan authored
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
-
Roger Wang authored
-
- 20 Jul, 2024 1 commit
-
-
Antoni Baum authored
-
- 18 Jul, 2024 1 commit
-
-
Nick Hill authored
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
-
- 11 Jul, 2024 1 commit
-
-
Robert Shaw authored
Co-authored-by: Zifei Tong <zifeitong@gmail.com>
-
- 10 Jul, 2024 1 commit
-
-
Abhinav Goyal authored
-
- 03 Jul, 2024 1 commit
-
-
Cyrus Leung authored
Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com>
Co-authored-by: Xiaowei Jiang <xwjiang2010@gmail.com>
Co-authored-by: ywang96 <ywang@roblox.com>
Co-authored-by: xwjiang2010 <87673679+xwjiang2010@users.noreply.github.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
-
- 02 Jul, 2024 1 commit
-
-
xwjiang2010 authored
Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
-
- 01 Jul, 2024 2 commits
-
-
Thomas Parnell authored
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Joshua Rosenkranz <jmrosenk@us.ibm.com>
-
Antoni Baum authored
-
- 29 Jun, 2024 1 commit
-
-
Cyrus Leung authored
-
- 28 Jun, 2024 1 commit
-
-
Cyrus Leung authored
Co-authored-by: ywang96 <ywang@roblox.com>
-
- 27 Jun, 2024 1 commit
-
-
Nick Hill authored
-
- 25 Jun, 2024 1 commit
-
-
Antoni Baum authored
-
- 21 Jun, 2024 1 commit
-
-
Joshua Rosenkranz authored
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Davis Wertheimer <Davis.Wertheimer@ibm.com>
-
- 15 Jun, 2024 1 commit
-
-
Cyrus Leung authored
-
- 11 Jun, 2024 2 commits
- 06 Jun, 2024 1 commit
-
-
liuyhwangyh authored
Co-authored-by: mulin.lyh <mulin.lyh@taobao.com>
-
- 03 Jun, 2024 1 commit
-
-
Cyrus Leung authored
-
- 25 May, 2024 1 commit
-
-
Eric Xihui Lin authored
Co-authored-by: beagleski <yunanzhang@microsoft.com>
Co-authored-by: bapatra <bapatra@microsoft.com>
Co-authored-by: Barun Patra <codedecde@users.noreply.github.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
-
- 22 May, 2024 1 commit
-
-
sasha0552 authored
-
- 18 May, 2024 1 commit
-
-
SangBin Cho authored
Currently, the rotary embedding kernel must be called once per LoRA, which makes it hard to serve multiple long-context LoRAs. This commit adds a batched rotary embedding kernel and pipes it through. It replaces the rotary embedding layer with one that is aware of multiple cos-sin caches, one per scaling factor. Follow-up to https://github.com/vllm-project/vllm/pull/3095/files
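
The idea in the description above can be illustrated with a small pure-Python sketch. It is not the kernel from this commit: the function names, the linear position scaling (`position / scaling_factor`), and the Neox-style pairing of elements are all assumptions made for illustration. It only shows how concatenating one cos-sin cache per scaling factor, plus a per-token offset into that cache, lets a single pass handle tokens that belong to different LoRAs.

```python
import math


def build_cos_sin_cache(rot_dim, max_pos, base, scaling_factor):
    """Return one (cos, sin) row per position; linear scaling is an assumption."""
    inv_freq = [base ** (-2.0 * i / rot_dim) for i in range(rot_dim // 2)]
    rows = []
    for pos in range(max_pos):
        t = pos / scaling_factor
        rows.append(([math.cos(t * f) for f in inv_freq],
                     [math.sin(t * f) for f in inv_freq]))
    return rows


def build_batched_cache(rot_dim, max_pos, base, scaling_factors):
    """Concatenate the per-factor caches and record where each one starts."""
    cache, offsets = [], {}
    for factor in scaling_factors:
        offsets[factor] = len(cache)
        cache.extend(
            build_cos_sin_cache(rot_dim, int(max_pos * factor), base, factor))
    return cache, offsets


def rotate(vec, cos, sin):
    """Rotate element i together with element i + half (Neox-style, assumed)."""
    half = len(vec) // 2
    x1, x2 = vec[:half], vec[half:]
    out1 = [a * c - b * s for a, b, c, s in zip(x1, x2, cos, sin)]
    out2 = [b * c + a * s for a, b, c, s in zip(x1, x2, cos, sin)]
    return out1 + out2


def batched_rotary_embedding(vectors, positions, cache_offsets, cache):
    """One pass over the batch; each token selects its cache via its offset."""
    return [rotate(vec, *cache[offset + pos])
            for vec, pos, offset in zip(vectors, positions, cache_offsets)]


# Hypothetical batch: two tokens use factor 1.0, one token uses factor 2.0.
cache, offsets = build_batched_cache(4, 8, 10000.0, [1.0, 2.0])
query = [1.0, 2.0, 3.0, 4.0]
rotated = batched_rotary_embedding(
    [query, query, query],
    positions=[0, 2, 4],
    cache_offsets=[offsets[1.0], offsets[1.0], offsets[2.0]],
    cache=cache,
)
```

In this sketch the per-token offset is the only thing that distinguishes one scaling factor from another, so the batch does not need to be split by LoRA before the rotation is applied.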
-
- 09 May, 2024 1 commit
-
-
Hao Zhang authored
Co-authored-by: Dash Desai <1723932+iamontheinet@users.noreply.github.com>
Co-authored-by: Aurick Qiao <qiao@aurick.net>
Co-authored-by: Aurick Qiao <aurick.qiao@snowflake.com>
Co-authored-by: Aurick Qiao <aurickq@users.noreply.github.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
-