- 06 Nov, 2024 2 commits
-
-
Konrad Zawora authored
Signed-off-by: yuwenzho <yuwen.zhou@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Bob Zhu <bob.zhu@intel.com>
Signed-off-by: zehao-intel <zehao.huang@intel.com>
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Sanju C Sudhakaran <scsudhakaran@habana.ai>
Co-authored-by: Michal Adamczyk <madamczyk@habana.ai>
Co-authored-by: Marceli Fylcek <mfylcek@habana.ai>
Co-authored-by: Himangshu Lahkar <49579433+hlahkar@users.noreply.github.com>
Co-authored-by: Vivek Goel <vgoel@habana.ai>
Co-authored-by: yuwenzho <yuwen.zhou@intel.com>
Co-authored-by: Dominika Olszewska <dolszewska@habana.ai>
Co-authored-by: barak goldberg <149692267+bgoldberg-habana@users.noreply.github.com>
Co-authored-by: Michal Szutenberg <37601244+szutenberg@users.noreply.github.com>
Co-authored-by: Jan Kaniecki <jkaniecki@habana.ai>
Co-authored-by: Agata Dobrzyniewicz <160237065+adobrzyniewicz-habana@users.noreply.github.com>
Co-authored-by: Krzysztof Wisniewski <kwisniewski@habana.ai>
Co-authored-by: Dudi Lester <160421192+dudilester@users.noreply.github.com>
Co-authored-by: Ilia Taraban <tarabanil@gmail.com>
Co-authored-by: Chendi.Xue <chendi.xue@intel.com>
Co-authored-by: Michał Kuligowski <mkuligowski@habana.ai>
Co-authored-by: Jakub Maksymczuk <jmaksymczuk@habana.ai>
Co-authored-by: Tomasz Zielinski <85164140+tzielinski-habana@users.noreply.github.com>
Co-authored-by: Sun Choi <schoi@habana.ai>
Co-authored-by: Iryna Boiko <iboiko@habana.ai>
Co-authored-by: Bob Zhu <41610754+czhu15@users.noreply.github.com>
Co-authored-by: hlin99 <73271530+hlin99@users.noreply.github.com>
Co-authored-by: Zehao Huang <zehao.huang@intel.com>
Co-authored-by: Andrzej Kotłowski <Andrzej.Kotlowski@intel.com>
Co-authored-by: Yan Tomsinsky <73292515+Yantom1@users.noreply.github.com>
Co-authored-by: Nir David <ndavid@habana.ai>
Co-authored-by: Yu-Zhou <yu.zhou@intel.com>
Co-authored-by: Ruheena Suhani Shaik <rsshaik@habana.ai>
Co-authored-by: Karol Damaszke <kdamaszke@habana.ai>
Co-authored-by: Marcin Swiniarski <mswiniarski@habana.ai>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>
Co-authored-by: Jacek Czaja <jczaja@habana.ai>
Co-authored-by: Yuan <yuan.zhou@outlook.com>
-
Aaron Pham authored
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
-
- 17 Oct, 2024 1 commit
-
-
Luka Govedič authored
-
- 16 Oct, 2024 1 commit
-
-
Cyrus Leung authored
-
- 12 Sep, 2024 1 commit
-
-
Woosuk Kwon authored
-
- 11 Sep, 2024 1 commit
-
-
Yang Fan authored
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 05 Sep, 2024 1 commit
-
-
Woosuk Kwon authored
-
- 30 Aug, 2024 1 commit
-
-
Wenxiang authored
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Zeqi Lin <zelin@microsoft.com>
Co-authored-by: Zeqi Lin <Zeqi.Lin@microsoft.com>
-
- 19 Aug, 2024 2 commits
-
-
Woosuk Kwon authored
-
Woosuk Kwon authored
-
- 18 Aug, 2024 1 commit
-
-
Woosuk Kwon authored
-
- 16 Aug, 2024 1 commit
-
-
Michael Goin authored
-
- 13 Aug, 2024 1 commit
-
-
youkaichao authored
-
- 26 Jul, 2024 1 commit
-
-
Michael Goin authored
-
- 23 Jul, 2024 1 commit
-
-
Woosuk Kwon authored
-
- 19 Jul, 2024 1 commit
-
-
Simon Mo authored
-
- 28 Jun, 2024 1 commit
-
-
wangding zeng authored
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
-
- 27 Jun, 2024 1 commit
-
-
Woosuk Kwon authored
-
- 19 Jun, 2024 1 commit
-
-
Shukant Pal authored
-
- 17 Jun, 2024 2 commits
-
-
Kunshang Ji authored
Co-authored-by: Jiang Li <jiang1.li@intel.com>
Co-authored-by: Abhilash Majumder <abhilash.majumder@intel.com>
Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
-
Amit Garg authored
-
- 12 Jun, 2024 1 commit
-
-
Woosuk Kwon authored
-
- 05 Jun, 2024 1 commit
-
-
Woosuk Kwon authored
-
- 18 May, 2024 1 commit
-
-
SangBin Cho authored
Currently we need to call the rotary embedding kernel once per LoRA, which makes it hard to serve multiple long-context LoRAs. This adds a batched rotary embedding kernel and pipes it through. It replaces the rotary embedding layer with one that is aware of multiple cos-sin caches, one per scaling factor. Follow-up to https://github.com/vllm-project/vllm/pull/3095/files
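The idea in this description can be illustrated with a small pure-Python sketch. It is not the kernel from the commit: the function names (`build_cos_sin_cache`, `build_batched_cache`, `batched_rotary`), the linear position scaling, and the interleaved pair layout are all assumptions chosen for illustration. It shows one cos-sin cache per scaling factor concatenated into a single table, so that tokens belonging to different LoRAs can be rotated in one pass by adding a per-token row offset to the position.

```python
import math

def build_cos_sin_cache(rot_dim, max_pos, base=10000.0, scaling_factor=1.0):
    """One (cos, sin) row per position, assuming linear position scaling."""
    inv_freq = [base ** (-2.0 * i / rot_dim) for i in range(rot_dim // 2)]
    cache = []
    for pos in range(int(max_pos * scaling_factor)):
        t = pos / scaling_factor  # scaled position
        cache.append(([math.cos(t * f) for f in inv_freq],
                      [math.sin(t * f) for f in inv_freq]))
    return cache

def build_batched_cache(rot_dim, max_pos, scaling_factors):
    """Concatenate the per-factor caches; record where each one starts."""
    table, offsets = [], {}
    for factor in scaling_factors:
        offsets[factor] = len(table)
        table.extend(build_cos_sin_cache(rot_dim, max_pos, scaling_factor=factor))
    return table, offsets

def rotate(vec, cos, sin):
    """Rotate consecutive pairs (vec[2i], vec[2i+1]) by the i-th angle."""
    out = []
    for i in range(len(vec) // 2):
        x, y = vec[2 * i], vec[2 * i + 1]
        out += [x * cos[i] - y * sin[i], x * sin[i] + y * cos[i]]
    return out

def batched_rotary(vectors, positions, token_offsets, table):
    """Single pass over all tokens; each token picks its cache via its offset."""
    return [rotate(v, *table[off + pos])
            for v, pos, off in zip(vectors, positions, token_offsets)]

# Two tokens from (hypothetical) LoRAs with scaling factors 1.0 and 2.0.
table, offsets = build_batched_cache(rot_dim=4, max_pos=8, scaling_factors=[1.0, 2.0])
queries = [[1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 1.0]]
rotated = batched_rotary(queries, [1, 2], [offsets[1.0], offsets[2.0]], table)
```

In this sketch, position 2 under scaling factor 2.0 maps to the same angle as position 1 under factor 1.0, so both rotated vectors coincide. A real kernel would do the same offset-plus-position lookup on device tensors rather than Python lists.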
-
- 17 May, 2024 1 commit
-
-
Jinzhen Lin authored
-
- 08 May, 2024 1 commit
-
-
SangBin Cho authored
-
- 01 May, 2024 1 commit
-
-
Jee Li authored
-
- 29 Apr, 2024 1 commit
-
-
SangBin Cho authored
-
- 25 Apr, 2024 1 commit
-
-
Caio Mendes authored
-
- 23 Apr, 2024 1 commit
-
-
SangBin Cho authored
-
- 12 Apr, 2024 1 commit
-
-
Michael Feil authored
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
-
- 11 Apr, 2024 1 commit
-
-
Kunshang Ji authored
-
- 25 Mar, 2024 1 commit
-
-
Kunshang Ji authored
-
- 13 Mar, 2024 2 commits
-
-
Antoni Baum authored
-
Terry authored
-
- 23 Feb, 2024 1 commit
-
-
Woosuk Kwon authored
-
- 22 Feb, 2024 1 commit
-
-
44670 authored
-
- 01 Feb, 2024 1 commit
-
-
Kunshang Ji authored
Co-authored-by: Jiang Li <jiang1.li@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
-
- 03 Dec, 2023 1 commit
-
-
Woosuk Kwon authored
-
- 30 Nov, 2023 1 commit
-
-
Roy authored
-