- 17 Dec, 2024 1 commit
-
-
Isotr0py authored
Signed-off-by:Isotr0py <2037008807@qq.com>
-
- 13 Dec, 2024 2 commits
-
-
Jani Monoses authored
-
zhuwenwen authored
-
- 11 Dec, 2024 3 commits
-
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
王敏 authored
-
Tyler Michael Smith authored
Signed-off-by:Tyler Michael Smith <tyler@neuralmagic.com>
-
- 09 Dec, 2024 1 commit
-
-
Konrad Zawora authored
Signed-off-by:Konrad Zawora <kzawora@habana.ai>
-
- 05 Dec, 2024 1 commit
-
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
- 02 Dec, 2024 2 commits
-
-
youkaichao authored
Signed-off-by:youkaichao <youkaichao@gmail.com>
-
Maximilien de Bayser authored
Signed-off-by:Max de Bayser <mbayser@br.ibm.com>
-
- 27 Nov, 2024 2 commits
-
-
王敏 authored
2. Update medusa README 3. Fix benchmark_moe error
-
Kunshang Ji authored
Signed-off-by:Kunshang Ji <kunshang.ji@intel.com>
-
- 23 Nov, 2024 1 commit
-
-
Isotr0py authored
Signed-off-by:Isotr0py <2037008807@qq.com>
-
- 22 Nov, 2024 1 commit
-
-
youkaichao authored
Signed-off-by:youkaichao <youkaichao@gmail.com>
-
- 21 Nov, 2024 1 commit
-
-
Pavani Majety authored
Signed-off-by:Pavani Majety <pmajety@nvidia.com>
-
- 20 Nov, 2024 2 commits
-
-
Woosuk Kwon authored
Signed-off-by:Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Li, Jiang authored
Signed-off-by:jiang1.li <jiang1.li@intel.com>
-
- 18 Nov, 2024 1 commit
-
-
王敏 authored
-
- 11 Nov, 2024 1 commit
-
-
Isotr0py authored
Signed-off-by:Isotr0py <2037008807@qq.com>
-
- 07 Nov, 2024 2 commits
-
-
Nicolò Lucchesi authored
Signed-off-by:NickLucche <nlucches@redhat.com>
-
Yan Ma authored
Signed-off-by:Kunshang Ji <kunshang.ji@intel.com>
Signed-off-by:yan ma <yan.ma@intel.com>
Co-authored-by:Kunshang Ji <kunshang.ji@intel.com>
-
- 06 Nov, 2024 2 commits
-
-
Konrad Zawora authored
Signed-off-by:yuwenzho <yuwen.zhou@intel.com>
Signed-off-by:Chendi.Xue <chendi.xue@intel.com>
Signed-off-by:Bob Zhu <bob.zhu@intel.com>
Signed-off-by:zehao-intel <zehao.huang@intel.com>
Signed-off-by:Konrad Zawora <kzawora@habana.ai>
Co-authored-by:Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by:Sanju C Sudhakaran <scsudhakaran@habana.ai>
Co-authored-by:Michal Adamczyk <madamczyk@habana.ai>
Co-authored-by:Marceli Fylcek <mfylcek@habana.ai>
Co-authored-by:Himangshu Lahkar <49579433+hlahkar@users.noreply.github.com>
Co-authored-by:Vivek Goel <vgoel@habana.ai>
Co-authored-by:yuwenzho <yuwen.zhou@intel.com>
Co-authored-by:Dominika Olszewska <dolszewska@habana.ai>
Co-authored-by:barak goldberg <149692267+bgoldberg-habana@users.noreply.github.com>
Co-authored-by:Michal Szutenberg <37601244+szutenberg@users.noreply.github.com>
Co-authored-by:Jan Kaniecki <jkaniecki@habana.ai>
Co-authored-by:Agata Dobrzyniewicz <160237065+adobrzyniewicz-habana@users.noreply.github.com>
Co-authored-by:Krzysztof Wisniewski <kwisniewski@habana.ai>
Co-authored-by:Dudi Lester <160421192+dudilester@users.noreply.github.com>
Co-authored-by:Ilia Taraban <tarabanil@gmail.com>
Co-authored-by:Chendi.Xue <chendi.xue@intel.com>
Co-authored-by:Michał Kuligowski <mkuligowski@habana.ai>
Co-authored-by:Jakub Maksymczuk <jmaksymczuk@habana.ai>
Co-authored-by:Tomasz Zielinski <85164140+tzielinski-habana@users.noreply.github.com>
Co-authored-by:Sun Choi <schoi@habana.ai>
Co-authored-by:Iryna Boiko <iboiko@habana.ai>
Co-authored-by:Bob Zhu <41610754+czhu15@users.noreply.github.com>
Co-authored-by:hlin99 <73271530+hlin99@users.noreply.github.com>
Co-authored-by:Zehao Huang <zehao.huang@intel.com>
Co-authored-by:Andrzej Kotłowski <Andrzej.Kotlowski@intel.com>
Co-authored-by:Yan Tomsinsky <73292515+Yantom1@users.noreply.github.com>
Co-authored-by:Nir David <ndavid@habana.ai>
Co-authored-by:Yu-Zhou <yu.zhou@intel.com>
Co-authored-by:Ruheena Suhani Shaik <rsshaik@habana.ai>
Co-authored-by:Karol Damaszke <kdamaszke@habana.ai>
Co-authored-by:Marcin Swiniarski <mswiniarski@habana.ai>
Co-authored-by:Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by:Jacek Czaja <jacek.czaja@intel.com>
Co-authored-by:Jacek Czaja <jczaja@habana.ai>
Co-authored-by:Yuan <yuan.zhou@outlook.com>
-
Peter Salas authored
Signed-off-by:Peter Salas <peter@fixie.ai>
-
- 04 Nov, 2024 1 commit
-
-
Yang Zheng authored
Co-authored-by:Yang Zheng(SW)(Alex) <you@example.com>
-
- 02 Nov, 2024 1 commit
-
-
sroy745 authored
-
- 01 Nov, 2024 3 commits
-
-
Peter Salas authored
Signed-off-by:Peter Salas <peter@fixie.ai>
-
Pavani Majety authored
-
youkaichao authored
Signed-off-by:youkaichao <youkaichao@gmail.com>
-
- 31 Oct, 2024 1 commit
-
-
sasha0552 authored
[Bugfix] Fix `illegal memory access` error with chunked prefill, prefix caching, block manager v2 and xformers enabled together (#9532)
Signed-off-by:sasha0552 <admin@sasha0552.org>
-
- 30 Oct, 2024 1 commit
-
-
Elfie Guo authored
-
- 26 Oct, 2024 1 commit
-
-
ErkinSagiroglu authored
-
- 24 Oct, 2024 1 commit
-
-
王敏 authored
-
- 22 Oct, 2024 1 commit
-
-
wangshuai09 authored
-
- 21 Oct, 2024 1 commit
-
-
Thomas Parnell authored
Signed-off-by:Thomas Parnell <tpa@zurich.ibm.com>
-
- 20 Oct, 2024 1 commit
-
-
Chen Zhang authored
-
- 19 Oct, 2024 1 commit
-
-
Thomas Parnell authored
Signed-off-by:Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by:Chih-Chieh Yang <chih.chieh.yang@ibm.com>
Co-authored-by:Cody Yu <hao.yu.cody@gmail.com>
-
- 17 Oct, 2024 2 commits
-
-
Robert Shaw authored
Signed-off-by:Max de Bayser <maxdebayser@gmail.com>
Signed-off-by:Max de Bayser <mbayser@br.ibm.com>
Co-authored-by:Andrew Feldman <afeldman@neuralmagic.com>
Co-authored-by:afeldman-nm <156691304+afeldman-nm@users.noreply.github.com>
Co-authored-by:Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by:laishzh <laishengzhang@gmail.com>
Co-authored-by:Max de Bayser <maxdebayser@gmail.com>
Co-authored-by:Max de Bayser <mbayser@br.ibm.com>
Co-authored-by:Cyrus Leung <tlleungac@connect.ust.hk>
-
Kuntai Du authored
Remove the block manager v1. This is the first piece of the prefix-caching-centric design. To achieve that design, we need to simplify the code path so that only the v2 block manager is used (it has much higher performance on prefix caching).
-
- 14 Oct, 2024 1 commit
-
-
Woosuk Kwon authored
-
- 13 Oct, 2024 1 commit
-
-
Lily Liu authored
-