- 11 Nov, 2024 1 commit
-
-
Yangcheng Li authored
-
- 07 Nov, 2024 1 commit
-
-
Nicolò Lucchesi authored
Signed-off-by: NickLucche <nlucches@redhat.com>
-
- 06 Nov, 2024 2 commits
-
-
Konrad Zawora authored
Signed-off-by: yuwenzho <yuwen.zhou@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Bob Zhu <bob.zhu@intel.com>
Signed-off-by: zehao-intel <zehao.huang@intel.com>
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Sanju C Sudhakaran <scsudhakaran@habana.ai>
Co-authored-by: Michal Adamczyk <madamczyk@habana.ai>
Co-authored-by: Marceli Fylcek <mfylcek@habana.ai>
Co-authored-by: Himangshu Lahkar <49579433+hlahkar@users.noreply.github.com>
Co-authored-by: Vivek Goel <vgoel@habana.ai>
Co-authored-by: yuwenzho <yuwen.zhou@intel.com>
Co-authored-by: Dominika Olszewska <dolszewska@habana.ai>
Co-authored-by: barak goldberg <149692267+bgoldberg-habana@users.noreply.github.com>
Co-authored-by: Michal Szutenberg <37601244+szutenberg@users.noreply.github.com>
Co-authored-by: Jan Kaniecki <jkaniecki@habana.ai>
Co-authored-by: Agata Dobrzynie...
-
Aaron Pham authored
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
-
- 01 Nov, 2024 2 commits
-
-
Peter Salas authored
Signed-off-by: Peter Salas <peter@fixie.ai>
-
André Jonasson authored
Signed-off-by: André Jonasson <andre.jonasson@gmail.com>
-
- 24 Oct, 2024 1 commit
-
-
youkaichao authored
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
-
- 22 Oct, 2024 1 commit
-
-
Kuntai Du authored
-
- 18 Oct, 2024 1 commit
-
-
Cyrus Leung authored
-
- 17 Oct, 2024 1 commit
-
-
Kuntai Du authored
Remove block manager v1. This is the first step toward a prefix-caching-centric design: to get there, the code path must be simplified so that only the v2 block manager is used, which performs much better on prefix caching.
-
- 16 Oct, 2024 1 commit
-
-
Russell Bryant authored
Signed-off-by: Russell Bryant <rbryant@redhat.com>
-
- 11 Oct, 2024 3 commits
-
-
homeffjy authored
-
Tyler Michael Smith authored
-
youkaichao authored
Co-authored-by: Brendan Wong <bjwpokemon@gmail.com>
-
- 08 Oct, 2024 1 commit
-
-
Alex Brooks authored
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
-
- 07 Oct, 2024 3 commits
-
-
youkaichao authored
-
youkaichao authored
-
sroy745 authored
-
- 02 Oct, 2024 1 commit
-
-
afeldman-nm authored
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
Co-authored-by: Andrew Feldman <afeld2012@gmail.com>
-
- 29 Sep, 2024 2 commits
- 27 Sep, 2024 1 commit
-
-
Varun Sundar Rabindranath authored
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
-
- 25 Sep, 2024 2 commits
-
-
Woo-Yeon Lee authored
-
Archit Patke authored
-
- 18 Sep, 2024 1 commit
-
-
Aaron Pham authored
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
-
- 08 Sep, 2024 1 commit
-
-
Alexander Matveev authored
-
- 02 Sep, 2024 1 commit
-
-
wang.yuqi authored
[Bugfix] Fix #7592 vllm 0.5.4 enable_chunked_prefill throughput is slightly lower than 0.5.3~0.5.0. (#7874)
-
- 29 Aug, 2024 1 commit
-
-
Alexander Matveev authored
-
- 28 Aug, 2024 3 commits
-
-
Cody Yu authored
-
Alexander Matveev authored
-
youkaichao authored
-
- 27 Aug, 2024 2 commits
-
-
Jonathan Berkhahn authored
-
Megha Agarwal authored
Co-authored-by: Alexander Matveev <alexm@neuralmagic.com>
-
- 26 Aug, 2024 1 commit
-
-
Cody Yu authored
-
- 19 Aug, 2024 2 commits
-
-
Cody Yu authored
-
SangBin Cho authored
-
- 16 Aug, 2024 1 commit
-
-
Mahesh Keralapura authored
[Core] Fix tracking of model forward time to the span traces in case of PP>1 (#7440)
-
- 14 Aug, 2024 1 commit
-
-
William Lin authored
-
- 09 Aug, 2024 2 commits
-
-
Cade Daniel authored
-
Mahesh Keralapura authored
-