- 02 Nov, 2024 1 commit
-
-
youkaichao authored
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
-
- 31 Oct, 2024 1 commit
-
-
Roger Wang authored
Signed-off-by: Roger Wang <ywang@roblox.com>
-
- 30 Oct, 2024 1 commit
-
-
Went-Liang authored
Signed-off-by: Went-Liang <wenteng_liang@163.com>
-
- 27 Oct, 2024 1 commit
-
-
madt2709 authored
-
- 24 Oct, 2024 1 commit
-
-
Vinay R Damodaran authored
Signed-off-by: Vinay Damodaran <vrdn@hey.com>
-
- 22 Oct, 2024 1 commit
-
-
Travis Johnson authored
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
-
- 19 Oct, 2024 1 commit
-
-
Joe Runde authored
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
-
- 18 Oct, 2024 1 commit
-
-
Cyrus Leung authored
-
- 17 Oct, 2024 1 commit
-
-
Kuntai Du authored
Remove the block manager v1. This is the first step toward a prefix-caching-centric design: to get there, the code path must be simplified so that only the v2 block manager is used (it performs much better on prefix caching).
-
- 16 Oct, 2024 2 commits
-
-
Russell Bryant authored
Signed-off-by: Russell Bryant <rbryant@redhat.com>
-
Cyrus Leung authored
-
- 11 Oct, 2024 3 commits
-
-
Wallas Henrique authored
Signed-off-by: Wallas Santos <wallashss@ibm.com>
-
Tyler Michael Smith authored
-
Cyrus Leung authored
-
- 07 Oct, 2024 1 commit
-
-
sroy745 authored
-
- 05 Oct, 2024 1 commit
-
-
Chen Zhang authored
Co-authored-by: Roger Wang <ywang@roblox.com>
-
- 04 Oct, 2024 2 commits
-
-
Roger Wang authored
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Michael Goin authored
-
- 03 Oct, 2024 2 commits
-
-
xendo authored
Co-authored-by: Jerzy Zagorski <jzagorsk@amazon.com>
-
sroy745 authored
-
- 02 Oct, 2024 1 commit
-
-
afeldman-nm authored
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
Co-authored-by: Andrew Feldman <afeld2012@gmail.com>
-
- 01 Oct, 2024 1 commit
-
-
Lily Liu authored
-
- 30 Sep, 2024 1 commit
-
-
Sebastian Schoennenbeck authored
-
- 27 Sep, 2024 1 commit
-
-
Varun Sundar Rabindranath authored
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
-
- 23 Sep, 2024 2 commits
-
-
Alexander Matveev authored
-
Alex Brooks authored
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
-
- 17 Sep, 2024 2 commits
-
-
sroy745 authored
-
Alex Brooks authored
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
-
- 12 Sep, 2024 2 commits
-
-
Roger Wang authored
[Hotfix][Core][VLM] Disable chunked prefill by default and prefix caching for multimodal models (#8425)
-
youkaichao authored
-
- 11 Sep, 2024 1 commit
-
-
Aarni Koskela authored
-
- 10 Sep, 2024 1 commit
-
-
Cody Yu authored
[MISC] Keep chunked prefill enabled by default with long context when prefix caching is enabled (#8342)
-
- 07 Sep, 2024 1 commit
-
-
Cyrus Leung authored
-
- 06 Sep, 2024 1 commit
-
-
Patrick von Platen authored
Co-authored-by: Michael Goin <michael@neuralmagic.com>
-
- 04 Sep, 2024 1 commit
-
-
Harsha vardhan manoj Bikki authored
Co-authored-by: Harsha Bikki <harbikh@amazon.com>
-
- 02 Sep, 2024 1 commit
-
-
Isotr0py authored
-
- 30 Aug, 2024 1 commit
-
-
Cyrus Leung authored
-
- 27 Aug, 2024 2 commits
-
-
Patrick von Platen authored
-
Megha Agarwal authored
Co-authored-by: Alexander Matveev <alexm@neuralmagic.com>
-
- 26 Aug, 2024 1 commit
-
-
omrishiv authored
Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>
-