- 20 Jan, 2026 6 commits
-
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Jackmin801 authored
Signed-off-by: Jackmin801 <ongjackm@gmail.com>
Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
- 19 Jan, 2026 24 commits
-
-
Matthew Bonanni authored
[Attention][MLA] Make FLASHINFER_MLA the default MLA backend on Blackwell, and TRTLLM the default prefill (#32615)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Tomas Ruiz authored
Signed-off-by: Tomas Ruiz <tomas.ruiz.te@gmail.com>
-
lon authored
Signed-off-by: lon <114724657+longregen@users.noreply.github.com>
Signed-off-by: Russell Bryant <russell.bryant@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Russell Bryant <russell.bryant@gmail.com>
-
jiahanc authored
Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
-
Yanan Cao authored
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
-
Vadim Gimpelson authored
[BUGFIX] Fix `test_mla_backends.py`. Scale MLA projection weights to prevent numerical instability (#32529)
Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
-
qli88 authored
Signed-off-by: Qiang Li <qiang.li2@amd.com>
Signed-off-by: Matthew Wong <Matthew.Wong2@amd.com>
-
Netanel Haber authored
Signed-off-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
-
Jee Jee Li authored
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
-
danisereb authored
Signed-off-by: Daniel Serebrenik <daserebrenik@nvidia.com>
-
wang.yuqi authored
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
-
Nicolò Lucchesi authored
Add a new metric to track the number of requests whose KV blocks expired. This scenario is important to surface and track because it is a vital indicator of deployment health. Currently we resort to tracking these failures through unstructured log parsing, which depends, among other things, on the exact error string. Log line on current main:
> Releasing expired KV blocks for request cmpl-071d which were retrieved by 0 decode worker(s) within 0 seconds.
Signed-off-by: NickLucche <nlucches@redhat.com>
-
Daniel Mescheder authored
Signed-off-by: Daniel Mescheder <dmesch@amazon.com>
Co-authored-by: Daniel Mescheder <dmesch@amazon.com>
-
Nicolò Lucchesi authored
Signed-off-by: NickLucche <nlucches@redhat.com>
-
Andreas Karatzas authored
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
-
Yuxuan Zhang authored
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Signed-off-by: Yuxuan Zhang <2448370773@qq.com>
-
Matt authored
Signed-off-by: Matthew Wong <Matthew.Wong2@amd.com>
-
Hyunkyun Moon authored
Signed-off-by: HyunKyun Moon <mhg5303@gmail.com>
Signed-off-by: Hyunkyun Moon <mhg5303@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
Alex Brooks authored
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
-
honglyua authored
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Vadim Gimpelson authored
[BUGFIX] Fix degenerate strides in TRTLLM query tensors for FlashInfer backend. Fixes issue #32353 (#32417)
Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
-
- 18 Jan, 2026 10 commits
-
-
Iryna Boiko authored
Signed-off-by: Iryna Boiko <iboiko@habana.ai>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Deming authored
-
Andrey Khalyavin authored
Signed-off-by: Andrey Khalyavin <halyavin@yandex-team.ru>
-
Robert Shaw authored
-
bnellnm authored
Signed-off-by: Bill Nell <bnell@redhat.com>
-
tjp_zju authored
Signed-off-by: tom-zju <tanjianpingzju1990@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
Li Xie authored
Signed-off-by: xieli <xieli@stepfun.com>
-