- 10 Mar, 2026 20 commits
-
-
Jiangyun Zhu authored
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
-
Alvin Tang authored
Signed-off-by: gambletan <ethanchang32@gmail.com>
Co-authored-by: gambletan <ethanchang32@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
-
wang.yuqi authored
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
-
SoluMilken authored
Signed-off-by: SoluMilken <ypiheyn.imm02g@g2.nctu.edu.tw>
-
Raushan Turganbay authored
Signed-off-by: raushan <raushan@huggingface.co>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Mark McLoughlin authored
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Vadim Gimpelson authored
Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
-
Chang Su authored
feat(grpc): extract gRPC servicer into smg-grpc-servicer package, add --grpc flag to vllm serve (#36169)
Signed-off-by: Chang Su <chang.s.su@oracle.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
-
Nick Hill authored
Signed-off-by: Nick Hill <nickhill123@gmail.com>
-
Nick Hill authored
Signed-off-by: Nick Hill <nickhill123@gmail.com>
-
amirkl94 authored
Signed-off-by: Amir Klein <203507526+amirkl94@users.noreply.github.com>
-
hallerite authored
[Bugfix] Fix `RuntimeError: Already borrowed` that degrades VLM serving throughput under concurrent load. (#36557)
Signed-off-by: hallerite <hallerite@users.noreply.github.com>
Signed-off-by: hallerite <git@hallerite.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Zhuohan Li authored
-
Wentao Ye authored
[Perf] Compute maxsim in worker side, reducing redundant copies, 2.7% E2E throughput improvement (#36159)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
-
Hojin Yang authored
Signed-off-by: effortprogrammer <yhjhoward7@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
-
Ajay Anubolu authored
Signed-off-by: AjAnubolu <anuboluajay@gmail.com>
-
Andreas Karatzas authored
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
-
- 09 Mar, 2026 20 commits
-
-
youkaichao authored
Signed-off-by: youkaichao <youkaichao@gmail.com>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
-
Shaun Kotek authored
Signed-off-by: Shaun Kotek - Nvidia <skotek@nvidia.com>
Co-authored-by: root <root@gpu-259.slurm-workers-slurm.slurm.svc.cluster.local>
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
Micah Williamson authored
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
-
Lucas Kabela authored
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
-
Simon Mo authored
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
-
Taneem Ibrahim authored
[Misc] Refactored 5 duplicate helper functions that were copied-pasted across multiple parsers (#36436)
Signed-off-by: Taneem Ibrahim <taneem.ibrahim@gmail.com>
-
Copilot authored
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
-
Shaun Kotek authored
Signed-off-by: Shaun Kotek - Nvidia <skotek@nvidia.com>
Signed-off-by: Natan Bagrov <nbagrov@nvidia.com>
Signed-off-by: Daniel Serebrenik <daserebrenik@nvidia.com>
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Signed-off-by: liweiguang <codingpunk@gmail.com>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: Alex Brooks <albrooks@redhat.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: cong-or <conchubhar.gannon@gmail.com>
Signed-off-by: Tushar Shetty <tushar.shetty@abbyy.com>
Signed-off-by: Tushar Shetty <54362365+tusharshetty61@users.noreply.github.com>
Signed-off-by: jiang1.li <jiang1.li@intel.com>
Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
Signed-off-by: Xin Yang <xyangx@amazon.com>
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: nvnbagrov <nbagrov@nvidia.com>
Co-authored-by: Sage <80211083+sagearc@users.noreply.github.com>
Co-authored-by: danisereb <daserebrenik@nvidia.com>
Co-authored-by: Jiangyun Zhu <riverclouds.zhu@qq.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Weiguang Li <codingpunk@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>
Co-authored-by: Alex Brooks <albrooks@redhat.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: cong-or <conchubhar.gannon@gmail.com>
Co-authored-by: Tushar Shetty <54362365+tusharshetty61@users.noreply.github.com>
Co-authored-by: liuzhenwei <zhenwei.liu@intel.com>
Co-authored-by: Xin Yang <105740670+xyang16@users.noreply.github.com>
Co-authored-by: Kevin H. Luu <khluu000@gmail.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
Russell Bryant authored
Signed-off-by: Russell Bryant <rbryant@redhat.com>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
-
Andreas Karatzas authored
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
-
Andreas Karatzas authored
[ROCm][CI] Fix ROCm attention backend validation for head sizes, block sizes, and compute capability checks (#36292)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
-
SoluMilken authored
Signed-off-by: SoluMilken <ypiheyn.imm02g@g2.nctu.edu.tw>
-
Roberto L. Castro authored
[Attention][Perf][Kernel] Replace torch.cat with vectorized CUDA kernel MLA query concat - DeepSeek-V3.2 (#34917)
Signed-off-by: LopezCastroRoberto <rocastro@redhat.com>
Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com>
-
Roberto L. Castro authored
Signed-off-by: LopezCastroRoberto <rocastro@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
-
Taoyu Zhu authored
Signed-off-by: zhutaoyu <zhutaoyu97@gmail.com>
-