- 09 Apr, 2026 29 commits
-
-
Andreas Karatzas authored
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
-
Lucas Kabela authored
[Performance Improvement] Update `batched_count_greater_than` to handle batch size 1 without recompile (#38933)
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Richard Zou authored
Signed-off-by: Richard Zou <zou3519@gmail.com>
-
lalit10 authored
Signed-off-by: Lalit Laxminarayan Bangad <lalitbangad@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Andrii Skliar authored
Signed-off-by: Andrii Skliar <askliar@nvidia.com>
Co-authored-by: Andrii Skliar <askliar@nvidia.com>
-
Nick Hill authored
Signed-off-by: Nick Hill <nickhill123@gmail.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
wang.yuqi authored
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
-
Shengqi Chen authored
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
-
Qidong Su authored
Signed-off-by: Qidong Su <soodoshll@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
-
Andrew Barnes authored
Signed-off-by: Bortlesboat <bortstheboat@gmail.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Zhewen Li authored
-
sihao_li authored
Signed-off-by: sihao.li <sihao.li@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
-
Khairul Kabir authored
Signed-off-by: khairulkabir1661 <khairulkabir1661@users.noreply.github.com>
Co-authored-by: khairulkabir1661 <khairulkabir1661@users.noreply.github.com>
-
Yongye Zhu authored
-
Chendi.Xue authored
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
noobHappylife authored
Signed-off-by: noobhappylife <aratar1991@hotmail.com>
Co-authored-by: OpenAI Codex <codex@openai.com>
-
Ilya Boytsov authored
Signed-off-by: Ilya Boytsov <ilyaboytsov1805@gmail.com>
-
Wei Zhao authored
Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
-
Ajay Anubolu authored
Signed-off-by: AjAnubolu <anuboluajay@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
-
Dipika Sikka authored
Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
-
liuzhenwei authored
Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
Signed-off-by: Kunshang Ji <jikunshang95@gmail.com>
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Kunshang Ji <jikunshang95@gmail.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
-
Maral authored
[W8A8 Block Linear Refactor][2/N] Remove W8A8Fp8BlockLinearOp and adopt Fp8 block linear kernel selections. (#33892)
Signed-off-by: maral <maralbahari.98@gmail.com>
Signed-off-by: Maral <maralbahari.98@gmail.com>
-
Benjamin Chislett authored
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
Signed-off-by: Benjamin Chislett <chislett.ben@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
- 08 Apr, 2026 11 commits
-
-
Richard Zou authored
Signed-off-by: Richard Zou <zou3519@gmail.com>
-
Kai Song authored
Signed-off-by: Song Kai <songkai05@baidu.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
triangleXIV authored
[BugFix] --max-model-len=-1 causes over-limit requests to hang and starve the entire service (#39102)
Signed-off-by: triangle14 <y1019026570@gmail.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
-
Rishi Puri authored
Signed-off-by: Rishi Puri <riship@nvidia.com>
Signed-off-by: Rishi Puri <puririshi98@berkeley.edu>
Signed-off-by: sfeng33 <4florafeng@gmail.com>
Co-authored-by: Benjamin Chislett <chislett.ben@gmail.com>
Co-authored-by: Flora Feng <4florafeng@gmail.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Jackmin801 authored
Signed-off-by: Jackmin801 <ongjackm@gmail.com>
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
-
Ben Browning authored
Signed-off-by: Ben Browning <bbrownin@redhat.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
-
Lain authored
Signed-off-by: Siyuan Fu <siyuanf@nvidia.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Lucas Wilkinson <lwilkins@redhat.com>
-
Roberto L. Castro authored
[Perf][Kernel] Persistent TopK scheduler: unified CUDAGraph-safe kernel with dynamic per-row dispatch - DeepSeek-V3.2 DSA decode (#37421)
Signed-off-by: LopezCastroRoberto <rocastro@redhat.com>
Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
-
Shengqi Chen authored
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Co-authored-by: Jason Li <jasonlizhengjian@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
-