- 21 Nov, 2025 2 commits
-
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Wentao Ye authored
[Feature] Shared Experts Overlap with FI deepgemm swap kernel, 2.2% throughput improvement and 3.6% TTFT improvement (#28879)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
- 20 Nov, 2025 3 commits
-
-
Anna Shors authored
Signed-off-by: ashors1 <ashors@nvidia.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Shengliang Xu authored
Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>
-
- 19 Nov, 2025 16 commits
-
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
JartX authored
-
Max Hu authored
Signed-off-by: Max Hu <hyoung2991@gmail.com>
-
Yongye Zhu authored
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
-
Shu Wang authored
Signed-off-by: Shu Wang. <shuw@nvidia.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Qiu authored
Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
Signed-off-by: FENP <yuanyongjie.yyj@antgroup.com>
Signed-off-by: LookAround <lixushi@huawei.com>
Signed-off-by: Jingchun Gao <gaojingchun1@huawei.com>
Signed-off-by: zhenwenqi2024 <zhenwenqi_2022@qq.com>
Co-authored-by: FENP <yuanyongjie.yyj@antgroup.com>
Co-authored-by: LookAround <lixushi@huawei.com>
Co-authored-by: Jingchun Gao <gaojingchun1@huawei.com>
Co-authored-by: zhenwenqi2024 <zhenwenqi_2022@qq.com>
Co-authored-by: Jingchun Gao <63247409+gjc0824@users.noreply.github.com>
-
杰兮 authored
Signed-off-by: zhyajie <yajizhan@amd.com>
Co-authored-by: zhyajie <yajizhan@amd.com>
-
Robert Shaw authored
Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Shanshan Shen authored
[Model][Mamba] Add selector for mamba attention backend and make it pluggable for other device (#26487)
Signed-off-by: shen-shanshan <467638484@qq.com>
-
Chen Bruce authored
Signed-off-by: bruceszchen <bruceszchen@tencent.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Xin Yang authored
Signed-off-by: Xin Yang <xyangx@amazon.com>
-
Li, Jiang authored
Signed-off-by: jiang1.li <jiang1.li@intel.com>
-
tomeras91 authored
[Hybrid][torch.compile] Refactor mamba2 forward to avoid obscuring linear projections under custom op (#28587)
Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>
-
Varun Sundar Rabindranath authored
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
-
- 18 Nov, 2025 5 commits
-
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
Luciano Martins authored
Signed-off-by: Luciano Martins <lucianommartins@users.noreply.github.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Luciano Martins <lucianommartins@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
Canlin Guo authored
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
xuebwang-amd authored
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
- 17 Nov, 2025 2 commits
-
-
Zhewen Li authored
Signed-off-by: zhewenli <zhewenli@meta.com>
-
jiahanc authored
Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
-
- 16 Nov, 2025 1 commit
-
-
amirkl94 authored
-
- 15 Nov, 2025 2 commits
-
-
Zhewen Li authored
Signed-off-by: Zhewen Li <zhewenli@meta.com>
-
Varun Sundar Rabindranath authored
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
-
- 14 Nov, 2025 8 commits
-
-
Thomas Parnell authored
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
-
Alexander Matveev authored
[Bugfix] Fix incorrect use of hidden_states for shared_experts due to do_naive_dispatch_combine (#28740)
Signed-off-by: Alexander Matveev <amatveev@redhat.com>
-
Andrey Khalyavin authored
Signed-off-by: Andrey Khalyavin <halyavin@yandex-team.ru>
-
TJian authored
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
-
Duncan Moss authored
Signed-off-by: Duncan Moss <djm.moss@gmail.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
-
Shanshan Shen authored
Signed-off-by: shen-shanshan <467638484@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
haoyangli-amd authored
Signed-off-by: Haoyang Li <lihaoyang0109@gmail.com>
-
Hank_ authored
Signed-off-by: Hank <hcc.mayday@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
-
- 13 Nov, 2025 1 commit
-
-
Varun Sundar Rabindranath authored
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
-