- 10 Feb, 2026 1 commit
-
-
lixh authored
-
- 05 Feb, 2026 1 commit
-
-
zhuwenwen authored
-
- 04 Feb, 2026 13 commits
-
-
zhuwenwen authored
-
zhuwenwen authored
-
zhuwenwen authored
-
zhuwenwen authored
-
zhuwenwen authored
-
zhuwenwen authored
-
zhuwenwen authored
-
zhuwenwen authored
-
zhuwenwen authored
-
zhuwenwen authored
-
Nick Hill authored
Signed-off-by: Nick Hill <nickhill123@gmail.com>
-
Michael Goin authored
Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>
-
Michael Goin authored
[Bugfix] Disable RoutingMethodType.[Renormalize,RenormalizeNaive] TRTLLM per-tensor FP8 MoE (#33620)
Signed-off-by: mgoin <mgoin64@gmail.com>
(cherry picked from commit e346e2d0)
Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>
-
- 03 Feb, 2026 13 commits
-
-
zhuwenwen authored
-
zhuwenwen authored
-
zhuwenwen authored
-
zhuwenwen authored
-
zhuwenwen authored
-
zhuwenwen authored
-
Richard Zou authored
[torch.compile] Don't do the fast moe cold start optimization if there is speculative decoding (#33624)
Signed-off-by: Richard Zou <zou3519@gmail.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
(cherry picked from commit 5eac9a1b)
-
Richard Zou authored
Signed-off-by: Richard Zou <zou3519@gmail.com>
(cherry picked from commit d9aa39a3)
-
Kiersten Stokes authored
Signed-off-by: kiersten-stokes <kierstenstokes@gmail.com>
(cherry picked from commit 9e138cb0)
-
zaristei2 authored
Signed-off-by: Zachary Aristei <zaristei@nvidia.com>
Co-authored-by: Zachary Aristei <zaristei@nvidia.com>
-
zaristei2 authored
Signed-off-by: Zachary Aristei <zaristei@nvidia.com>
Co-authored-by: Zachary Aristei <zaristei@nvidia.com>
-
zhuwenwen authored
-
Zhewen Li authored
Signed-off-by: zhewenli <zhewen@inferact.ai>
Co-authored-by: zhewenli <zhewen@inferact.ai>
-
- 02 Feb, 2026 12 commits
-
-
Zhewen Li authored
Signed-off-by: zhewenli <zhewen@inferact.ai>
Co-authored-by: zhewenli <zhewen@inferact.ai>
-
Yifan Qiao authored
Signed-off-by: Yifan Qiao <yifanqiao@berkeley.edu>
(cherry picked from commit a01ef3fa)
-
Robert Shaw authored
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
(cherry picked from commit 318b1207)
-
csy0225 authored
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: i-zhangmingming <i-zhangmingming@stepfun.com>
Co-authored-by: xiewuxun <xiewuxun@stepfun.com>
Co-authored-by: zetaohong <i-hongzetao@stepfun.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
(cherry picked from commit c3b40dc3)
-
Greg Pereira authored
Signed-off-by: greg pereira <grpereir@redhat.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
(cherry picked from commit d6416fdd)
-
René Honig authored
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
(cherry picked from commit 07978117)
-
Luka Govedič authored
[fix][torch.compile] Fix cold-start compilation time increase by adding kv cache update to splitting ops (#33441)
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Co-authored-by: Richard Zou <zou3519@gmail.com>
(cherry picked from commit 15f40b20)
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
(cherry picked from commit 0a3c71e7)
-
Gregory Shtrasberg authored
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
(cherry picked from commit 31aedfe7)
-
Kevin H. Luu authored
Signed-off-by: khluu <khluu000@gmail.com>
(cherry picked from commit 2284461d)
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
(cherry picked from commit bfb9bdaf)
-
wang.yuqi authored
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
(cherry picked from commit abb34ac4)
-