- 16 Nov, 2025 3 commits
-
-
Nick Hill authored
Signed-off-by: Nick Hill <nhill@redhat.com>
-
Lucia Fang authored
Signed-off-by: Lu Fang <fanglu@fb.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Signed-off-by: Lucia Fang <fanglu@fb.com>
Signed-off-by: Lucia Fang <116399278+luccafong@users.noreply.github.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
-
wang.yuqi authored
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
-
- 15 Nov, 2025 9 commits
-
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
-
Didier Durand authored
Signed-off-by: Didier Durand <durand.didier@gmail.com>
-
Eldar Kurtić authored
Signed-off-by: Eldar Kurtic <8884008+eldarkurtic@users.noreply.github.com>
-
tingtinggithub authored
Signed-off-by: tingtinggithub <streamttt@gmail.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Zhuohan Li authored
Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>
-
Cyrus Leung authored
Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
-
Nick Hill authored
-
Jialin Ouyang authored
[Core] Performance: Use list[np.ndarray] instead of list[list[int]] for output tokens for GC optimization (#26368)
Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
-
- 14 Nov, 2025 12 commits
-
-
rasmith authored
Signed-off-by: Randall Smith <ransmith@amd.com>
Co-authored-by: Randall Smith <ransmith@amd.com>
-
Laith Sakka authored
Signed-off-by: Laith Sakka <lsakka@meta.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Nicolò Lucchesi authored
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Nicolò Lucchesi authored
Signed-off-by: NickLucche <nlucches@redhat.com>
-
Lucas Wilkinson authored
-
Yong Hoon Shin authored
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
-
Jingchun Gao authored
Signed-off-by: gaojc <1055866782@qq.com>
Signed-off-by: Jingchun Gao <gaojingchun1@huawei.com>
Signed-off-by: Jingchun Gao <63247409+gjc0824@users.noreply.github.com>
Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
Co-authored-by: gaojingchun (A) <g00955623@china.huawei.com>
Co-authored-by: Jingchun Gao <gaojingchun1@huawei.com>
Co-authored-by: QiuChunshuo <qiuchunshuo@huawei.com>
-
lyn610 authored
Add tracking and periodic logging for the number of preempted requests in the metrics logger. This helps monitor system behavior under load.
Signed-off-by: Yining Liu <610lyn@gmail.com>
-
Nick Hill authored
Signed-off-by: Nick Hill <nhill@redhat.com>
-
Yan Ma authored
Signed-off-by: Yan Ma <yan.ma@intel.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
- 13 Nov, 2025 12 commits
-
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Qiu authored
Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
elvischenv authored
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
-
Yannick Schnider authored
Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com>
Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
-
Huamin Li authored
Signed-off-by: Huamin Li <3ericli@gmail.com>
-
Pleaplusone authored
Signed-off-by: ganyi <ygan@amd.com>
-
tjandy98 authored
Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>
-
Pleaplusone authored
Signed-off-by: ganyi <ygan@amd.com>
-
Jialin Ouyang authored
Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-
- 12 Nov, 2025 4 commits
-
-
Wei Wei authored
Signed-off-by: Wei Wei <wwei6@meta.com>
-
Andy Lo authored
Signed-off-by: Andy Lo <andy@mistral.ai>
-
alberto authored
Signed-off-by: Alberto Perdomo <aperdomo@redhat.com>
Signed-off-by: alberto <aperdomo@redhat.com>
Co-authored-by: Or Ozeri <or@ozery.com>
-
Benjamin Chislett authored
[Perf] Refactor cudagraph_support to enable full CUDA graphs for spec decoding with FlashInfer (#28479)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
-