- 27 Jan, 2026 20 commits
-
-
Nicolò Lucchesi authored
Signed-off-by:NickLucche <nlucches@redhat.com>
-
Matthew Bonanni authored
Signed-off-by:Matthew Bonanni <mbonanni@redhat.com>
-
Nicolò Lucchesi authored
Signed-off-by:NickLucche <nlucches@redhat.com>
-
omerpaz95 authored
Added queries and hits metrics for the Offloading Connector. Also added timing metrics for store and load operations, which take the average time it takes to load/store, per-token. The metrics are available from Prometheus and from the StatLogger. Signed-off-by:
omerpaz95 <omerpaz95@gmail.com> Co-authored-by:
Omer Paz <Omer.Paz@ibm.com>
-
Harry Mellor authored
Signed-off-by:Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
wang.yuqi authored
Signed-off-by:
wang.yuqi <yuqi.wang@daocloud.io> Signed-off-by:
wang.yuqi <noooop@126.com> Co-authored-by:
Cyrus Leung <cyrus.tl.leung@gmail.com>
-
Lifan Shen authored
Signed-off-by:Lifan Shen <lifans@meta.com>
-
rasmith authored
Signed-off-by:
Randall Smith <ransmith@amd.com> Signed-off-by:
Randall Smith <Randall.Smith@amd.com> Co-authored-by:
Randall Smith <ransmith@amd.com>
-
Roger Wang authored
Signed-off-by:
wanglinian <wanglinian@stu.pku.edu.cn> Signed-off-by:
wangln19 <96399074+wangln19@users.noreply.github.com> Signed-off-by:
Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by:
youkaichao <youkaichao@gmail.com> Signed-off-by:
Roger Wang <hey@rogerw.io> Co-authored-by:
wanglinian <wanglinian@stu.pku.edu.cn> Co-authored-by:
wangln19 <96399074+wangln19@users.noreply.github.com> Co-authored-by:
Zaida Zhou <58739961+zhouzaida@users.noreply.github.com> Co-authored-by:
Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by:
Nick Hill <nickhill123@gmail.com> Co-authored-by:
youkaichao <youkaichao@gmail.com> Co-authored-by:
gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
Andreas Karatzas authored
Signed-off-by:Andreas Karatzas <akaratza@amd.com>
-
Ning Xie authored
Signed-off-by:Andy Xie <andy.xning@gmail.com>
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
Richard Zou authored
Signed-off-by:Richard Zou <zou3519@gmail.com>
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
Paco Xu authored
Signed-off-by:Paco Xu <paco.xu@daocloud.io>
-
Vincent Gimenes authored
[DOC]: Add warning about max_num_batched_tokens and max_model_len when chunked prefill is disabled (#33109) Signed-off-by:Vincent Gimenes <147169146+VincentG1234@users.noreply.github.com>
-
Strahinja Stamenkovic authored
Signed-off-by:sstamenk <strahinja.stamenkovic@amd.com>
-
wangln19 authored
Signed-off-by:
wanglinian <wanglinian@stu.pku.edu.cn> Signed-off-by:
wangln19 <96399074+wangln19@users.noreply.github.com> Signed-off-by:
Roger Wang <hey@rogerw.io> Co-authored-by:
Roger Wang <hey@rogerw.io> Co-authored-by:
Isotr0py <2037008807@qq.com>
-
Robert Shaw authored
Signed-off-by:
Robert Shaw <robshaw@redhat.com> Signed-off-by:
Amir Klein <203507526+amirkl94@users.noreply.github.com> Co-authored-by:
Robert Shaw <robshaw@redhat.com> Co-authored-by:
amirkl94 <203507526+amirkl94@users.noreply.github.com>
-
Woosuk Kwon authored
Signed-off-by:
Woosuk Kwon <woosuk@inferact.ai> Signed-off-by:
Woosuk Kwon <woosuk.kwon@berkeley.edu> Co-authored-by:
Nick Hill <nhill@redhat.com>
-
- 26 Jan, 2026 20 commits
-
-
XiongfeiWei authored
Signed-off-by:
Xiongfei Wei <isaacwxf23@gmail.com> Signed-off-by:
Robert Shaw <robshaw@redhat.com> Co-authored-by:
Robert Shaw <robshaw@redhat.com> Co-authored-by:
Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
-
Pengchao Wang authored
Signed-off-by:Pengchao Wang <wpc@fb.com>
-
dolpm authored
Signed-off-by:dolpm <34420038+dolpm@users.noreply.github.com>
-
Jared Wen authored
Add a new CLI option --disable-access-log-for-endpoints to suppress uvicorn access logs for specified endpoints (e.g., /health, /metrics, /ping). This addresses the need to reduce log noise in production environments where health check endpoints are frequently polled by load balancers or monitoring systems, generating excessive log entries that obscure meaningful request logs. Fixes #29982 Signed-off-by:JaredforReal <w13431838023@gmail.com>
-
Wentao Ye authored
Signed-off-by:yewentao256 <zhyanwentao@126.com>
-
Kevin H. Luu authored
Signed-off-by:
Kevin H. Luu <khluu000@gmail.com> Signed-off-by:
khluu <khluu000@gmail.com> Co-authored-by:
Kevin Luu <khluu@Kevins-MacBook-Pro.local>
-
Robert Shaw authored
Signed-off-by:
Robert Shaw <robshaw@redhat.com> Co-authored-by:
Robert Shaw <robshaw@redhat.com>
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
Nicolò Lucchesi authored
Signed-off-by:NickLucche <nlucches@redhat.com>
-
danielafrimi authored
Signed-off-by:
dafrimi <dafrimi@nvidia.com> Co-authored-by:
root <root@gpu-51.slurm-workers-slurm.slurm.svc.cluster.local>
-
Andy Lo authored
Signed-off-by:Andy Lo <andy@mistral.ai>
-
Chauncey authored
Signed-off-by:chaunceyjiang <chaunceyjiang@gmail.com>
-
Pleaplusone authored
Signed-off-by:ganyi <ygan@amd.com>
-
Chauncey authored
Signed-off-by:chaunceyjiang <chaunceyjiang@gmail.com>
-
Yuxuan Zhang authored
Signed-off-by:
zRzRzRzRzRzRzR <2448370773@qq.com> Signed-off-by:
Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by:
Isotr0py <mozf@mail2.sysu.edu.cn>
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
danisereb authored
Signed-off-by:
Daniel Serebrenik <daserebrenik@nvidia.com> Co-authored-by:
Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
-
VihaanThat authored
-
Alex Brooks authored
Signed-off-by:Alex-Brooks <Alex.Brooks@ibm.com>
-
Itay Etelis authored
Signed-off-by:
Itay Etelis <itay.etelis@ibm.com> Co-authored-by:
Itay Etelis <itay.etelis@ibm.com>
-