- 28 Jan, 2026 22 commits
-
-
Robert Shaw authored
Signed-off-by:
Robert Shaw <robshaw@redhat.com> Signed-off-by:
Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by:
Robert Shaw <robshaw@redhat.com> Co-authored-by:
Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Chauncey authored
Signed-off-by:chaunceyjiang <chaunceyjiang@gmail.com>
-
Or Ozeri authored
Signed-off-by:
Or Ozeri <oro@il.ibm.com> Co-authored-by:
Kevin H. Luu <khluu000@gmail.com>
-
Robert Shaw authored
Signed-off-by:
Robert Shaw <robshaw@redhat.com> Signed-off-by:
Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com> Co-authored-by:
Robert Shaw <robshaw@redhat.com>
-
Kevin H. Luu authored
Signed-off-by:khluu <khluu000@gmail.com>
-
Kevin H. Luu authored
Signed-off-by:khluu <khluu000@gmail.com>
-
Harry Mellor authored
Signed-off-by:Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Yan Ma authored
-
Maryam Tahhan authored
Signed-off-by:Maryam Tahhan <mtahhan@redhat.com>
-
Gregory Shtrasberg authored
Signed-off-by:Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
-
ramos authored
Signed-off-by:
ramos <49182011+nemoramo@users.noreply.github.com> Signed-off-by:
mayufeng <mayufeng@example.com> Co-authored-by:
mayufeng <mayufeng@example.com>
-
22quinn authored
Signed-off-by:22quinn <33176974+22quinn@users.noreply.github.com>
-
Harry Mellor authored
Signed-off-by:Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Harry Mellor authored
Signed-off-by:Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Jeffrey Wang authored
Signed-off-by:Jeffrey Wang <jeffreywang@anyscale.com>
-
Micah Williamson authored
Signed-off-by:Micah Williamson <micah.williamson@amd.com>
-
Xinan Miao authored
Signed-off-by:
SouthWest7 <am1ao@qq.com> Co-authored-by:
SouthWest7 <am1ao@qq.com>
-
Harry Mellor authored
Signed-off-by:
Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by:
Ashwin Phadke <23502062+ashwin-phadke@users.noreply.github.com>
-
Angela Yi authored
Signed-off-by:angelayi <yiangela7@gmail.com>
-
Kevin H. Luu authored
Signed-off-by:khluu <khluu000@gmail.com>
-
Woosuk Kwon authored
Signed-off-by:
Woosuk Kwon <woosuk@inferact.ai> Signed-off-by:
Woosuk Kwon <woosuk.kwon@berkeley.edu> Co-authored-by:
Nick Hill <nhill@redhat.com>
-
Matthew Bonanni authored
Signed-off-by:
Matthew Bonanni <mbonanni@redhat.com> Co-authored-by:
Claude <noreply@anthropic.com>
-
- 27 Jan, 2026 18 commits
-
-
Richard Zou authored
Signed-off-by:Richard Zou <zou3519@gmail.com>
-
Wentao Ye authored
Signed-off-by:yewentao256 <zhyanwentao@126.com>
-
linhaifeng authored
Signed-off-by:linhaifeng <1371675203@qq.com>
-
Alexei-V-Ivanov-AMD authored
Signed-off-by:
DCCS-4560 <alivanov@chi-mi325x-pod1-112.ord.vultr.cpe.ice.amd.com> Co-authored-by:
DCCS-4560 <alivanov@chi-mi325x-pod1-112.ord.vultr.cpe.ice.amd.com> Co-authored-by:
TJian <tunjian.tan@embeddedllm.com>
-
Matthew Bonanni authored
Signed-off-by:Matthew Bonanni <mbonanni@redhat.com>
-
Iris authored
Signed-off-by:
irisliu10 <601012173@qq.com> Signed-off-by:
Iris <38269816+irisliu10@users.noreply.github.com>
-
Karan Bansal authored
Signed-off-by:
Karan Bansal <karanb192@gmail.com> Signed-off-by:
Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by:
Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
IriKa authored
Support compress-tensors with nvfp4 or fp8 weights and modelopt with nvfp4 weights on Turing (#33076) Signed-off-by:IriKa Qiu <qiujie.jq@gmail.com>
-
Nick Hill authored
Signed-off-by:Nick Hill <nickhill123@gmail.com>
-
danielafrimi authored
Signed-off-by: <dafrimi@nvidia.com> Signed-off-by:
Daniel Afrimi <dafrimi@nvidia.com> Signed-off-by:
root <dafrimi@nvidia.com>
-
danisereb authored
Signed-off-by:
Daniel Serebrenik <daserebrenik@nvidia.com> Co-authored-by:
Jee Jee Li <pandaleefree@gmail.com>
-
wang.yuqi authored
Signed-off-by:
wang.yuqi <yuqi.wang@daocloud.io> Signed-off-by:
wang.yuqi <noooop@126.com>
-
omkhalil authored
Fix UnembedMetrics to correctly count FLOPs for the unembedding (LM head) layer. The bug: UnembedMetrics used total_num_tokens() which counts all tokens in the batch for projection flops, vocab projections are run on just the last token for the autoregressive use case. Co-authored-by:Omar Mohamed Khalil <omarkhalil@meta.com>
-
Nicolò Lucchesi authored
Signed-off-by:NickLucche <nlucches@redhat.com>
-
Matthew Bonanni authored
Signed-off-by:Matthew Bonanni <mbonanni@redhat.com>
-
Nicolò Lucchesi authored
Signed-off-by:NickLucche <nlucches@redhat.com>
-
omerpaz95 authored
Added queries and hits metrics for the Offloading Connector. Also added timing metrics for store and load operations, which take the average time it takes to load/store, per-token. The metrics are available from Prometheus and from the StatLogger. Signed-off-by:
omerpaz95 <omerpaz95@gmail.com> Co-authored-by:
Omer Paz <Omer.Paz@ibm.com>
-
Harry Mellor authored
Signed-off-by:Harry Mellor <19981378+hmellor@users.noreply.github.com>
-