- 18 Jul, 2025 8 commits
-
shixianc authored
Signed-off-by: Shixian Cui <shixian@amazon.com>
Co-authored-by: Shixian Cui <shixian@amazon.com>
-
Shu Wang authored
Signed-off-by: shuw <shuw@nvidia.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
-
22quinn authored
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
-
Lucas Wilkinson authored
-
Lucia Fang authored
Signed-off-by: Lu Fang <fanglu@fb.com>
-
Ricardo Decal authored
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
-
elvischenv authored
[Bugfix] Fix the tensor non-contiguous issue for Flashinfer TRT-LLM backend attention kernel (#21133)
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
- 17 Jul, 2025 22 commits
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Eric Curtin authored
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
-
Jee Jee Li authored
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
ElizaWszola authored
Signed-off-by: ElizaWszola <ewszola@redhat.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
wangxiyuan authored
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
kYLe authored
Signed-off-by: Kyle Huang <kylhuang@nvidia.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
-
Asher authored
Signed-off-by: Asher Zhang <asherszhang@tencent.com>
-
Varun Sundar Rabindranath authored
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
-
Chauncey authored
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
-
Jee Jee Li authored
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
-
David Ben-David authored
Signed-off-by: David Ben-David <davidb@pliops.com>
Co-authored-by: David Ben-David <davidb@pliops.com>
-
Zhonghua Deng authored
Signed-off-by: Abatom <abzhonghua@gmail.com>
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-
XiongfeiWei authored
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-
Kevin_Xiong authored
Signed-off-by: KevinXiong-C <kevin_xiong1997@outlook.com>
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-
QiliangCui authored
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
-
- 16 Jul, 2025 10 commits
-
Nir David authored
Support FP8 Quantization and Inference Run on Intel Gaudi (HPU) using INC (Intel Neural Compressor) (#12010)
Signed-off-by: Nir David <ndavid@habana.ai>
Signed-off-by: Uri Livne <ulivne@habana.ai>
Co-authored-by: Uri Livne <ulivne@habana.ai>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Avshalom Manevich authored
Signed-off-by: h-avsha <avshalom.manevich@hcompany.ai>
-
Mac Misiura authored
feat - add a new endpoint `get_tokenizer_info` to provide tokenizer/chat-template information (#20575)
Signed-off-by: m-misiura <mmisiura@redhat.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Michael Yao authored
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
-
Seiji Eicher authored
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
-
Chengji Yao authored
Signed-off-by: Chengji Yao <chengjiyao@google.com>
-