- 07 Aug, 2025 6 commits
-
-
Lucas Wilkinson authored
[Attention] Support multiple attention metadata builders per kv_cache_spec + proper local attention no hybrid kv cache fix (#21588)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
-
tc-mb authored
Co-authored-by: imning3 <hbning@pku.edu.cn>
-
Lain authored
Signed-off-by: Siyuan Fu <siyuanf@nvidia.com>
Signed-off-by: Lain <fusiyuan2000@hotmail.com>
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Yongye Zhu <zyy1102000@gmail.com>
-
Yongye Zhu authored
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Asaf Joseph Gardin authored
Signed-off-by: asafg <asafg@ai21.com>
Co-authored-by: asafg <asafg@ai21.com>
-
- 06 Aug, 2025 5 commits
-
-
Yongye Zhu authored
Signed-off-by: simon-mo <xmo@berkeley.edu>
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
Co-authored-by: simon-mo <xmo@berkeley.edu>
-
Chen Zhang authored
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: LiuXiaoxuanPKU <lilyliupku@gmail.com>
Co-authored-by: simon-mo <xmo@berkeley.edu>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Co-authored-by: Minseok Lee <47620120+minseokl@users.noreply.github.com>
Co-authored-by: Yongye Zhu <zyy1102000@gmail.com>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Jee Jee Li authored
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
-
- 05 Aug, 2025 7 commits
-
-
Benji Beck authored
Signed-off-by: Benji Beck <benjibeck@meta.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
-
wang.yuqi authored
Signed-off-by: wang.yuqi <noooop@126.com>
-
ZiTian.Zhao authored
Signed-off-by: zitian zhao <zitian.zhao@tencentmusic.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Yuxuan Zhang authored
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
-
TJian authored
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
-
Po-Han Huang (NVIDIA) authored
-
- 04 Aug, 2025 4 commits
-
-
Raghav Ravishankar authored
Signed-off-by: alyosha-swamy <raghav@arcee.ai>
-
Weixiao Huang authored
Signed-off-by: huangweixiao <huangweixiao@msh.team>
-
Jee Jee Li authored
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
-
Chenxi Yang authored
Co-authored-by: Chenxi Yang <cxyang@meta.com>
-
- 03 Aug, 2025 4 commits
-
-
Yuxuan Zhang authored
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
-
Li, Jiang authored
[CI/Build][Bugfix] Fix Qwen2.5 tests in CPU CI via fallback silu_and_mul to torch native implementation (#22145)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <2037008807@qq.com>
-
jiahanc authored
Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
-
- 02 Aug, 2025 9 commits
-
-
Yan Ma authored
Signed-off-by: yan <yan.ma@intel.com>
Signed-off-by: Yan Ma <yan.ma@intel.com>
-
Chih-Chieh Yang authored
Signed-off-by: Chih-Chieh Yang <7364402+cyang49@users.noreply.github.com>
Signed-off-by: Chih-Chieh-Yang <7364402+cyang49@users.noreply.github.com>
-
Yuxuan Zhang authored
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
-
Chih-Chieh Yang authored
Signed-off-by: Chih-Chieh-Yang <7364402+cyang49@users.noreply.github.com>
-
vllmellm authored
[FEAT][ROCm] Enable running Flash Attention as ViT attn backend for Qwen-VL models on ROCm platform. (#22069)
Signed-off-by: tjtanaavllm <tunjian.tan@amd.com>
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Co-authored-by: tjtanaavllm <tunjian.tan@amd.com>
-
Dipika Sikka authored
Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>
-
Varun Sundar Rabindranath authored
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
-
vllmellm authored
Signed-off-by: kf <kuanfu.liu@embeddedllm.com>
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Co-authored-by: kf <kuanfu.liu@embeddedllm.com>
-
JartX authored
-
- 01 Aug, 2025 5 commits
-
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Varun Sundar Rabindranath authored
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
-
Isotr0py authored
Signed-off-by: Isotr0py <2037008807@qq.com>
-
Dipika Sikka authored
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-