- 26 Aug, 2025 17 commits
-
Huy Do authored
Signed-off-by: Huy Do <huydhn@gmail.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Roger Wang authored
Signed-off-by: Roger Wang <hey@rogerw.me>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.me>
Co-authored-by: knlnguyen1802 <knlnguyen1802@gmail.com>
-
Jee Jee Li authored
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
-
Raghavan authored
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Jiangyun Zhu authored
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Bin Jia authored
Signed-off-by: jiabin.00 <jiabin.00@bytedance.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Zijing Liu authored
[Disagg][Perf] Use CUDA event sync instead of blocking `tolist` to avoid unintentional copy ops blocking across different CUDA streams, improving disagg TTIT/TTFT (#22760)
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>
Signed-off-by: Zijing Liu <liuzijing2014@users.noreply.github.com>
-
Copilot authored
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
weiliang authored
Signed-off-by: Siyuan Fu <siyuanf@nvidia.com>
Signed-off-by: siyuanf <siyuanf@nvidia.com>
Signed-off-by: Weiliang Liu <weiliangl@nvidia.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Siyuan Fu <siyuanf@nvidia.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-
- 25 Aug, 2025 23 commits
-
Simon Mo authored
Signed-off-by: simon-mo <simon.mo@hey.com>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
-
Terrence Zhao authored
Signed-off-by: Terrencezzj <terrence@cohere.ai>
Signed-off-by: Abatom <abzhonghua@gmail.com>
Co-authored-by: Zhonghua Deng <abzhonghua@gmail.com>
-
Pate Motter authored
Signed-off-by: Pate Motter <patemotter@google.com>
-
Chaojun Zhang authored
Signed-off-by: chzhang <chaojun.zhang@intel.com>
-
Zhonghua Deng authored
[Bugfix][V1][P/D]Fix the issue where repeated requests for the same input produce abnormal outputs for P2pNcclConnector (#23403)
Signed-off-by: Abatom <abzhonghua@gmail.com>
-
Xin Yang authored
Signed-off-by: Xin Yang <xyangx@amazon.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
-
22quinn authored
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
Isotr0py authored
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
Driss Guessous authored
Signed-off-by: drisspg <drisspguessous@gmail.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Ayush Satyam authored
Signed-off-by: Ayush Satyam <ayushsatyam146@gmail.com>
-
youkaichao authored
Signed-off-by: youkaichao <youkaichao@gmail.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Breno Baldas Skuk authored
Signed-off-by: breno.skuk <breno.skuk@hcompany.ai>
-
ZiTian Zhao authored
Signed-off-by: zitian.zhao <zitian.zhao@tencentmusic.com>
-
Chenguang Zheng authored
[Core][Multimodal] Track encode cache entries by mm_hash and enable embedding sharing between requests (#22711)
Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: knlnguyen1802 <knlnguyen1802@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Yu Guo authored
Signed-off-by: Yu Guo <yuguo@meta.com>
-
LIYIFAN_liyifan authored
[Bugfix] Fix Dense module loading for sentence-transformers embedding models (simplified V2) (#23408)
Signed-off-by: FFFfff1FFFfff <yifanli0919@gmail.com>
-
Benji Beck authored
Signed-off-by: Benji Beck <benjibeck@meta.com>
-