- 27 Apr, 2025 7 commits
-
Alex Brooks authored
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
-
Flex Wang authored
[Misc] Change buckets of histogram_iteration_tokens to [1, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8096] to represent number of tokens (#17033)
Signed-off-by: sfc-gh-zhwang <flex.wang@snowflake.com>
-
Jade Zheng authored
Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>
-
Chen Zhang authored
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
-
Jee Jee Li authored
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
rasmith authored
[Kernel][Triton][FP8] Adding fp8 and variable length sequence support to Triton FAv2 kernel (#12591)
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
- 26 Apr, 2025 21 commits
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Kero Liang authored
Signed-off-by: imkero <kerorek@outlook.com>
-
Ning Xie authored
-
Lu Fang authored
Signed-off-by: Lu Fang <lufang@fb.com>
-
changjun.lee authored
[Bugfix] fix error due to an uninitialized tokenizer when using `skip_tokenizer_init` with `num_scheduler_steps` (#9276)
Signed-off-by: changjun.lee <pord7457@gmail.com>
-
Aaron Pham authored
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
-
Ning Xie authored
Signed-off-by: Andy Xie <andy.xning@gmail.com>
-
Russell Bryant authored
-
Agata Dobrzyniewicz authored
Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
-
Nick Hill authored
Signed-off-by: Nick Hill <nhill@redhat.com>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Nick Hill authored
Signed-off-by: Nick Hill <nhill@redhat.com>
-
Zijing Liu authored
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
-
Charlie Fu authored
Signed-off-by: charlifu <charlifu@amd.com>
-
Shu Wang authored
Signed-off-by: shuw <shuw@nvidia.com>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
James Wu authored
Signed-off-by: James Wu <jjwu@meta.com>
-
Yihua Cheng authored
-
rasmith authored
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
-
Chen Zhang authored
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
- 25 Apr, 2025 12 commits
-
Benjamin Chislett authored
Signed-off-by: Bryan Lu <yuzhelu@amazon.com>
Signed-off-by: Benjamin Chislett <benjamin.chislett@centml.ai>
Co-authored-by: Bryan Lu <yuzhelu@amazon.com>
-
Nick Hill authored
Signed-off-by: Nick Hill <nhill@redhat.com>
-
Christian Heimes authored
Signed-off-by: Christian Heimes <christian@python.org>
-
Daniel Li authored
-
Russell Bryant authored
Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
Co-authored-by: Shangming Cai <caishangming@linux.alibaba.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Jasmond L authored
Signed-off-by: Jasmond Loh <Jasmond.Loh@hotmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Alex Brooks authored
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
-
Lu Fang authored
-
rasmith authored
[Quantization][FP8] Add support for FP8 models with input_scale for output projection and QK quantization (#15734)
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Co-authored-by: Luka Govedič <lgovedic@redhat.com>
-
Sangyeon Cho authored
Signed-off-by: csy1204 <josang1204@gmail.com>
Co-authored-by: 조상연[플레이스 AI] <sang-yeon.cho@navercorp.com>