- 29 Apr, 2025 2 commits
-
-
Qiming Zhang authored
Signed-off-by: mayuyuace <qiming1.zhang@intel.com>
-
a2q1p authored
-
- 28 Apr, 2025 5 commits
-
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-
Simon Mo authored
Signed-off-by: simon-mo <xmo@berkeley.edu>
-
Charlie Fu authored
Signed-off-by: charlifu <charlifu@amd.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 26 Apr, 2025 5 commits
-
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Kero Liang authored
Signed-off-by: imkero <kerorek@outlook.com>
-
Agata Dobrzyniewicz authored
Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
-
Charlie Fu authored
Signed-off-by: charlifu <charlifu@amd.com>
-
rasmith authored
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
-
- 25 Apr, 2025 4 commits
-
-
rasmith authored
[Quantization][FP8] Add support for FP8 models with input_scale for output projection and QK quantization (#15734)
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Co-authored-by: Luka Govedič <lgovedic@redhat.com>
-
yexin(叶鑫) authored
[Perf]Optimize rotary_emb implementation to use Triton operator for improved inference performance (#16457)
Signed-off-by: cynthieye <yexin93@qq.com>
Co-authored-by: MagnetoWang <magnetowang@outlook.com>
-
Mengqing Cao authored
Signed-off-by: Mengqing Cao <cmq0113@163.com>
-
vllmellm authored
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
-
- 22 Apr, 2025 4 commits
-
-
Lei Wang authored
Signed-off-by: xinyuxiao <xinyuxiao2024@gmail.com>
Co-authored-by: xinyuxiao <xinyuxiao2024@gmail.com>
-
Charlie Fu authored
Signed-off-by: charlifu <charlifu@amd.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
-
Varun Sundar Rabindranath authored
Signed-off-by: varun sundar rabindranath <vsundarr@redhat.com>
Co-authored-by: varun sundar rabindranath <vsundarr@redhat.com>
-
kliuae authored
Signed-off-by: kliuae <kuanfu.liu@embeddedllm.com>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: vllmellm <vllm.ellm@embeddedllm.com>
-
- 21 Apr, 2025 1 commit
-
-
Chanh Nguyen authored
Signed-off-by: Chanh Nguyen <cnguyen@linkedin.com>
Co-authored-by: Chanh Nguyen <cnguyen@linkedin.com>
-
- 19 Apr, 2025 2 commits
-
-
Divakar Verma authored
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
-
Yang Fan authored
Signed-off-by: fyabc <suyang.fy@alibaba-inc.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Xiong Wang <wangxiongts@163.com>
-
- 18 Apr, 2025 2 commits
-
-
wang.yuqi authored
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
-
- 17 Apr, 2025 2 commits
-
-
Sijia(Jackson) Chen authored
-
Ximingwang-09 authored
Signed-off-by: ximing.wxm <ximing.wxm@antgroup.com>
Co-authored-by: ximing.wxm <ximing.wxm@antgroup.com>
-
- 15 Apr, 2025 2 commits
-
-
Dipika Sikka authored
-
Jinzhen Lin authored
Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
-
- 13 Apr, 2025 1 commit
-
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-
- 12 Apr, 2025 1 commit
-
-
wang.yuqi authored
-
- 11 Apr, 2025 3 commits
-
-
Yong Hoon Shin authored
-
Michael Goin authored
[Kernel] Support W8A8 channel-wise weights and per-token activations in triton fused_moe_kernel (#16366)
Signed-off-by: mgoin <mgoin64@gmail.com>
-
chaow-amd authored
Signed-off-by: chaow <chaow@amd.com>
-
- 10 Apr, 2025 2 commits
-
-
Chih-Chieh Yang authored
Signed-off-by: Chih-Chieh-Yang <7364402+cyang49@users.noreply.github.com>
Co-authored-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-
- 09 Apr, 2025 2 commits
-
-
yihong authored
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
-
TJian authored
[Bug] [ROCm] Fix Llama 4 Enablement Bug on ROCm: V0 ROCmFlashAttentionImpl and Triton Fused MoE bugs (#16198)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Signed-off-by: kliuae <kuanfu.liu@embeddedllm.com>
Co-authored-by: Hongxia Yang <hongxia.yang@amd.com>
Co-authored-by: kliuae <kuanfu.liu@embeddedllm.com>
-
- 08 Apr, 2025 2 commits
-
-
Isotr0py authored
Signed-off-by: Isotr0py <2037008807@qq.com>
-
zxfan-cpu authored
-