- 05 Jun, 2025 3 commits
- 04 Jun, 2025 1 commit
-
-
zhuwenwen authored
-
- 29 May, 2025 1 commit
-
-
gaoqiong authored
-
- 28 May, 2025 1 commit
-
-
王敏 authored
-
- 27 May, 2025 2 commits
- 23 May, 2025 1 commit
-
-
lizhigong authored
-
- 22 May, 2025 2 commits
- 16 May, 2025 1 commit
-
-
lizhigong authored
-
- 13 May, 2025 1 commit
-
-
zhuwenwen authored
support telechat2 and glm4 nn layout; remove log of request_id
-
- 09 May, 2025 6 commits
- 08 May, 2025 1 commit
-
-
zhuwenwen authored
-
- 30 Apr, 2025 2 commits
- 28 Apr, 2025 5 commits
-
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-
Simon Mo authored
Signed-off-by: simon-mo <xmo@berkeley.edu>
-
Charlie Fu authored
Signed-off-by: charlifu <charlifu@amd.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 26 Apr, 2025 5 commits
-
-
Woosuk Kwon authored
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Kero Liang authored
Signed-off-by: imkero <kerorek@outlook.com>
-
Agata Dobrzyniewicz authored
Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
-
Charlie Fu authored
Signed-off-by: charlifu <charlifu@amd.com>
-
rasmith authored
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
-
- 25 Apr, 2025 5 commits
-
-
rasmith authored
[Quantization][FP8] Add support for FP8 models with input_scale for output projection and QK quantization (#15734)
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Co-authored-by: Luka Govedič <lgovedic@redhat.com>
-
gaoqiong authored
-
yexin(叶鑫) authored
[Perf] Optimize rotary_emb implementation to use Triton operator for improved inference performance (#16457)
Signed-off-by: cynthieye <yexin93@qq.com>
Co-authored-by: MagnetoWang <magnetowang@outlook.com>
-
Mengqing Cao authored
Signed-off-by: Mengqing Cao <cmq0113@163.com>
-
vllmellm authored
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
-
- 23 Apr, 2025 1 commit
-
-
yangql authored
-
- 22 Apr, 2025 2 commits
-
-
gaoqiong authored
-
Lei Wang authored
Signed-off-by: xinyuxiao <xinyuxiao2024@gmail.com>
Co-authored-by: xinyuxiao <xinyuxiao2024@gmail.com>
-