- 31 Jul, 2025 1 commit
-
-
zhuwenwen authored
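Add W4A8-related changes to the fused MoE file. fix: correct the W8A8 config read path error and delete int8_utils.py. fix: fix the W8A8 INT8 config read issue. Update the W4A8 and W8A8 quantization 092 interface.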
-
- 02 Jul, 2025 1 commit
-
-
bnellnm authored
Signed-off-by: Bill Nell <bnell@redhat.com>
Signed-off-by: ElizaWszola <ewszola@redhat.com>
Co-authored-by: ElizaWszola <ewszola@redhat.com>
-
- 13 Jun, 2025 1 commit
-
-
gaoqiong authored
-
- 03 Jun, 2025 1 commit
-
-
Simon Mo authored
Signed-off-by: simon-mo <simon.mo@hey.com>
-
- 14 May, 2025 1 commit
-
-
bnellnm authored
-
- 24 Apr, 2025 1 commit
-
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-
- 23 Apr, 2025 1 commit
-
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-
- 11 Apr, 2025 1 commit
-
-
Michael Goin authored
[Kernel] Support W8A8 channel-wise weights and per-token activations in triton fused_moe_kernel (#16366)
Signed-off-by: mgoin <mgoin64@gmail.com>
-