OpenDAS / vllm_cscc
Commit c0c98b8b (Unverified)
Authored Apr 17, 2026 by Maral; committed by GitHub, Apr 17, 2026
[Bugfix] Add Marlin kernel in block scaled mm kernel selection. (#40105)
Signed-off-by: maral <maralbahari.98@gmail.com>
parent 8d2cff81
Showing 1 changed file with 16 additions and 2 deletions

vllm/model_executor/kernels/linear/__init__.py (+16 -2)
View file @ c0c98b8b
@@ -186,12 +186,13 @@ _POSSIBLE_FP8_KERNELS: dict[PlatformEnum, list[type[FP8ScaledMMLinearKernel]]] =
 # in priority/performance order (when available)
 _POSSIBLE_FP8_BLOCK_KERNELS: dict[
-    PlatformEnum, list[type[Fp8BlockScaledMMLinearKernel]]
+    PlatformEnum, list[type[Fp8BlockScaledMMLinearKernel | FP8ScaledMMLinearKernel]]
 ] = {
     PlatformEnum.CUDA: [
         FlashInferFp8DeepGEMMDynamicBlockScaledKernel,
         DeepGemmFp8BlockScaledMMKernel,
         CutlassFp8BlockScaledMMKernel,
+        MarlinFP8ScaledMMLinearKernel,
         TritonFp8BlockScaledMMKernel,
     ],
     PlatformEnum.ROCM: [
@@ -392,6 +393,19 @@ def init_fp8_linear_kernel(
             scope="global",
         )
+        # TODO make scaled_mm kernels inherit from MMLinearKernel
+        # only MarlinFP8ScaledMMLinearKernel is a type of FP8ScaledMMLinearKernel.
+        if issubclass(kernel_type, FP8ScaledMMLinearKernel):
+            return kernel_type(
+                scaled_mm_linear_kernel_config,
+                layer_param_names=[
+                    "weight",
+                    "weight_scale",
+                    "input_scale",
+                    "input_scale_ub",
+                ],
+            )
         return kernel_type(
             scaled_mm_linear_kernel_config,
         )
@@ -399,7 +413,7 @@ def init_fp8_linear_kernel(
     else:
         kernel_type = choose_scaled_mm_linear_kernel(
             config=scaled_mm_linear_kernel_config,
-            possible_kernels=_POSSIBLE_FP8_KERNELS,  # type: ignore[misc]
+            possible_kernels=_POSSIBLE_FP8_KERNELS,  # type: ignore[arg-type]
             force_kernel=force_kernel,
         )
     if module_name:
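
The pattern this commit relies on can be sketched in isolation: a per-platform list of kernel classes in priority order, a selector that returns the first class able to handle the config, and an `issubclass` check that picks the right constructor call for classes from a different base. The sketch below is not vLLM code. `Platform`, `KernelConfig`, `BlockKernel`, `ScaledKernel`, the three concrete kernel classes, `can_implement`, `choose_kernel`, and `build_kernel` are all hypothetical stand-ins, and the real selection function likely checks more conditions than shown here.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum, auto


class Platform(Enum):
    """Hypothetical stand-in for the PlatformEnum used in the diff."""
    CUDA = auto()
    ROCM = auto()


@dataclass(frozen=True)
class KernelConfig:
    """Hypothetical config; carries only what this sketch needs."""
    has_fast_gemm: bool


class BlockKernel:
    """Stand-in base class: constructed from the config alone."""

    def __init__(self, config: KernelConfig) -> None:
        self.config = config

    @classmethod
    def can_implement(cls, config: KernelConfig) -> bool:
        return True


class ScaledKernel:
    """Stand-in base class whose constructor also needs parameter names."""

    def __init__(self, config: KernelConfig, layer_param_names: list[str]) -> None:
        self.config = config
        self.layer_param_names = layer_param_names

    @classmethod
    def can_implement(cls, config: KernelConfig) -> bool:
        return True


class FastBlockKernel(BlockKernel):
    @classmethod
    def can_implement(cls, config: KernelConfig) -> bool:
        # Only usable when the (assumed) fast GEMM backend is present.
        return config.has_fast_gemm


class MarlinLikeKernel(ScaledKernel):
    """Plays the role of the kernel added to the block-scaled list."""


class FallbackBlockKernel(BlockKernel):
    """Always available, so it sits last in the priority list."""


# Priority order: earlier entries are preferred when they can run.
POSSIBLE_KERNELS: dict[Platform, list[type[BlockKernel | ScaledKernel]]] = {
    Platform.CUDA: [FastBlockKernel, MarlinLikeKernel, FallbackBlockKernel],
}


def choose_kernel(config: KernelConfig, platform: Platform):
    """Return the first kernel class in priority order that accepts the config."""
    for kernel_type in POSSIBLE_KERNELS.get(platform, []):
        if kernel_type.can_implement(config):
            return kernel_type
    raise ValueError(f"no kernel available for {platform.name}")


def build_kernel(config: KernelConfig, platform: Platform):
    """Instantiate the chosen class, branching on which base it derives from."""
    kernel_type = choose_kernel(config, platform)
    if issubclass(kernel_type, ScaledKernel):
        # This base takes an extra argument, so it needs its own call.
        return kernel_type(
            config,
            layer_param_names=[
                "weight",
                "weight_scale",
                "input_scale",
                "input_scale_ub",
            ],
        )
    return kernel_type(config)
```

In this sketch, `build_kernel(KernelConfig(has_fast_gemm=False), Platform.CUDA)` skips `FastBlockKernel` and constructs `MarlinLikeKernel` with the four parameter names. Widening the list's element type to a union is what makes the mixed list type-check, and the `issubclass` branch is the price of the two bases not yet sharing a constructor signature, which the TODO in the diff proposes to resolve.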