Commit ca683a2a
Authored Oct 14, 2025 by Boyuan Feng
Committed by GitHub, Oct 14, 2025
use combo kernel to fuse qk-norm and qk-rope (#26682)
Signed-off-by: Boyuan Feng <boyuan@meta.com>
parent e9f1b8c9
Showing 1 changed file with 10 additions and 0 deletions
vllm/config/compilation.py
@@ -513,6 +513,16 @@ class CompilationConfig:
         if isinstance(self.pass_config, dict):
             self.pass_config = PassConfig(**self.pass_config)
 
+        if (
+            is_torch_equal_or_newer("2.9.0.dev")
+            and "combo_kernels" not in self.inductor_compile_config
+            and "benchmark_combo_kernel" not in self.inductor_compile_config
+        ):
+            # use horizontal fusion, which is useful for fusing qk-norm and
+            # qk-rope when query and key have different shapes.
+            self.inductor_compile_config["combo_kernels"] = True
+            self.inductor_compile_config["benchmark_combo_kernel"] = True
+
         # migrate the deprecated flags
         if not self.use_cudagraph:
             logger.warning(
 ...
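The guard in this change only fills in defaults: both Inductor flags are set when the installed torch is new enough and the user has supplied neither key, so an explicit user setting is never overridden. The following is a minimal standalone sketch of that behavior, not the vLLM code itself. The helper name `apply_combo_kernel_defaults` and the boolean parameter `torch_supports_combo` are hypothetical; the boolean stands in for `is_torch_equal_or_newer("2.9.0.dev")` so the sketch runs without torch installed.

```python
from typing import Any


def apply_combo_kernel_defaults(
    inductor_compile_config: dict[str, Any],
    torch_supports_combo: bool,
) -> dict[str, Any]:
    """Hypothetical helper mirroring the guard added in this commit.

    `torch_supports_combo` is an assumption standing in for
    is_torch_equal_or_newer("2.9.0.dev").
    """
    # Work on a copy; the commit itself mutates the config in place.
    config = dict(inductor_compile_config)
    user_set_either = (
        "combo_kernels" in config or "benchmark_combo_kernel" in config
    )
    # Enable horizontal fusion only when the user expressed no preference.
    if torch_supports_combo and not user_set_either:
        config["combo_kernels"] = True
        config["benchmark_combo_kernel"] = True
    return config


if __name__ == "__main__":
    # No user setting, new torch: both flags are enabled.
    print(apply_combo_kernel_defaults({}, True))
    # User set one key explicitly: the config is left untouched.
    print(apply_combo_kernel_defaults({"combo_kernels": False}, True))
    # Older torch: nothing is added.
    print(apply_combo_kernel_defaults({}, False))
```

Note that setting either key, even to `False`, is treated as opting out of both defaults, which matches the two `not in` checks in the diff.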