OpenDAS / vllm_cscc / Commits / e3a3e4db
Commit e3a3e4db (Unverified)
Authored Jun 19, 2025 by qli88. Committed by GitHub, Jun 20, 2025.
[Bugfix] Enable PP with AITER+V1 (#19822)
Signed-off-by: Qiang Li <qiang.li2@amd.com>
parent e41bf15c
Changes: 2
Showing 2 changed files with 3 additions and 11 deletions (+3 -11)
vllm/model_executor/layers/layernorm.py (+0 -1)
vllm/v1/attention/backends/mla/rocm_aiter_mla.py (+3 -10)
vllm/model_executor/layers/layernorm.py @ e3a3e4db
...
@@ -45,7 +45,6 @@ def fused_add_rms_norm(
def rocm_aiter_rms_norm(x: torch.Tensor, weight: torch.Tensor,
                        variance_epsilon: float) -> torch.Tensor:
    import aiter as rocm_aiter
    if x.dim() > 2:
        x_original_shape = x.shape
...
vllm/v1/attention/backends/mla/rocm_aiter_mla.py @ e3a3e4db
...
@@ -201,16 +201,9 @@ class AiterMLAImpl(MLACommonImpl[AiterMLAMetadata]):
        kv_buffer = kv_c_and_k_pe_cache.unsqueeze(2)
        if self.num_heads == 16:
            # AITER MLA decode kernel only supports
            # max_seqlen_q=1 when using 16 heads.
            # max_seqlen_qo must be 1 except for MTP
            # TODO: Find the best value for MTP
            max_seqlen_qo = 1
        else:
            # AITER MLA decode Kernel handles arbitrary
            # max_seqlen_q values when using 128 heads.
            assert attn_metadata.prefill is not None
            max_seqlen_qo = attn_metadata.prefill.max_query_len
        aiter_mla_decode_fwd(q, kv_buffer, o, self.scale,
                             attn_metadata.decode.qo_indptr, max_seqlen_qo,
                             attn_metadata.decode.paged_kv_indptr,
...
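The hunk above contains two ways of choosing max_seqlen_qo for the decode call: a branch on num_heads that reads attn_metadata.prefill, and a fixed value of 1. The sketch below is only an illustration of how those two choices behave when a batch carries no prefill metadata. It is not the vLLM code: PrefillMeta, AttnMeta, and both helper functions are hypothetical stand-ins, and the assumption that a decode-only batch has prefill set to None is ours, not something the commit page states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PrefillMeta:
    """Hypothetical stand-in for attn_metadata.prefill."""
    max_query_len: int


@dataclass
class AttnMeta:
    """Hypothetical stand-in for the attention metadata object."""
    prefill: Optional[PrefillMeta] = None


def max_seqlen_qo_branching(num_heads: int, meta: AttnMeta) -> int:
    """Branching choice, mirroring the if/else shown in the hunk."""
    if num_heads == 16:
        return 1
    # This path depends on prefill metadata being present.
    assert meta.prefill is not None
    return meta.prefill.max_query_len


def max_seqlen_qo_fixed() -> int:
    """Fixed choice: always 1, independent of prefill metadata."""
    return 1


# Assumed scenario: a batch with decode requests only, so prefill is None.
decode_only = AttnMeta(prefill=None)

try:
    max_seqlen_qo_branching(128, decode_only)
    branching_ok = True
except AssertionError:
    branching_ok = False

print("branching choice works without prefill metadata:", branching_ok)
print("fixed choice:", max_seqlen_qo_fixed())
```

Under these assumptions the branching helper fails its assertion for 128 heads when no prefill metadata exists, while the fixed helper never consults the metadata at all.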