OpenDAS / vllm_cscc
Commit 3f1b0373 (Unverified)
Authored Dec 04, 2025 by TJian
Committed by GitHub, Dec 04, 2025

[ROCm] [Bugfix] `compute_attn_mask_seqlen` for qwen3 omni (#29974)

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

Parent: 9aa33a74

Changes: 1 changed file with 4 additions and 1 deletion

vllm/model_executor/models/qwen3_omni_moe_thinker.py (+4 -1)
...
@@ -494,7 +494,10 @@ class Qwen3Omni_VisionTransformer(nn.Module):
         cu_seqlens: torch.Tensor,
     ) -> torch.Tensor:
         max_seqlen = torch.zeros([], device=cu_seqlens.device)
-        if self.attn_backend == AttentionBackendEnum.FLASH_ATTN:
+        if self.attn_backend in {
+            AttentionBackendEnum.FLASH_ATTN,
+            AttentionBackendEnum.ROCM_AITER_FA,
+        }:
             max_seqlen = (cu_seqlens[1:] - cu_seqlens[:-1]).max()
         return max_seqlen
...
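
For readers unfamiliar with `cu_seqlens`, the hunk above can be illustrated with a minimal, dependency-free sketch. It is not the vLLM implementation: `Backend`, `VARLEN_BACKENDS`, and `compute_max_seqlen` are hypothetical stand-ins, and plain Python lists replace `torch.Tensor`. It only shows the logic visible in the diff: the longest sequence length is the largest difference between adjacent cumulative offsets, and after this commit that branch is taken for both `FLASH_ATTN` and `ROCM_AITER_FA`.

```python
from enum import Enum, auto


class Backend(Enum):
    """Hypothetical stand-in for AttentionBackendEnum (not the real enum)."""
    FLASH_ATTN = auto()
    ROCM_AITER_FA = auto()
    OTHER = auto()  # placeholder for any backend not named in the diff


# Assumption: mirrors the set literal introduced by the diff.
VARLEN_BACKENDS = {Backend.FLASH_ATTN, Backend.ROCM_AITER_FA}


def compute_max_seqlen(cu_seqlens: list[int], backend: Backend) -> int:
    """Return the longest sequence length for the listed backends, else 0."""
    max_seqlen = 0
    if backend in VARLEN_BACKENDS:
        # Adjacent differences of cumulative offsets are per-sequence lengths.
        max_seqlen = max(
            (end - start for start, end in zip(cu_seqlens, cu_seqlens[1:])),
            default=0,
        )
    return max_seqlen


# Offsets [0, 4, 10, 13] describe three sequences of lengths 4, 6 and 3.
cu = [0, 4, 10, 13]
print(compute_max_seqlen(cu, Backend.FLASH_ATTN))
print(compute_max_seqlen(cu, Backend.ROCM_AITER_FA))
print(compute_max_seqlen(cu, Backend.OTHER))
```

With the pre-commit equality check, the `ROCM_AITER_FA` case would have fallen through and returned the zero default, which is the behavior the diff changes.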