OpenDAS / vllm_cscc
Commit aa20d10a (Unverified)
Authored Jun 19, 2025 by zsolt-borbely-htec
Committed by GitHub, Jun 19, 2025

[Misc] [ROCm] Prevent surplus tensor reshape (#19803)

Signed-off-by: Zsolt Borbely <zsolt.borbely@htecgroup.com>

Parent: 2de12be4

Showing 1 changed file with 1 addition and 1 deletion

vllm/v1/attention/backends/triton_attn.py (+1 -1)

@@ -376,7 +376,7 @@ class TritonAttentionImpl(AttentionImpl):
...
query.reshape(
    (num_tokens, num_heads * head_size)).contiguous(),
layer._q_scale)
query = query.reshape((num_tokens, num_heads, head_size))
use_local_attn = \
    (self.use_irope and attn_metadata.local_attn_metadata is not None)
...
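
The hunk shows `query` being flattened to `(num_tokens, num_heads * head_size)` for a call that also takes `layer._q_scale`, then reshaped back to `(num_tokens, num_heads, head_size)`. Indentation is not preserved in this view, so the exact changed line cannot be identified here. One plausible reading of the commit title, offered only as an assumption, is that the restoring reshape should run only on the path that flattened the tensor. The sketch below illustrates that pattern without any framework: nested lists stand in for tensors, and `scale_rows`, `prepare_query`, and the `quantize` flag are hypothetical names that do not come from vLLM.

```python
# Framework-free sketch: nested lists stand in for tensors.
# All function names and the `quantize` flag are hypothetical.


def flatten_heads(q):
    """(num_tokens, num_heads, head_size) -> (num_tokens, num_heads * head_size)."""
    return [[x for head in token for x in head] for token in q]


def unflatten_heads(rows, num_heads, head_size):
    """(num_tokens, num_heads * head_size) -> (num_tokens, num_heads, head_size)."""
    return [
        [row[h * head_size:(h + 1) * head_size] for h in range(num_heads)]
        for row in rows
    ]


def scale_rows(rows, scale):
    """Stand-in for an op that only accepts 2-D input (assumption)."""
    return [[x / scale for x in row] for row in rows]


def prepare_query(q, scale, quantize):
    """Flatten, process, and restore the shape only when `quantize` is set."""
    num_heads, head_size = len(q[0]), len(q[0][0])
    if quantize:
        rows = scale_rows(flatten_heads(q), scale)
        # Restore the 3-D layout only on the path that flattened it;
        # on the other path the input already has the target shape.
        q = unflatten_heads(rows, num_heads, head_size)
    return q
```

When `quantize` is false, `prepare_query` returns its input untouched, so no reshape is issued for data that already has the target shape.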