OpenDAS / vllm_cscc
Commit 85aff45e (Unverified)
Authored Dec 22, 2025 by Benjamin Chislett
Committed by GitHub, Dec 22, 2025

[Perf] Remove blocking copy in GDN Attention (#31167)

Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>

parent 5312a728

Showing 1 changed file with 1 addition and 1 deletion

vllm/v1/attention/backends/gdn_attn.py (+1 -1)
@@ -143,7 +143,7 @@ class GDNAttentionMetadataBuilder(AttentionMetadataBuilder[GDNAttentionMetadata]
         query_start_loc = m.query_start_loc
         context_lens = m.num_computed_tokens_cpu
-        context_lens_tensor = context_lens.to(query_start_loc.device)
+        context_lens_tensor = context_lens.to(query_start_loc.device, non_blocking=True)
         nums_dict, batch_ptr, token_chunk_offset_ptr = None, None, None
         if (
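
For context on the one-line change above, here is a minimal PyTorch sketch of a non-blocking host-to-device copy. It is not taken from vLLM: the helper name `copy_context_lens` and the sample values are hypothetical, and whether `num_computed_tokens_cpu` lives in pinned memory is an assumption the diff does not confirm. In PyTorch, `non_blocking=True` lets the copy proceed asynchronously with respect to the host only when the source tensor is in pinned (page-locked) memory; with pageable memory the call may still wait.

```python
import torch


def copy_context_lens(context_lens_cpu: torch.Tensor, device: torch.device) -> torch.Tensor:
    """Hypothetical helper mirroring the changed line in the diff.

    With non_blocking=True the host can return before the transfer finishes,
    provided the source tensor is in pinned memory. On a CPU-only machine the
    call is effectively a no-op.
    """
    return context_lens_cpu.to(device, non_blocking=True)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Illustrative values only; not taken from the commit.
context_lens = torch.tensor([0, 16, 48], dtype=torch.int32)
if device.type == "cuda":
    # Assumption: the real source buffer is pinned. Without pinning, the
    # copy may still block the host even with non_blocking=True.
    context_lens = context_lens.pin_memory()

context_lens_tensor = copy_context_lens(context_lens, device)

# Kernels queued on the same CUDA stream are ordered after the copy, so no
# explicit sync is needed for GPU consumers. Synchronize only before the
# host reads the result back.
if device.type == "cuda":
    torch.cuda.synchronize()

print(context_lens_tensor.cpu().tolist())
```

One caveat when using this pattern: the CPU source tensor should not be overwritten until the asynchronous copy has completed, otherwise the device may observe the newer values.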