Commit 85aff45e authored by Benjamin Chislett, committed by GitHub

[Perf] Remove blocking copy in GDN Attention (#31167)


Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
parent 5312a728
@@ -143,7 +143,7 @@ class GDNAttentionMetadataBuilder(AttentionMetadataBuilder[GDNAttentionMetadata]
 query_start_loc = m.query_start_loc
 context_lens = m.num_computed_tokens_cpu
-context_lens_tensor = context_lens.to(query_start_loc.device)
+context_lens_tensor = context_lens.to(query_start_loc.device, non_blocking=True)
 nums_dict, batch_ptr, token_chunk_offset_ptr = None, None, None
 if (
......
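
For context, the change above can be illustrated with a minimal standalone sketch of a non-blocking host-to-device copy in PyTorch. It is not taken from the patched file: the tensor values and the `copy_to_device` helper are hypothetical, and it assumes the source tensor lives in pinned (page-locked) host memory, which is generally what allows `non_blocking=True` to avoid stalling the host on a CUDA device. On a machine without CUDA the call simply returns the CPU tensor.

```python
import torch


def copy_to_device(cpu_tensor: torch.Tensor, device: torch.device) -> torch.Tensor:
    """Hypothetical helper: request a host-to-device copy without blocking the host."""
    # With non_blocking=True, PyTorch attempts the copy asynchronously with
    # respect to the host; this is typically only possible when the source
    # tensor is in pinned memory and the destination is a CUDA device.
    return cpu_tensor.to(device, non_blocking=True)


use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

# Illustrative values only; pin the host buffer when a CUDA device is present.
context_lens = torch.tensor([3, 7, 11], dtype=torch.int32, pin_memory=use_cuda)
context_lens_tensor = copy_to_device(context_lens, device)

# Kernels queued afterwards on the same CUDA stream are ordered after the copy.
# Synchronize only before inspecting the result from the host side.
if use_cuda:
    torch.cuda.current_stream().synchronize()
```

Whether the copy in the patched code is actually asynchronous depends on how `num_computed_tokens_cpu` is allocated, which this diff does not show.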