Unverified commit 0e922986, authored by Lucas Wilkinson, committed by GitHub

[Misc] Delay deprecation of CommonAttentionMetadata properties (#33801)


Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
parent 87d9a261
@@ -347,7 +347,7 @@ class CommonAttentionMetadata:
"""
Prefer using device seq_lens directly to avoid implicit H<>D sync.
If a CPU copy is needed, use `seq_lens.cpu()` instead.
-Will be removed in a future release (v0.15.0)
+Will be removed in a future release, please migrate as soon as possible.
"""
)
def seq_lens_cpu(self) -> torch.Tensor:
@@ -361,7 +361,7 @@ class CommonAttentionMetadata:
Prefer using device seq_lens directly to avoid implicit H<>D sync which breaks full
async scheduling. If a CPU copy is needed, it can be derived from
query_start_loc_cpu and seq_lens.
-Will be removed in a future release (v0.15.0)
+Will be removed in a future release, please migrate as soon as possible.
"""
)
def num_computed_tokens_cpu(self) -> torch.Tensor:
......
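Migration note: the second deprecation message says a CPU copy of the computed-token counts "can be derived from query_start_loc_cpu and seq_lens". A minimal sketch of that arithmetic follows. It assumes each request's sequence length equals its already-computed tokens plus its query length, which the diff does not state. The helper name `derive_num_computed_tokens` is hypothetical, and plain Python lists stand in for the tensors so the sketch runs without torch.

```python
# Hypothetical helper, not part of vLLM. Plain lists stand in for CPU tensors.
# Assumption: seq_len = num_computed_tokens + query_len for every request.
def derive_num_computed_tokens(query_start_loc_cpu, seq_lens_cpu):
    # query_start_loc holds cumulative offsets, so consecutive differences
    # give the number of query tokens for each request.
    query_lens = [
        end - start
        for start, end in zip(query_start_loc_cpu, query_start_loc_cpu[1:])
    ]
    if len(query_lens) != len(seq_lens_cpu):
        raise ValueError("query_start_loc must have len(seq_lens) + 1 entries")
    # Tokens already computed = total sequence length minus new query tokens.
    return [seq - q for seq, q in zip(seq_lens_cpu, query_lens)]


# Three requests with query lengths 3, 1, and 5.
query_start_loc_cpu = [0, 3, 4, 9]
seq_lens_cpu = [10, 4, 5]
print(derive_num_computed_tokens(query_start_loc_cpu, seq_lens_cpu))  # [7, 3, 0]
```

With real tensors, the first docstring's advice applies: take the CPU copy explicitly with `seq_lens.cpu()` at a point where a host-device sync is acceptable, rather than relying on the deprecated property.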