Unverified commit bbdc0f23, authored by Rohan Potdar, committed by GitHub

[ROCm][AITER][Bugfix] Switch AITER to use PIECEWISE_AND_FULL compilation (#25104)


Signed-off-by: Rohan138 <rohanpotdar138@gmail.com>
parent dc340593
@@ -232,7 +232,7 @@ class AiterFlashAttentionMetadata:
 class AiterFlashAttentionMetadataBuilder(
         AttentionMetadataBuilder[AiterFlashAttentionMetadata]):
-    cudagraph_support = AttentionCGSupport.ALWAYS
+    cudagraph_support = AttentionCGSupport.UNIFORM_SINGLE_TOKEN_DECODE

     def __init__(self, kv_cache_spec: AttentionSpec, layer_names: list[str],
                  vllm_config: VllmConfig, device: torch.device):
...
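
Note on the one-line change: the sketch below illustrates what the narrower support level could mean for batch dispatch. It is a self-contained illustration, not vLLM code. `CGSupport` and `can_use_full_graph` are hypothetical names, and the rule that "uniform single-token decode" means every request in the batch contributes exactly one query token is an assumption read from the enum member's name.

```python
from enum import Enum


class CGSupport(Enum):
    """Hypothetical stand-in for the support levels named in the diff."""
    NEVER = 0
    UNIFORM_SINGLE_TOKEN_DECODE = 1
    ALWAYS = 2


def can_use_full_graph(support: CGSupport, query_lens: list[int]) -> bool:
    """Return True if this batch could be replayed as one full graph.

    Assumption: a batch is a uniform single-token decode batch when it is
    non-empty and every request has a query length of exactly 1.
    """
    if support is CGSupport.ALWAYS:
        return True
    if support is CGSupport.UNIFORM_SINGLE_TOKEN_DECODE:
        # Only pure decode batches qualify; any prefill or multi-token
        # request would have to take another path (e.g. piecewise graphs).
        return len(query_lens) > 0 and all(n == 1 for n in query_lens)
    return False
```

Under these assumptions, a pure decode batch such as `[1, 1, 1]` still qualifies after the change, while a mixed batch such as `[1, 5, 1]` qualified under `ALWAYS` but no longer does.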