Commit 468e2400 (unverified), authored by Lucas Wilkinson, committed by GitHub

[BugFix][CPU] Fix `TorchSDPABackendImpl` doesn't have `use_irope` (#21200)


Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
parent dcc6cfb9
@@ -2668,7 +2668,8 @@ class GPUModelRunner(LoRAModelRunnerMixin):
             # TODO: Support other attention modules, e.g., cross-attention
             if attn_module.attn_type == AttentionType.DECODER:
                 use_local_attention = (self.attention_chunk_size is not None
-                                       and attn_module.impl.use_irope)
+                                       and getattr(attn_module.impl,
+                                                   "use_irope", False))
                 if attn_module.sliding_window is not None:
                     kv_cache_spec[layer_name] = SlidingWindowSpec(
                         block_size=block_size,
...
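
The change above replaces a direct attribute read with `getattr(..., "use_irope", False)`, so a backend implementation that never defines `use_irope` is treated as not using it instead of raising `AttributeError`. The following is a minimal, self-contained sketch of that pattern. The classes `ImplWithIrope` and `ImplWithoutIrope` and the helper `wants_local_attention` are hypothetical stand-ins for illustration only; they are not part of vLLM.

```python
from typing import Optional


class ImplWithIrope:
    """Hypothetical backend impl that defines `use_irope`."""

    def __init__(self, use_irope: bool) -> None:
        self.use_irope = use_irope


class ImplWithoutIrope:
    """Hypothetical backend impl that never sets `use_irope`."""


def wants_local_attention(impl: object,
                          attention_chunk_size: Optional[int]) -> bool:
    # Mirrors the patched expression: a missing attribute falls back to False
    # instead of raising AttributeError.
    return (attention_chunk_size is not None
            and getattr(impl, "use_irope", False))


def direct_access_fails(impl: object) -> bool:
    # Shows the pre-patch behaviour: plain attribute access raises
    # AttributeError when the attribute is absent.
    try:
        impl.use_irope  # type: ignore[attr-defined]
    except AttributeError:
        return True
    return False
```

With these stand-ins, `wants_local_attention(ImplWithoutIrope(), 8192)` evaluates to `False`, whereas the direct attribute access used before the patch would raise on the same object.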