Unverified commit bea70f2e authored by Marks101, committed by GitHub

[PyTorch] forward attention_type in MultiheadAttention (#621)



[PyTorch] Fix forwarding of attention_type in MultiheadAttention
Signed-off-by: Markus Schnoes <markus.schnoes@gmx.de>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
parent 4dc36f0e
@@ -3090,6 +3090,7 @@ class MultiheadAttention(torch.nn.Module):
             sequence_parallel=sequence_parallel,
             tp_group=tp_group,
             layer_number=self.layer_number,
+            attention_type=self.attention_type,
         )
         # Linear
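
The fix is the single added line: MultiheadAttention already accepted and stored attention_type, but did not pass it on when constructing its inner core-attention module, so that module silently fell back to its own default. Below is a minimal sketch of the forwarding pattern the patch restores; the class names, signatures, and the "self" default here are illustrative assumptions, not Transformer Engine's actual API.

# Hypothetical stand-in classes for illustration only; names and the
# "self" default are assumptions, not Transformer Engine's real code.
class CoreAttention:
    def __init__(self, layer_number, attention_type="self"):
        self.layer_number = layer_number
        # Falls back to "self" whenever the caller does not forward a value.
        self.attention_type = attention_type

class MultiheadAttention:
    def __init__(self, layer_number, attention_type="self"):
        self.layer_number = layer_number
        self.attention_type = attention_type
        # Before the patch, attention_type stopped here: it was stored on
        # the outer module but never handed to the inner module, which
        # therefore always ran with its default.
        self.core_attention = CoreAttention(
            layer_number=self.layer_number,
            attention_type=self.attention_type,  # the line this commit adds
        )

mha = MultiheadAttention(layer_number=1, attention_type="cross")
assert mha.core_attention.attention_type == "cross"

With the forwarding in place, an attention_type passed to the outer layer (e.g. "cross") now reaches the inner module, as the assertion above checks.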