Unverified Commit 675d2a5a authored by Wang, Yi, committed by GitHub

Fix AutoTP in DeepSpeed not working for BLOOM (#22196)



* Fix AutoTP in DeepSpeed not working for BLOOM
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* Add a method in BloomModel to build the ALiBi tensor
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
parent 00934026
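For context, here is a minimal sketch of how the ALiBi bias is constructed. The function name `alibi_bias_sketch` is illustrative and the sketch assumes `num_heads` is a power of two; the library's `build_alibi_tensor` also handles the general case. The point is that both the slopes and the output shape depend directly on `num_heads`, which is exactly the quantity that changes when tensor parallelism shards attention heads across ranks:

```python
import torch


def alibi_bias_sketch(attention_mask: torch.Tensor, num_heads: int, dtype: torch.dtype) -> torch.Tensor:
    """Illustrative ALiBi bias construction (power-of-two num_heads only)."""
    batch_size, seq_length = attention_mask.shape
    # Head-specific slopes form a geometric series: 2^(-8/num_heads * i), i = 1..num_heads.
    base = 2.0 ** (-8.0 / num_heads)
    slopes = torch.pow(torch.full((num_heads,), base), torch.arange(1, num_heads + 1))
    # Position index of each token; padded positions contribute zero bias.
    positions = ((attention_mask.cumsum(dim=-1) - 1) * attention_mask)[:, None, :]
    alibi = slopes[None, :, None] * positions  # (batch, num_heads, seq_length)
    return alibi.reshape(batch_size * num_heads, 1, seq_length).to(dtype)
```

Because the bias is keyed to the head count, a model whose heads have been partitioned needs a hook through which the per-rank bias can be supplied. Wrapping the module-level call in a `BloomModel` method, as the diff below does, provides that hook.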
```diff
@@ -641,6 +641,9 @@ class BloomModel(BloomPreTrainedModel):
         # Initialize weights and apply final processing
         self.post_init()
 
+    def build_alibi_tensor(self, attention_mask: torch.Tensor, num_heads: int, dtype: torch.dtype) -> torch.Tensor:
+        return build_alibi_tensor(attention_mask, num_heads, dtype)
+
     def get_input_embeddings(self):
         return self.word_embeddings
@@ -750,7 +753,7 @@ class BloomModel(BloomPreTrainedModel):
         else:
             attention_mask = attention_mask.to(hidden_states.device)
 
-        alibi = build_alibi_tensor(attention_mask, self.num_heads, dtype=hidden_states.dtype)
+        alibi = self.build_alibi_tensor(attention_mask, self.num_heads, dtype=hidden_states.dtype)
 
         causal_mask = self._prepare_attn_mask(
             attention_mask,
...
```
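To show why the indirection matters, here is a hypothetical sketch of how a tensor-parallel integration could override the new method. `ShardedBloomModel`, `tp_rank`, and `tp_world_size` are invented for illustration; this is not DeepSpeed's actual AutoTP mechanism, only a sketch of the kind of override the method makes possible without monkey-patching the module-level function:

```python
import torch
from transformers.models.bloom.modeling_bloom import BloomModel, build_alibi_tensor


class ShardedBloomModel(BloomModel):
    """Hypothetical subclass: each rank keeps only its own slice of the ALiBi bias."""

    def __init__(self, config, tp_rank: int = 0, tp_world_size: int = 1):
        super().__init__(config)
        self.tp_rank = tp_rank
        self.tp_world_size = tp_world_size

    def build_alibi_tensor(self, attention_mask, num_heads, dtype):
        # Build the bias for all heads, then slice out this rank's contiguous
        # block of heads (one plausible partitioning scheme among several).
        batch_size, seq_length = attention_mask.shape
        total_heads = num_heads * self.tp_world_size
        alibi = build_alibi_tensor(attention_mask, total_heads, dtype)
        alibi = alibi.reshape(batch_size, total_heads, 1, seq_length)
        start = self.tp_rank * num_heads
        local = alibi[:, start : start + num_heads]
        return local.reshape(batch_size * num_heads, 1, seq_length)
```

Since the forward pass now routes through `self.build_alibi_tensor` (the single call site changed in the second hunk above), an override like this takes effect without touching the rest of the model.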