"examples/vscode:/vscode.git/clone" did not exist on "ef3dceff4a043418171d7f9f71bf2f046368de5b"
Unverified Commit 39117744 authored by Yih-Dar, committed by GitHub

Avoid all-zero attention mask used in testing (#26469)



fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
parent 9b23d0de
@@ -2960,7 +2960,8 @@ def ids_tensor(shape, vocab_size, rng=None, name=None):
 def random_attention_mask(shape, rng=None, name=None):
     attn_mask = ids_tensor(shape, vocab_size=2, rng=None, name=None)
     # make sure that at least one token is attended to for each batch
-    attn_mask[:, -1] = 1
+    # we choose the 1st token so this property of `at least one being non-zero` still holds after applying causal mask
+    attn_mask[:, 0] = 1
     return attn_mask
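Why the first token rather than the last: when this mask is combined with a causal mask, query position i can only attend to key positions j <= i, so forcing only the last key on leaves earlier query rows with possibly no attendable key, and an all-zero attention row makes the softmax produce NaNs. Forcing key 0 on keeps at least one live key for every query row. A minimal standalone sketch of that property (torch.randint stands in for the repo's ids_tensor helper; the batch/sequence sizes are arbitrary):

    import torch

    def random_attention_mask(shape):
        # simplified stand-in for the test helper: random 0/1 mask with
        # the first key position forced on for every batch row
        attn_mask = torch.randint(0, 2, shape)
        attn_mask[:, 0] = 1
        return attn_mask

    batch_size, seq_len = 4, 8
    mask = random_attention_mask((batch_size, seq_len))

    # causal mask: query i may attend only to keys j <= i
    causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.long))

    # combined mask seen by attention, shape (batch, query, key)
    combined = mask[:, None, :] * causal[None, :, :]

    # with attn_mask[:, 0] = 1 every query row keeps at least one live key;
    # with the old attn_mask[:, -1] = 1, query rows before the last position
    # could be all zero, and the attention softmax would emit NaNs
    assert (combined.sum(dim=-1) > 0).all()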