"examples/vscode:/vscode.git/clone" did not exist on "ef3dceff4a043418171d7f9f71bf2f046368de5b"
Unverified Commit 39117744 authored by Yih-Dar, committed by GitHub

Avoid all-zero attention mask used in testing (#26469)



fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
parent 9b23d0de
@@ -2960,7 +2960,8 @@ def ids_tensor(shape, vocab_size, rng=None, name=None):
 def random_attention_mask(shape, rng=None, name=None):
     attn_mask = ids_tensor(shape, vocab_size=2, rng=None, name=None)
     # make sure that at least one token is attended to for each batch
-    attn_mask[:, -1] = 1
+    # we choose the 1st token so this property of `at least one being non-zero` still holds after applying causal mask
+    attn_mask[:, 0] = 1
     return attn_mask
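Why the first token rather than the last: when this mask is combined with a causal mask, query position i can only attend to key positions j <= i, so forcing only the last key on leaves earlier query rows with possibly no attendable key, and an all-zero attention row makes the softmax produce NaNs. Forcing key 0 on keeps at least one live key for every query row. A minimal standalone sketch of that property (torch.randint stands in for the repo's ids_tensor helper; the batch/sequence sizes are arbitrary):

    import torch

    def random_attention_mask(shape):
        # simplified stand-in for the test helper: random 0/1 mask with
        # the first key position forced on for every batch row
        attn_mask = torch.randint(0, 2, shape)
        attn_mask[:, 0] = 1
        return attn_mask

    batch_size, seq_len = 4, 8
    mask = random_attention_mask((batch_size, seq_len))

    # causal mask: query i may attend only to keys j <= i
    causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.long))

    # combined mask seen by attention, shape (batch, query, key)
    combined = mask[:, None, :] * causal[None, :, :]

    # with attn_mask[:, 0] = 1 every query row keeps at least one live key;
    # with the old attn_mask[:, -1] = 1, query rows before the last position
    # could be all zero, and the attention softmax would emit NaNs
    assert (combined.sum(dim=-1) > 0).all()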