Commit d9b28627 (unverified)
Authored May 11, 2021 by Julien Plu; committed via GitHub on May 11, 2021
Parent: a135f595

Fix TF Roberta for mixed precision training (#11675)
Changes: 1 changed file with 3 additions and 1 deletion (+3 −1)
src/transformers/models/roberta/modeling_tf_roberta.py (+3 −1)

@@ -541,7 +541,9 @@ class TFRobertaMainLayer(tf.keras.layers.Layer):
         # Since we are adding it to the raw scores before the softmax, this is
         # effectively the same as removing these entirely.
         extended_attention_mask = tf.cast(extended_attention_mask, dtype=embedding_output.dtype)
-        extended_attention_mask = tf.multiply(tf.subtract(1.0, extended_attention_mask), -10000.0)
+        one_cst = tf.constant(1.0, dtype=embedding_output.dtype)
+        ten_thousand_cst = tf.constant(-10000.0, dtype=embedding_output.dtype)
+        extended_attention_mask = tf.multiply(tf.subtract(one_cst, extended_attention_mask), ten_thousand_cst)
 
         # Prepare head mask if needed
         # 1.0 in head_mask indicate we keep the head
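The change replaces the bare Python literals 1.0 and -10000.0 with tf.constant values created in embedding_output's dtype, so every operand of the mask computation shares the same precision when a mixed-precision policy makes activations float16 instead of float32. The following minimal sketch (not part of the commit; the tensor shapes and values are made up for illustration) shows the patched pattern standing alone:

# Sketch of the pattern adopted by this commit, assuming TF 2.x.
import tensorflow as tf

# Stand-ins for the tensors inside TFRobertaMainLayer.call under a
# mixed-precision policy, where activations are float16.
embedding_output = tf.zeros((1, 4, 8), dtype=tf.float16)
extended_attention_mask = tf.constant([[[[1.0, 1.0, 1.0, 0.0]]]])

# Same steps as the patched code: cast the mask, then build the constants
# explicitly in embedding_output's dtype instead of relying on float32 literals.
extended_attention_mask = tf.cast(extended_attention_mask, dtype=embedding_output.dtype)
one_cst = tf.constant(1.0, dtype=embedding_output.dtype)
ten_thousand_cst = tf.constant(-10000.0, dtype=embedding_output.dtype)
extended_attention_mask = tf.multiply(
    tf.subtract(one_cst, extended_attention_mask), ten_thousand_cst
)

print(extended_attention_mask.dtype)  # float16; masked positions hold -10000.0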