chenpangpang / transformers · Commits

Unverified commit 6060b2f8, authored Aug 30, 2019 by ziliwang, committed via GitHub on Aug 30, 2019.
fix: hard coding for max number

The float16 maximum is 65504, so the original hard-coded 1e30 overflows to inf and produces NaN in fp16.
Parent: caf1d116

Showing 1 changed file with 4 additions and 1 deletion:

pytorch_transformers/modeling_xlnet.py (+4, −1)
pytorch_transformers/modeling_xlnet.py @ 6060b2f8:

```diff
@@ -418,6 +418,9 @@ class XLNetRelativeAttention(nn.Module):
         attn_score = (ac + bd + ef) * self.scale
         if attn_mask is not None:
             # attn_score = attn_score * (1 - attn_mask) - 1e30 * attn_mask
-            attn_score = attn_score - 1e30 * attn_mask
+            if attn_mask.dtype == torch.float16:
+                attn_score = attn_score - 65500 * attn_mask
+            else:
+                attn_score = attn_score - 1e30 * attn_mask

         # attention probability
```
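As a minimal standalone sketch of why the change matters (illustration only, not part of the patch): 1e30 is not representable in float16, whose largest finite value is 65504, so casting it overflows to inf; masked attention scores then become -inf, and a fully masked row degenerates to NaN in the softmax. The new 65500 constant stays finite.

```python
import torch

# 1e30 overflows float16, whose largest finite value is 65504.
print(torch.finfo(torch.float16).max)           # 65504.0
print(torch.tensor(1e30, dtype=torch.float16))  # tensor(inf, dtype=torch.float16)

# With the old constant, masked positions become -inf ...
mask = torch.tensor([0.0, 0.0, 1.0], dtype=torch.float16)  # 1 = masked
score = torch.zeros(3, dtype=torch.float16)
print(score - 1e30 * mask)        # tensor([0., 0., -inf], dtype=torch.float16)

# ... and a fully masked row degenerates to NaN in the softmax:
row = torch.full((3,), float("-inf"), dtype=torch.float16)
print(torch.softmax(row, dim=0))  # tensor([nan, nan, nan], dtype=torch.float16)

# The new constant stays finite in float16, so the scores remain well defined:
print(score - 65500 * mask)       # tensor([0., 0., -65504.], dtype=torch.float16)
```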