chenpangpang / transformers

Commit 34a1a010, authored Nov 09, 2018 by thomwolf

update code comment

Parent: 34bdc8b5
Showing 1 changed file with 2 additions and 2 deletions.
modeling.py (+2, -2)

@@ -337,8 +337,8 @@ class BertModel(nn.Module):
             token_type_ids = torch.zeros_like(input_ids)

         # We create a 3D attention mask from a 2D tensor mask.
-        # Sizes are [batch_size, 1, 1, from_seq_length]
-        # So we can broadcast to [batch_size, num_heads, to_seq_length, from_seq_length]
+        # Sizes are [batch_size, 1, 1, to_seq_length]
+        # So we can broadcast to [batch_size, num_heads, from_seq_length, to_seq_length]
         # this attention mask is more simple than the triangular masking of causal attention
         # used in OpenAI GPT, we just need to prepare the broadcast dimension here.
         extended_attention_mask = attention_mask.unsqueeze(1).unsqueeze(2)
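For context, the corrected comments describe a broadcasting trick: the 2D padding mask has shape [batch_size, to_seq_length], and the two unsqueeze calls reshape it to [batch_size, 1, 1, to_seq_length] so that it broadcasts against attention scores of shape [batch_size, num_heads, from_seq_length, to_seq_length]. A minimal sketch with made-up sizes (the final multiplication only demonstrates the broadcast; it is not how BertModel actually applies the mask to the scores):

    import torch

    # Hypothetical sizes, for illustration only.
    batch_size, num_heads, from_seq_length, to_seq_length = 2, 12, 5, 5

    # 2D padding mask: 1 for real tokens, 0 for padding.
    attention_mask = torch.ones(batch_size, to_seq_length)

    # Same call as in BertModel.forward: [batch_size, to_seq_length]
    # -> [batch_size, 1, to_seq_length] -> [batch_size, 1, 1, to_seq_length]
    extended_attention_mask = attention_mask.unsqueeze(1).unsqueeze(2)
    print(extended_attention_mask.shape)  # torch.Size([2, 1, 1, 5])

    # Attention scores are [batch_size, num_heads, from_seq_length, to_seq_length];
    # the two singleton dims broadcast over num_heads and from_seq_length.
    scores = torch.randn(batch_size, num_heads, from_seq_length, to_seq_length)
    print((scores * extended_attention_mask).shape)  # torch.Size([2, 12, 5, 5])

In other words, the last mask dimension lines up with to_seq_length (the key positions being masked out), while the two singleton dimensions are broadcast over num_heads and from_seq_length (the query positions), which is exactly the ordering the updated comment now states.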