chenpangpang / transformers
Commit 4a49c225
authored Mar 05, 2019 by Catalin Voss
Warn instead of raising in BERT and GPT-2 tokenizers as well, to allow for pre-caching of tokens
parent e99bc87e
Showing 2 changed files with 2 additions and 2 deletions
pytorch_pretrained_bert/tokenization.py +1 -1
pytorch_pretrained_bert/tokenization_gpt2.py +1 -1
pytorch_pretrained_bert/tokenization.py
...
@@ -101,7 +101,7 @@ class BertTokenizer(object):
         for token in tokens:
             ids.append(self.vocab[token])
         if len(ids) > self.max_len:
-            raise ValueError(
+            logger.warning(
                 "Token indices sequence length is longer than the specified maximum "
                 " sequence length for this BERT model ({} > {}). Running this"
                 " sequence through BERT will result in indexing errors".format(len(ids), self.max_len)
...
pytorch_pretrained_bert/tokenization_gpt2.py
...
@@ -193,7 +193,7 @@ class GPT2Tokenizer(object):
             token = ''.join(self.byte_encoder[b] for b in token.encode('utf-8'))
             bpe_tokens.extend(self.encoder[bpe_token] for bpe_token in self.bpe(token).split(' '))
         if len(bpe_tokens) > self.max_len:
-            raise ValueError(
+            logger.warning(
                 "Token indices sequence length is longer than the specified maximum "
                 " sequence length for this OpenAI GPT-2 model ({} > {}). Running this"
                 " sequence through the model will result in indexing errors".format(len(bpe_tokens), self.max_len)
...
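The commit message gives pre-caching of tokens as the reason for warning instead of raising. The sketch below shows one way a caller might rely on that: convert an over-long sequence once, keep the full ID list, and split it into model-sized windows afterwards. `ToyTokenizer` and `windows` are hypothetical names used only for illustration and are not part of `pytorch_pretrained_bert`; the warning text is shortened from the one in the diff.

```python
import logging

logger = logging.getLogger(__name__)


class ToyTokenizer:
    """Hypothetical stand-in for the tokenizers in the diff, not the library class."""

    def __init__(self, vocab, max_len):
        self.vocab = vocab
        self.max_len = max_len

    def convert_tokens_to_ids(self, tokens):
        ids = [self.vocab[token] for token in tokens]
        if len(ids) > self.max_len:
            # Warn rather than raise, so the caller still receives the full list.
            logger.warning(
                "Token indices sequence length is longer than the specified "
                "maximum sequence length ({} > {})".format(len(ids), self.max_len)
            )
        return ids


def windows(ids, max_len):
    """Split cached ids into consecutive chunks of at most max_len items."""
    return [ids[i:i + max_len] for i in range(0, len(ids), max_len)]


tokenizer = ToyTokenizer(vocab={"a": 0, "b": 1, "c": 2}, max_len=2)
# Five tokens exceed max_len=2: a warning is logged, but no exception is raised.
cached_ids = tokenizer.convert_tokens_to_ids(["a", "b", "c", "a", "b"])
# The cached ids can be cut down to model-sized pieces later.
chunks = windows(cached_ids, tokenizer.max_len)
```

With the previous `raise ValueError(` behavior, the conversion call itself would have failed and nothing could have been cached. After this change, the caller is responsible for truncating or windowing before the IDs reach the model.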