ModelZoo / ResNet50_tensorflow
Commit 30579e0f, authored Mar 07, 2020 by Sergey Mironov
Update tokenizer: do the safety check before inserting EOL
Parent: e0eaa1ed
Changes: 1
Showing 1 changed file with 2 additions and 0 deletions
official/nlp/transformer/utils/tokenizer.py (+2 -0) @ 30579e0f
@@ -140,6 +140,8 @@ class Subtokenizer(object):
     for token in tokens:
       ret.extend(self._token_to_subtoken_ids(token))
     if add_eos:
+      assert EOS in self.subtoken_list, \
+          "Can't append 'EOS' because it is not in list of known subtokens."
       ret.append(EOS_ID)
     return ret
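To illustrate what the added check guards against, here is a minimal, self-contained sketch. It is not the real Subtokenizer: the class `TinySubtokenizer`, its `_ids` lookup table, and the constant values `EOS = "<EOS>"` and `EOS_ID = 1` are assumptions made for this example only. Only the assert statement and its message are taken from the diff.

```python
# Hypothetical stand-ins for the module-level constants used in the diff.
# The actual values in tokenizer.py are assumed here, not confirmed by the commit.
EOS = "<EOS>"
EOS_ID = 1


class TinySubtokenizer:
    """Toy encoder that maps whole tokens to ids (illustration only)."""

    def __init__(self, subtoken_list):
        self.subtoken_list = subtoken_list
        # Hypothetical lookup table: a subtoken's id is its list position.
        self._ids = {s: i for i, s in enumerate(subtoken_list)}

    def encode(self, tokens, add_eos=False):
        ret = [self._ids[token] for token in tokens]
        if add_eos:
            # Same guard as the commit: refuse to append EOS_ID when the
            # vocabulary has no EOS entry, since the id would then point
            # at some unrelated subtoken (or at nothing).
            assert EOS in self.subtoken_list, \
                "Can't append 'EOS' because it is not in list of known subtokens."
            ret.append(EOS_ID)
        return ret


with_eos = TinySubtokenizer(["<pad>", "<EOS>", "hello", "world"])
print(with_eos.encode(["hello", "world"], add_eos=True))

without_eos = TinySubtokenizer(["hello", "world"])
try:
    without_eos.encode(["hello"], add_eos=True)
except AssertionError as err:
    print("rejected:", err)
```

Note that `assert` statements are skipped when Python runs with `-O`, so a check like this acts as a development-time safety net rather than as input validation.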