chenpangpang / transformers
Commit 391db836
Authored Oct 01, 2019 by thomwolf
fix #1260 - remove special logic for decoding pairs of sequence
Parent: 963529e2
Changes: 1
Showing 1 changed file with 4 additions and 13 deletions
transformers/tokenization_utils.py (+4 -13)
@@ -933,20 +933,11 @@ class PreTrainedTokenizer(object):
             sub_texts.append(self.convert_tokens_to_string(current_sub_text))
         text = ''.join(sub_texts)
 
-        if self._sep_token is not None and self._sep_token in text:
-            text = text.replace(self._cls_token, self._sep_token)
-            split_text = list(filter(lambda sentence: len(sentence) > 0, text.split(self._sep_token)))
-            if clean_up_tokenization_spaces:
-                clean_text = [self.clean_up_tokenization(text) for text in split_text]
-                return clean_text
-            else:
-                return split_text
+        if clean_up_tokenization_spaces:
+            clean_text = self.clean_up_tokenization(text)
+            return clean_text
         else:
-            if clean_up_tokenization_spaces:
-                clean_text = self.clean_up_tokenization(text)
-                return clean_text
-            else:
-                return text
+            return text
 
     @property
     def special_tokens_map(self):
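To make the effect of this change concrete, here is a minimal standalone sketch of the tail of the decode logic before and after the commit. It is not the library code itself: `decode_tail_old`, `decode_tail_new` and `clean_up` are hypothetical names, `clean_up` is a simplified stand-in for `clean_up_tokenization`, and the `[CLS]`/`[SEP]` tokens are assumed BERT-style values. It suggests that the old branch returned a list whenever the separator token appeared in the text, while the new code always returns a single string.

```python
CLS, SEP = "[CLS]", "[SEP]"  # assumed BERT-style special tokens


def clean_up(text):
    """Simplified stand-in (assumption) for clean_up_tokenization."""
    for src, dst in ((" .", "."), (" ,", ","), (" ?", "?"), (" !", "!")):
        text = text.replace(src, dst)
    return text


def decode_tail_old(text, clean_up_tokenization_spaces=True,
                    cls_token=CLS, sep_token=SEP):
    """Sketch of the removed logic: split on the separator, return a list."""
    if sep_token is not None and sep_token in text:
        text = text.replace(cls_token, sep_token)
        split_text = [s for s in text.split(sep_token) if len(s) > 0]
        if clean_up_tokenization_spaces:
            return [clean_up(s) for s in split_text]
        return split_text
    return clean_up(text) if clean_up_tokenization_spaces else text


def decode_tail_new(text, clean_up_tokenization_spaces=True):
    """Sketch of the logic after the commit: always return one string."""
    return clean_up(text) if clean_up_tokenization_spaces else text


joined = "[CLS]hello world .[SEP]how are you ?[SEP]"
old_result = decode_tail_old(joined)   # list of sentences
new_result = decode_tail_new(joined)   # single string, special tokens kept
print(old_result)
print(new_result)
```

With the old logic the return type depended on whether the separator token was present in the decoded text; after the change, callers can rely on always getting a string and split it themselves if they need the individual sequences.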