chenpangpang / transformers
Commit d5faa74c
Authored Nov 05, 2019 by Julien Chaumond

tokenizer white space: revert to previous behavior

Parent: 0b77d66a

Showing 1 changed file with 1 addition and 1 deletion

examples/run_pplm.py (+1, -1)
@@ -373,7 +373,7 @@ def get_bag_of_words_indices(bag_of_words_ids_or_paths: List[str]) -> List[List[
             filepath = id_or_path
         with open(filepath, "r") as f:
             words = f.read().split("\n")
-        bow_indices.append([TOKENIZER.encode(word) for word in words])
+        bow_indices.append([TOKENIZER.encode(word, add_prefix_space=True) for word in words])
     return bow_indices
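
Note on the changed line: the sketch below illustrates why `add_prefix_space=True` can matter when encoding bag-of-words entries. It assumes `TOKENIZER` is a byte-level BPE tokenizer in the style of GPT-2, where a word preceded by a space maps to a different token id than the same word at the start of a string. The vocabulary and the `toy_encode` / `toy_bag_of_words_indices` helpers are hypothetical stand-ins, not the real tokenizer or the code in `examples/run_pplm.py`.

```python
from typing import Dict, List

# Hypothetical toy vocabulary (NOT the real GPT-2 vocabulary).
# As in byte-level BPE, a word with a leading space is a different
# token from the same word without one.
TOY_VOCAB: Dict[str, int] = {
    "science": 0,
    " science": 1,
    "physics": 2,
    " physics": 3,
}


def toy_encode(word: str, add_prefix_space: bool = False) -> List[int]:
    """Look up a single word, optionally prepending a space first."""
    text = " " + word if add_prefix_space else word
    return [TOY_VOCAB[text]]


def toy_bag_of_words_indices(words: List[str], add_prefix_space: bool) -> List[List[int]]:
    """Mirror the list comprehension on the changed line, using the toy encoder."""
    return [toy_encode(w, add_prefix_space=add_prefix_space) for w in words]


# Same splitting as in the diff: one word per line.
words = "science\nphysics".split("\n")

without_prefix = toy_bag_of_words_indices(words, add_prefix_space=False)
with_prefix = toy_bag_of_words_indices(words, add_prefix_space=True)

print(without_prefix)  # [[0], [2]]
print(with_prefix)     # [[1], [3]]
```

Under these assumptions, the ids produced with the prefix space are the ones a word would receive in the middle of generated text, which is presumably what a bag-of-words lookup needs to match.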