chenpangpang / transformers · Commits
Commit 0a9860da authored Feb 11, 2019 by thomwolf

tests pass on python 2 and 3

parent 2071a9b8
Showing 1 changed file with 2 additions and 3 deletions
tests/tokenization_openai_test.py (+2, −3)
@@ -17,7 +17,6 @@ from __future__ import absolute_import, division, print_function, unicode_literals
 import os
 import unittest
 import json
-from io import open
 from pytorch_pretrained_bert.tokenization_openai import OpenAIGPTTokenizer

@@ -32,10 +31,10 @@ class OpenAIGPTTokenizationTest(unittest.TestCase):
                  "low</w>", "lowest</w>", "newer</w>", "wider</w>"]
         vocab_tokens = dict(zip(vocab, range(len(vocab))))
         merges = ["#version: 0.2", "l o", "lo w", "e r</w>", ""]
-        with open("/tmp/openai_tokenizer_vocab_test.json", "wb") as fp:
+        with open("/tmp/openai_tokenizer_vocab_test.json", "w") as fp:
             json.dump(vocab_tokens, fp)
             vocab_file = fp.name
-        with open("/tmp/openai_tokenizer_merges_test.txt", "w", encoding='utf-8') as fp:
+        with open("/tmp/openai_tokenizer_merges_test.txt", "w") as fp:
             fp.write("\n".join(merges))
             merges_file = fp.name
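The point of the change: on Python 3, json.dump writes str, which a file opened in binary mode ("wb") rejects with a TypeError, while io.open's text mode rejects the byte strings Python 2's json.dump produces. Opening both fixture files with the builtin open() in plain "w" mode behaves the same on both interpreters, which also means the `from io import open` import and the `encoding='utf-8'` keyword can go. A minimal runnable sketch of the updated setup, with an abbreviated vocab and demo file names under the system temp directory (the names here are illustrative, not from the commit):

```python
import json
import os
import tempfile

# Abbreviated from the test's fixtures.
vocab = ["low</w>", "lowest</w>", "newer</w>", "wider</w>"]
vocab_tokens = dict(zip(vocab, range(len(vocab))))
merges = ["#version: 0.2", "l o", "lo w", "e r</w>", ""]

tmpdir = tempfile.gettempdir()
vocab_file = os.path.join(tmpdir, "openai_tokenizer_vocab_demo.json")
merges_file = os.path.join(tmpdir, "openai_tokenizer_merges_demo.txt")

# The pre-commit version: binary mode fails on Python 3 because
# json.dump writes str, not bytes.
try:
    with open(vocab_file, "wb") as fp:
        json.dump(vocab_tokens, fp)
except TypeError:
    pass  # "a bytes-like object is required, not 'str'" on Python 3

# The post-commit version: plain text mode works on Python 2 and 3.
with open(vocab_file, "w") as fp:
    json.dump(vocab_tokens, fp)
with open(merges_file, "w") as fp:
    fp.write("\n".join(merges))

# Round-trip check that both fixture files are readable again.
with open(vocab_file) as fp:
    assert json.load(fp) == vocab_tokens
with open(merges_file) as fp:
    assert fp.read().split("\n") == merges
```

The builtin open() on Python 2 also accepts no `encoding` argument, which is why dropping `from io import open` forces the `encoding='utf-8'` keyword out of the merges-file call as well.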