chenpangpang / transformers · Commits · 2eaa8b6e
Commit 2eaa8b6e authored Jan 18, 2020 by Julien Chaumond

Easier to not support this, as it could be confusing

cc @lysandrejik

parent 801aaa55
Showing 1 changed file with 3 additions and 10 deletions

examples/run_lm_finetuning.py (+3 −10)
@@ -486,12 +486,6 @@ def main():
         type=str,
         help="Optional pretrained tokenizer name or path if not the same as model_name_or_path. If both are None, initialize a new tokenizer.",
     )
-    parser.add_argument(
-        "--tokenizer_init_args",
-        default="",
-        type=str,
-        help="If instantiating a new tokenizer, comma-separated list of input args to feed the constructor.",
-    )
     parser.add_argument(
         "--cache_dir",
         default=None,
@@ -661,11 +655,10 @@ def main():
     elif args.model_name_or_path:
         tokenizer = tokenizer_class.from_pretrained(args.model_name_or_path, cache_dir=args.cache_dir)
     else:
-        logger.warning(
-            "You are instantiating a new {} tokenizer from scratch. Are you sure this is what you meant to do? "
-            "To specifiy a pretrained tokenizer name, use --tokenizer_name".format(tokenizer_class.__name__)
+        raise ValueError(
+            "You are instantiating a new {} tokenizer. This is not supported, but you can do it from another script, save it, "
+            "and load it from here, using --tokenizer_name".format(tokenizer_class.__name__)
         )
-        tokenizer = tokenizer_class(*args.tokenizer_init_args.split(","))
     if args.block_size <= 0:
         args.block_size = tokenizer.max_len_single_sentence
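Taken together, the two hunks change tokenizer selection from "fall back to a from-scratch tokenizer with a warning" to "fail fast with an error". The sketch below mirrors that control flow in isolation; it assumes an `if args.tokenizer_name:` branch precedes the `elif` shown in the hunk (implied by the `--tokenizer_name` help text), and `StubTokenizer` is a hypothetical stand-in for the script's real tokenizer class so the example runs without transformers installed.

```python
from types import SimpleNamespace


class StubTokenizer:
    """Illustrative stand-in for a transformers tokenizer class."""

    @classmethod
    def from_pretrained(cls, name_or_path, cache_dir=None):
        # The real class would download/load vocab files here.
        return cls()


def resolve_tokenizer(args, tokenizer_class=StubTokenizer):
    # Control flow after this commit: a tokenizer must come from
    # --tokenizer_name or --model_name_or_path; instantiating a new
    # one from scratch now raises instead of merely warning.
    if args.tokenizer_name:
        return tokenizer_class.from_pretrained(args.tokenizer_name, cache_dir=args.cache_dir)
    elif args.model_name_or_path:
        return tokenizer_class.from_pretrained(args.model_name_or_path, cache_dir=args.cache_dir)
    else:
        raise ValueError(
            "You are instantiating a new {} tokenizer. This is not supported, but you can do it from "
            "another script, save it, and load it from here, using --tokenizer_name".format(
                tokenizer_class.__name__
            )
        )


# A model path alone is still enough to get a tokenizer.
args = SimpleNamespace(tokenizer_name=None, model_name_or_path="gpt2", cache_dir=None)
tok = resolve_tokenizer(args)
```

With neither `tokenizer_name` nor `model_name_or_path` set, `resolve_tokenizer` raises `ValueError`, matching the replaced `logger.warning` branch.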