chenpangpang / transformers / Commits / 3b9a1dc1

Unverified commit 3b9a1dc1, authored Feb 06, 2023 by Stas Bekman, committed by GitHub Feb 06, 2023.

[examples] improve block_size warning message (#21463)

Parent: 4435c7f5

Showing 5 changed files with 15 additions and 10 deletions:

examples/pytorch/language-modeling/run_clm.py (+3 -2)
examples/pytorch/language-modeling/run_clm_no_trainer.py (+3 -2)
examples/pytorch/language-modeling/run_mlm.py (+3 -2)
examples/pytorch/language-modeling/run_mlm_no_trainer.py (+3 -2)
examples/pytorch/multiple-choice/run_swag.py (+3 -2)
examples/pytorch/language-modeling/run_clm.py

@@ -459,8 +459,9 @@ def main():
         block_size = tokenizer.model_max_length
         if block_size > 1024:
             logger.warning(
-                f"The tokenizer picked seems to have a very large `model_max_length` ({tokenizer.model_max_length}). "
-                "Picking 1024 instead. You can change that default value by passing --block_size xxx."
+                "The chosen tokenizer supports a `model_max_length` that is longer than the default `block_size` value"
+                " of 1024. If you would like to use a longer `block_size` up to `tokenizer.model_max_length` you can"
+                " override this default with `--block_size xxx`."
             )
             block_size = 1024
     else:
examples/pytorch/language-modeling/run_clm_no_trainer.py

@@ -407,8 +407,9 @@ def main():
         block_size = tokenizer.model_max_length
         if block_size > 1024:
             logger.warning(
-                f"The tokenizer picked seems to have a very large `model_max_length` ({tokenizer.model_max_length}). "
-                "Picking 1024 instead. You can change that default value by passing --block_size xxx."
+                "The chosen tokenizer supports a `model_max_length` that is longer than the default `block_size` value"
+                " of 1024. If you would like to use a longer `block_size` up to `tokenizer.model_max_length` you can"
+                " override this default with `--block_size xxx`."
             )
             block_size = 1024
     else:
examples/pytorch/language-modeling/run_mlm.py

@@ -414,8 +414,9 @@ def main():
         max_seq_length = tokenizer.model_max_length
         if max_seq_length > 1024:
             logger.warning(
-                f"The tokenizer picked seems to have a very large `model_max_length` ({tokenizer.model_max_length}). "
-                "Picking 1024 instead. You can change that default value by passing --max_seq_length xxx."
+                "The chosen tokenizer supports a `model_max_length` that is longer than the default `block_size` value"
+                " of 1024. If you would like to use a longer `block_size` up to `tokenizer.model_max_length` you can"
+                " override this default with `--block_size xxx`."
             )
             max_seq_length = 1024
     else:
examples/pytorch/language-modeling/run_mlm_no_trainer.py

@@ -399,8 +399,9 @@ def main():
         max_seq_length = tokenizer.model_max_length
         if max_seq_length > 1024:
             logger.warning(
-                f"The tokenizer picked seems to have a very large `model_max_length` ({tokenizer.model_max_length}). "
-                "Picking 1024 instead. You can change that default value by passing --max_seq_length xxx."
+                "The chosen tokenizer supports a `model_max_length` that is longer than the default `block_size` value"
+                " of 1024. If you would like to use a longer `block_size` up to `tokenizer.model_max_length` you can"
+                " override this default with `--block_size xxx`."
            )
             max_seq_length = 1024
     else:
examples/pytorch/multiple-choice/run_swag.py

@@ -336,8 +336,9 @@ def main():
         max_seq_length = tokenizer.model_max_length
         if max_seq_length > 1024:
             logger.warning(
-                f"The tokenizer picked seems to have a very large `model_max_length` ({tokenizer.model_max_length}). "
-                "Picking 1024 instead. You can change that default value by passing --max_seq_length xxx."
+                "The chosen tokenizer supports a `model_max_length` that is longer than the default `block_size` value"
+                " of 1024. If you would like to use a longer `block_size` up to `tokenizer.model_max_length` you can"
+                " override this default with `--block_size xxx`."
             )
             max_seq_length = 1024
     else: