chenpangpang / transformers · Commits

Commit 24e67fbf, authored Mar 25, 2019 by Matthew Carrigan
Minor README update
Parent: 8d1d1ffd
Showing 1 changed file with 6 additions and 3 deletions.

examples/lm_finetuning/README.md (+6, -3)
@@ -58,9 +58,12 @@ recent GPUs. `--max_seq_len` defaults to 128 but can be set as high as 512.
 Higher values may yield stronger language models at the cost of slower and more memory-intensive training
 In addition, if memory usage is an issue, especially when training on a single GPU, reducing `--train_batch_size` from
-the default 32 to a lower number (4-16) can be helpful. There is also a `--reduce_memory` option for both the
-`pregenerate_training_data.py` and `finetune_on_pregenerated.py` scripts that spills data to disc in shelf objects
-or numpy memmaps rather than retaining it in memory, which hugely reduces memory usage with little performance impact.
+the default 32 to a lower number (4-16) can be helpful, or leaving `--train_batch_size` at the default and increasing
+`--gradient_accumulation_steps` to 2-8. Changing `--gradient_accumulation_steps` may be preferable as alterations to the
+batch size may require corresponding changes in the learning rate to compensate. There is also a `--reduce_memory`
+option for both the `pregenerate_training_data.py` and `finetune_on_pregenerated.py` scripts that spills data to disc
+in shelf objects or numpy memmaps rather than retaining it in memory, which hugely reduces memory usage with little
+performance impact.
 
 ###Examples
 
 #####Simple fine-tuning
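The edited passage recommends raising `--gradient_accumulation_steps` instead of shrinking the batch size, since accumulation keeps the effective batch (and hence the appropriate learning rate) unchanged while lowering peak memory. A minimal PyTorch sketch of the underlying idea follows; it is not the repository's actual training loop, and the linear model and random data are stand-ins. Summing gradients over several scaled micro-batch losses reproduces the gradient of one large-batch step:

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(4, 1)
data = torch.randn(8, 4)
target = torch.randn(8, 1)
loss_fn = torch.nn.MSELoss()

# Reference: one backward pass over the full batch of 8 examples.
model.zero_grad()
loss_fn(model(data), target).backward()
big_grad = model.weight.grad.clone()

# Accumulation: 4 micro-batches of 2, each loss scaled by 1/4 so the
# gradients summed into .grad match the large-batch gradient.
accumulation_steps = 4
model.zero_grad()
for chunk_x, chunk_y in zip(data.chunk(accumulation_steps),
                            target.chunk(accumulation_steps)):
    loss = loss_fn(model(chunk_x), chunk_y) / accumulation_steps
    loss.backward()  # gradients accumulate into .grad across calls

print(torch.allclose(big_grad, model.weight.grad, atol=1e-6))  # True
```

In a real loop, `optimizer.step()` and `zero_grad()` would run once per `accumulation_steps` micro-batches; only activations for one micro-batch are held in memory at a time, which is why the README suggests it as an alternative to reducing `--train_batch_size`.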