chenpangpang / transformers · Commits · 57eb1cb6

Unverified commit 57eb1cb6, authored Aug 03, 2020 by Sam Shleifer, committed via GitHub on Aug 03, 2020

[s2s] Document better mbart finetuning command (#6229)

* Document better MT command
* improve multigpu command

Parent: 0513f8d2
Showing 2 changed files with 5 additions and 7 deletions (+5, -7):

* examples/seq2seq/README.md (+4, -6)
* examples/seq2seq/train_mbart_cc25_enro.sh (+1, -1)
examples/seq2seq/README.md
...
@@ -113,22 +113,20 @@ Best performing command:
 ```bash
 # optionally
 export ENRO_DIR='wmt_en_ro' # Download instructions above
 # export WANDB_PROJECT="MT"  # optional
-export MAX_LEN=200
+export MAX_LEN=128
 export BS=4
 export GAS=8  # gradient accumulation steps
 ./train_mbart_cc25_enro.sh --output_dir enro_finetune_baseline --label_smoothing 0.1 --fp16_opt_level=O1 --logger_name wandb --sortish_sampler
 ```
-This should take < 6h/epoch on a 16GB v100 and achieve val_avg_BLEU score above 25 (you can see metrics in wandb or metrics.json).
-To get results in line with fairseq, you need to do some postprocessing.
+This should take < 6h/epoch on a 16GB v100 and achieve test BLEU above 26
+To get results in line with fairseq, you need to do some postprocessing (see `romanian_postprocessing.md`).
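The single-GPU recipe above accumulates gradients over GAS=8 steps with a per-device batch of BS=4, i.e. an effective batch of 32 sentence pairs per optimizer update. A quick sanity check of that arithmetic (illustration only, not part of the commit; the variable names follow the exports above):

```bash
# Effective batch size on one GPU = per-device batch size (BS) x gradient accumulation steps (GAS).
export BS=4 GAS=8
echo "effective batch size: $(( BS * GAS ))"   # prints 32
```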
 MultiGPU command
 (using 8 GPUS as an example)
 ```bash
 export ENRO_DIR='wmt_en_ro' # Download instructions above
 # export WANDB_PROJECT="MT"  # optional
-export MAX_LEN=200
+export MAX_LEN=128
 export BS=4
 export GAS=1  # gradient accumulation steps
 ./train_mbart_cc25_enro.sh --output_dir enro_finetune_baseline --gpus 8 --logger_name wandb
 ```
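Note that the multi-GPU command lowers GAS from 8 to 1: with `--gpus 8` every optimizer step already processes a batch on each device, so the effective batch size stays at 32 and the two recipes see the same amount of data per update. A hedged sketch of that bookkeeping (`NGPU` is an illustrative name, not a variable the scripts read):

```bash
# Effective batch size across GPUs = BS x GAS x number of GPUs.
export BS=4 GAS=1 NGPU=8
echo "effective batch size: $(( BS * GAS * NGPU ))"   # prints 32, matching the single-GPU setting
```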
### Finetuning Outputs
...
examples/seq2seq/train_mbart_cc25_enro.sh
...
@@ -10,7 +10,7 @@ python finetune.py \
     --num_train_epochs 6 --src_lang en_XX --tgt_lang ro_RO \
     --data_dir $ENRO_DIR \
     --max_source_length $MAX_LEN --max_target_length $MAX_LEN --val_max_target_length $MAX_LEN --test_max_target_length $MAX_LEN \
-    --train_batch_size=$BS --eval_batch_size=$BS --gradient_accumulation_steps=$GAS \
+    --train_batch_size=$BS --eval_batch_size=$BS \
     --task translation \
     --warmup_steps 500 \
     --freeze_embeds \
...
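Since this commit drops `--gradient_accumulation_steps=$GAS` from the wrapper, accumulation is no longer hard-coded in the script and would instead be chosen by the caller. A minimal sketch, assuming `train_mbart_cc25_enro.sh` forwards extra command-line flags through to `finetune.py` (e.g. via a trailing `"$@"`), which is also how `--output_dir` and the other flags in the README commands reach it:

```bash
# Illustration only: supply gradient accumulation explicitly now that the wrapper no longer sets it.
export ENRO_DIR='wmt_en_ro'
export MAX_LEN=128 BS=4 GAS=8
./train_mbart_cc25_enro.sh \
  --output_dir enro_finetune_baseline \
  --gradient_accumulation_steps=$GAS \
  --label_smoothing 0.1 --logger_name wandb
```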