Commit a71c9248
Authored Oct 29, 2019 by Allen Wang
Committed by A. Unique TensorFlower, Oct 29, 2019

Internal change

PiperOrigin-RevId: 277305436
Parent: e37e8049

Showing 1 changed file with 37 additions and 0 deletions.

official/transformer/v2/README.md (+37, -0)
@@ -131,6 +131,43 @@ tensorboard --logdir=$MODEL_DIR
- `--num_gpus=2+`: Uses `tf.distribute.MirroredStrategy` to run synchronous
  distributed training across the GPUs (see the sketch below).
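
For context, here is a minimal sketch of how `tf.distribute.MirroredStrategy` is typically used in TF 2.x. The small Keras model is an illustrative placeholder, not the Transformer code from this repository:

```python
import tensorflow as tf

# MirroredStrategy replicates variables on every visible GPU and
# aggregates gradients with a synchronous all-reduce each step.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Variables created inside the scope (weights, optimizer slots)
    # are mirrored across the GPUs.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )
```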
#### Using TPUs

Note: This model will **not** work with TPUs on Colab.

You can train the Transformer model on Cloud TPUs using
`tf.distribute.TPUStrategy`. If you are not familiar with Cloud TPUs, it is
strongly recommended that you go through the
[quickstart](https://cloud.google.com/tpu/docs/quickstart) to learn how to
create a TPU and GCE VM.
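
As a rough sketch (not the actual `transformer_main.py` code), connecting to a Cloud TPU and building a strategy in TF 2.x looks roughly like this; note that in TF releases of this era the class lived at `tf.distribute.experimental.TPUStrategy`:

```python
import tensorflow as tf

# "my-tpu" stands in for the TPU name shown in the Cloud Console
# (the value you would pass via --tpu=$TPU_NAME).
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="my-tpu")
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)

# tf.distribute.experimental.TPUStrategy in TF 2.0/2.1;
# later releases expose it as tf.distribute.TPUStrategy.
strategy = tf.distribute.experimental.TPUStrategy(resolver)

with strategy.scope():
    pass  # build the model here; variables are placed on the TPU cores
```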
To run the Transformer model on a TPU, you must set
`--distribution_strategy=tpu`, `--tpu=$TPU_NAME`, and `--use_ctl=True`, where
`$TPU_NAME` is the name of your TPU in the Cloud Console (`--use_ctl=True`
selects the custom-training-loop code path; a sketch follows the example
command below).

An example command to run Transformer on a v2-8 or v3-8 TPU would be:
```bash
python transformer_main.py \
  --tpu=$TPU_NAME \
  --model_dir=$MODEL_DIR \
  --data_dir=$DATA_DIR \
  --vocab_file=$DATA_DIR/vocab.ende.32768 \
  --bleu_source=$DATA_DIR/newstest2014.en \
  --bleu_ref=$DATA_DIR/newstest2014.de \
  --batch_size=6144 \
  --train_steps=2000 \
  --static_batch=true \
  --use_ctl=true \
  --param_set=big \
  --max_length=64 \
  --decode_batch_size=32 \
  --decode_max_length=97 \
  --padded_decode=true \
  --distribution_strategy=tpu
```
Note: `$MODEL_DIR` and `$DATA_DIR` must be GCS paths.
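
For orientation, a minimal sketch of the custom-training-loop pattern that `--use_ctl=True` refers to, run under an active distribution strategy. Here `model`, `optimizer`, and `loss_fn` are hypothetical placeholders, and the actual loop in `transformer_main.py` differs in the details:

```python
import tensorflow as tf

@tf.function
def train_step(iterator, strategy, model, optimizer, loss_fn):
    """One synchronous training step across all replicas."""
    def step_fn(inputs):
        features, labels = inputs
        with tf.GradientTape() as tape:
            logits = model(features, training=True)
            loss = loss_fn(labels, logits)
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        return loss

    # strategy.experimental_run_v2 in TF 2.0/2.1; renamed strategy.run later.
    per_replica_loss = strategy.experimental_run_v2(
        step_fn, args=(next(iterator),))
    return strategy.reduce(
        tf.distribute.ReduceOp.MEAN, per_replica_loss, axis=None)
```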
#### Customizing training schedule
By default, the model will train for 10 epochs, and evaluate after every epoch. The training schedule may be defined through the flags:
...