
Here are the steps performed to generated the trained GNMT TF model.
Please run the code with python3 since the vocab file is different in python2.

Step 1:
    Download the GNMT training code from the TF GNMT repo.
    $ Git clone https://github.com/tensorflow/nmt.git
    (commit b278487980832417ad8ac701c672b5c3dc7fa553)

Step 2:
    Change beam size to 5 in nmt/nmt/standard_hparams/wmt16_gnmt_4_layer.json
    to match the MLPerf GNMT training code

Step 3:
    Download and verify data from the training code:
    https://github.com/mlperf/training/tree/master/rnn_translator
    (commit 7788f8d7c13dbacaf4eb33e3fecddef6d8220257)

Step 4:
    duplicate the vocab data in data/ directory
    $ cp vocab.bpe.32000 vocab.bpe.32000.en
    $ cp vocab.bpe.32000 vocab.bpe.32000.de

Step 5:
    Modify the training command to the following to match
    the data setup from MLPerf training
    $ python -m nmt.nmt \
    --src=en --tgt=de \
    --hparams_path=nmt/standard_hparams/wmt16_gnmt_4_layer.json \
    --out_dir=ende_gnmt \
    --vocab_prefix=/path/to/training/rnn_translator/data/vocab.bpe.32000 \
    --train_prefix=/path/to/GitHub/training/rnn_translator/data/train.tok.clean.bpe.32000 \
    --dev_prefix=/path/to/GitHub/training/rnn_translator/data/newstest_dev.tok.clean \
    --test_prefix=/path/to/GitHub/training/rnn_translator/data/newstest2014.tok.bpe.32000 \

Step 6:
    The training script will show the best checkpoint iteration at the command
    line. Rename the best model files with examples below:
    $ mv translate.ckpt-254000.data-00000-of-00001 translate.ckpt.data-00000-of-00001
    $ mv translate.ckpt-254000.index translate.ckpt.index
    $ mv translate.ckpt-254000.meta translate.ckpt.meta

