In this example we implement the [Transformer](https://arxiv.org/pdf/1706.03762.pdf) and [Universal Transformer](https://arxiv.org/abs/1807.03819) with ACT in DGL.
In this example we implement the [Transformer](https://arxiv.org/pdf/1706.03762.pdf) with ACT in DGL.
The folder contains a training module and an inference module (beam decoder) for the Transformer, and a training module for the Universal Transformer.
The folder contains training module and inferencing module (beam decoder) for Transformer.
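For reference, ACT (dynamic halting) lets each position compute a halting probability at every recurrent step and stop being updated once its accumulated probability crosses a threshold. Below is a minimal PyTorch-style sketch of this idea; the class, parameter names, and defaults are illustrative and are not the modules used in this example.
```
import torch
import torch.nn as nn

class ACTHalting(nn.Module):
    """Illustrative ACT-style dynamic halting for a shared (recurrent) layer."""

    def __init__(self, dim, max_steps=8, threshold=0.99):
        super().__init__()
        self.halt = nn.Linear(dim, 1)  # per-position halting unit
        self.max_steps = max_steps
        self.threshold = threshold

    def forward(self, state, step_fn):
        # state: (batch, length, dim); step_fn applies the shared layer once.
        batch, length, _ = state.shape
        halt_prob = state.new_zeros(batch, length)   # accumulated halting prob.
        remainders = state.new_zeros(batch, length)
        out = torch.zeros_like(state)
        for _ in range(self.max_steps):
            p = torch.sigmoid(self.halt(state)).squeeze(-1)
            still_running = (halt_prob < self.threshold).float()
            # Positions that would cross the threshold halt now and
            # contribute their remainder instead of p.
            new_halted = ((halt_prob + p * still_running) > self.threshold).float() * still_running
            still_running = still_running - new_halted
            halt_prob = halt_prob + p * still_running
            remainders = remainders + new_halted * (1.0 - halt_prob)
            halt_prob = halt_prob + new_halted * remainders
            update = (p * still_running + new_halted * remainders).unsqueeze(-1)
            state = step_fn(state)
            out = update * state + (1.0 - update) * out
            if still_running.sum() == 0:
                break
        return out
```
The shared layer (`step_fn`) is applied repeatedly until every position has halted or `max_steps` is reached, which is what distinguishes the Universal Transformer's recurrent-in-depth computation from a fixed stack of layers.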
## Dependencies
...
...
@@ -18,6 +18,8 @@ The folder contains training module and inferencing module (beam decoder) for Tr
By specifying multiple GPU ids separated by commas, we employ multi-GPU training with multiprocessing.
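As a rough sketch of what multiprocessing-based multi-GPU training looks like (the worker and launcher below are hypothetical and are not the training script in this folder):
```
import torch
import torch.multiprocessing as mp

def _train_worker(rank, gpu_ids):
    """Hypothetical worker: each spawned process drives one GPU."""
    device = torch.device("cuda", gpu_ids[rank])
    # A real worker would build the model and optimizer on `device`,
    # take its shard of the training data, and run the training loop.
    print(f"process {rank} training on {device}")

def launch(gpu_id_str):
    # e.g. gpu_id_str = "0,1,2" as passed on the command line.
    gpu_ids = [int(i) for i in gpu_id_str.split(",")]
    mp.spawn(_train_worker, args=(gpu_ids,), nprocs=len(gpu_ids), join=True)

if __name__ == "__main__":
    launch("0,1")
```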
- For evaluating the BLEU score on the test set (enable `--print` to see the translated text):
```
...
...
@@ -28,19 +30,9 @@ Available datasets: `copy`, `sort`, `wmt14`, `multi30k`(default).
## Test Results
### Transformer
- Multi30k: we achieve a BLEU score of 35.41 with the default setting on the Multi30k dataset, without using pre-trained embeddings (if the number of layers is set to 2, the BLEU score can reach 36.45).
- WMT14: work in progress
### Universal Transformer
- work in progress
## Notes
- Currently we do not support multi-GPU training (this will be fixed soon); specify only one gpu\_id when running the training script.