Unverified Commit 88c34487 authored by Mufei Li, committed by GitHub

[Example] Update README of transformers for multi-gpu support (#1435)

* Update README.md

* Update README.md
parent e317f715
# Transformer in DGL

In this example we implement the [Transformer](https://arxiv.org/pdf/1706.03762.pdf) with ACT in DGL.
The folder contains the training module and the inferencing module (beam decoder) for the Transformer.

## Dependencies

...
```
python3 translation_train.py [--gpus id1,id2,...] [--N #layers] [--dataset DATASET] [--batch BATCHSIZE] [--universal]
```
By specifying multiple GPU ids separated by commas, we employ multi-GPU training with multiprocessing.
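Concretely, a multi-GPU run could look like `python3 translation_train.py --gpus 0,1 --dataset multi30k --batch 128` (the GPU ids and batch size here are only illustrative). The usual pattern behind multi-GPU training with multiprocessing is one worker process per GPU with gradients averaged across processes; the sketch below illustrates that pattern with `torch.multiprocessing` and `torch.distributed`, and is an assumption made for illustration, not the actual implementation in `translation_train.py`:

```python
# Minimal sketch of multiprocessing-based multi-GPU training: one process per
# GPU id, gradients averaged with all_reduce. Illustrative only -- not the
# actual code of translation_train.py.
import torch
import torch.distributed as dist
import torch.multiprocessing as mp


def run(rank, gpus):
    dist.init_process_group(
        backend="nccl",
        init_method="tcp://127.0.0.1:12345",
        world_size=len(gpus),
        rank=rank,
    )
    device = torch.device("cuda", gpus[rank])
    model = torch.nn.Linear(16, 16).to(device)   # stand-in for the Transformer
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(10):                          # stand-in for the data loader
        x = torch.randn(32, 16, device=device)
        loss = model(x).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()
        for p in model.parameters():             # average gradients across GPUs
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad /= len(gpus)
        optimizer.step()
    dist.destroy_process_group()


if __name__ == "__main__":
    gpus = [0, 1]                                # e.g. parsed from "--gpus 0,1"
    mp.spawn(run, args=(gpus,), nprocs=len(gpus))
```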
- For evaluating the BLEU score on the test set (enable `--print` to see the translated text):
```
...
```

Available datasets: `copy`, `sort`, `wmt14`, `multi30k` (default).
## Test Results

- Multi30k: we achieve a BLEU score of 35.41 with the default setting on the Multi30k dataset, without using pre-trained embeddings (if we set the number of layers to 2, the BLEU score reaches 36.45).
- WMT14: work in progress.
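The scores above are corpus-level BLEU. As a rough illustration of how such a number can be computed, the snippet below uses NLTK's `corpus_bleu`; NLTK is an assumption made for illustration here and is not necessarily what this example's evaluation script uses.

```python
# Toy corpus-level BLEU computation with NLTK (illustrative assumption; the
# evaluation script in this example may use a different BLEU implementation).
from nltk.translate.bleu_score import corpus_bleu

# One entry per test sentence: a list of tokenized reference translations,
# and the corresponding tokenized model output.
references = [
    [["a", "man", "is", "walking", "down", "the", "street"]],
]
hypotheses = [
    ["a", "man", "is", "walking", "down", "a", "street"],
]

# corpus_bleu pools n-gram counts over the whole corpus before combining them;
# multiply by 100 to match the convention of scores such as 35.41.
print(100 * corpus_bleu(references, hypotheses))
```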
## Reference

- [The Annotated Transformer](http://nlp.seas.harvard.edu/2018/04/03/attention.html)
...