This directory contains an example based on Zihang Dai et al.'s open-source
Transformer-XL [implementation](https://github.com/kimiyoung/transformer-xl) to
demonstrate the usage of FastMoE's layers.
The code is released under the Apache-2.0 license. Only the PyTorch part of the
code is used here, with modifications to the `mem_transformer.py` file to enable
MoE training.
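
As a rough illustration of the kind of change involved, the sketch below swaps
Transformer-XL's dense position-wise feed-forward block for FastMoE's
`FMoETransformerMLP`. This is a minimal, hypothetical example, not the exact
modification made in `mem_transformer.py`; the class name `MoEPositionwiseFF`,
the `num_expert` default, and the layer-norm placement are illustrative
assumptions.

```python
import torch.nn as nn
from fmoe.transformer import FMoETransformerMLP  # FastMoE's MoE feed-forward layer


class MoEPositionwiseFF(nn.Module):
    """Sketch of a drop-in replacement for Transformer-XL's position-wise FFN
    that routes tokens through a mixture of expert MLPs instead of one dense MLP."""

    def __init__(self, d_model, d_inner, dropout, num_expert=4, pre_lnorm=False):
        super().__init__()
        # num_expert, d_model, and d_hidden mirror the dense FFN's dimensions;
        # the defaults here are illustrative, not taken from the example scripts.
        self.moe = FMoETransformerMLP(
            num_expert=num_expert,
            d_model=d_model,
            d_hidden=d_inner,
        )
        self.layer_norm = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)
        self.pre_lnorm = pre_lnorm

    def forward(self, inp):
        # inp: (seq_len, batch, d_model), as in Transformer-XL
        if self.pre_lnorm:
            out = self.moe(self.layer_norm(inp))
            return inp + self.dropout(out)
        out = self.moe(inp)
        return self.layer_norm(inp + self.dropout(out))
```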
## Introduction
This directory contains our PyTorch implementation of Transformer-XL. Note that our state-of-the-art results reported in the paper were obtained by training the model on a large-scale TPU cluster, and our PyTorch codebase currently does not support distributed training. Here we provide two sets of hyperparameters and scripts: