@@ -10,9 +10,9 @@ The trained model can then be used by the generate script to generate new text.
`main_fp16_optimizer.py` with `--fp16` demonstrates use of `apex.fp16_utils.FP16_Optimizer` to automatically manage master parameters and loss scaling.
```bash
-python main.py --cuda --epochs 6        # Train a LSTM on Wikitext-2 with CUDA, reaching perplexity of 117.61
+python main.py --cuda --epochs 6        # Train a LSTM on Wikitext-2 with CUDA
-python main.py --cuda --epochs 6 --tied # Train a tied LSTM on Wikitext-2 with CUDA, reaching perplexity of 110.44
+python main.py --cuda --epochs 6 --tied # Train a tied LSTM on Wikitext-2 with CUDA
-python main.py --cuda --tied            # Train a tied LSTM on Wikitext-2 with CUDA for 40 epochs, reaching perplexity of 87.17
+python main.py --cuda --tied            # Train a tied LSTM on Wikitext-2 with CUDA for 40 epochs
python generate.py # Generate samples from the trained LSTM model.
```
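For readers unfamiliar with the wrapper, the sketch below shows the general `FP16_Optimizer` pattern that `main_fp16_optimizer.py` relies on: the model runs in fp16 while the wrapper maintains fp32 master parameters and applies loss scaling. This is a minimal illustration, not the example's actual training loop; the toy linear model, the batch shapes, and the `dynamic_loss_scale=True` choice are illustrative assumptions, and it presumes a CUDA device with apex installed.

```python
import torch
import torch.nn as nn
from apex.fp16_utils import FP16_Optimizer

# Toy stand-in for the example's RNN language model (illustrative only).
model = nn.Linear(128, 10).cuda().half()      # parameters and activations in fp16
criterion = nn.CrossEntropyLoss()

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
optimizer = FP16_Optimizer(optimizer, dynamic_loss_scale=True)  # keeps fp32 master weights, scales the loss

data = torch.randn(32, 128, device="cuda").half()
target = torch.randint(0, 10, (32,), device="cuda")

optimizer.zero_grad()
loss = criterion(model(data), target)
optimizer.backward(loss)   # used in place of loss.backward() so the loss can be scaled
optimizer.step()           # updates the fp32 master weights and copies them back to the fp16 model
```

The fp32 master copy exists so that updates too small to be represented in fp16 are not lost to rounding; `FP16_Optimizer` owns that copy, which is why the loop calls `optimizer.backward(loss)` instead of `loss.backward()`.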
...
@@ -71,12 +71,8 @@ With these arguments, a variety of models can be tested.
As an example, the following arguments produce slower but better models:
```bash
python main.py --cuda --emsize 650 --nhid 650 --dropout 0.5 --epochs 40  # Test perplexity of 80.97
```
These perplexities are equal to or better than [Recurrent Neural Network Regularization (Zaremba et al. 2014)](https://arxiv.org/pdf/1409.2329.pdf) and are similar to [Using the Output Embedding to Improve Language Models (Press & Wolf 2016)](https://arxiv.org/abs/1608.05859) and [Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling (Inan et al. 2016)](https://arxiv.org/pdf/1611.01462.pdf), though both of these papers have improved perplexities by using a form of recurrent dropout [(variational dropout)](http://papers.nips.cc/paper/6241-a-theoretically-grounded-application-of-dropout-in-recurrent-neural-networks).
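The `--tied` runs above use exactly this weight-tying idea: the input embedding and the output (softmax) projection share a single weight matrix, which is only possible when the embedding and hidden sizes match (as in the `--emsize 650 --nhid 650` configuration). The sketch below is a hypothetical, stripped-down version of that wiring, not the example's actual model class; the class name and default sizes are illustrative.

```python
import torch.nn as nn

class TiedRNNModel(nn.Module):
    """Hypothetical word-level LM illustrating tied input/output embeddings."""

    def __init__(self, ntoken, emsize=650, nhid=650, nlayers=2, dropout=0.5):
        super().__init__()
        assert emsize == nhid, "weight tying needs matching embedding and hidden sizes"
        self.drop = nn.Dropout(dropout)
        self.encoder = nn.Embedding(ntoken, emsize)   # input embedding, shape (ntoken, emsize)
        self.rnn = nn.LSTM(emsize, nhid, nlayers, dropout=dropout)
        self.decoder = nn.Linear(nhid, ntoken)        # output projection over the vocabulary
        self.decoder.weight = self.encoder.weight     # the tying: one shared weight matrix

    def forward(self, tokens, hidden=None):
        emb = self.drop(self.encoder(tokens))
        output, hidden = self.rnn(emb, hidden)
        return self.decoder(self.drop(output)), hidden
```

Tying roughly halves the vocabulary-related parameter count and, as the papers cited above report, it tends to reduce perplexity rather than hurt it.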