XLNet PLM Readme (#6121)

Commit 641b873c authored Jul 29, 2020 by Lysandre Debut, committed by GitHub Jul 29, 2020

Showing with 24 additions and 0 deletions

examples/language-modeling/README.md +24 -0

--- a/examples/language-modeling/README.md
+++ b/examples/language-modeling/README.md
@@ -60,3 +60,27 @@ python run_language_modeling.py \
    --mlm
 ```

+### XLNet and permutation language modeling
+
+XLNet uses a different training objective: permutation language modeling. It is an autoregressive method
+that learns bidirectional context by maximizing the expected likelihood over all permutations of the
+factorization order of the input sequence.
+
+The `--plm_probability` flag sets the ratio of the length of a span of masked tokens to the length of its
+surrounding context for permutation language modeling.
+
+The `--max_span_length` flag may also be used to limit the length of a span of masked tokens used 
+for permutation language modeling.
+
+```bash
+export TRAIN_FILE=/path/to/dataset/wiki.train.raw
+export TEST_FILE=/path/to/dataset/wiki.test.raw
+
+python run_language_modeling.py \
+    --output_dir=output \
+    --model_name_or_path=xlnet-base-cased \
+    --do_train \
+    --train_data_file=$TRAIN_FILE \
+    --do_eval \
+    --eval_data_file=$TEST_FILE
+```
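
As a note for readers of this diff, the interaction between `--plm_probability` and `--max_span_length` can be illustrated with a minimal Python sketch. It is not the script's actual implementation: the function name `sample_masked_spans`, the `seed` parameter, and the default values are assumptions chosen for illustration. The idea it shows is that each masked span is placed inside a context window whose length is the span length divided by `plm_probability`.

```python
import random


def sample_masked_spans(seq_len, plm_probability=1 / 6, max_span_length=5, seed=0):
    """Return a 0/1 list marking tokens selected as prediction targets.

    Hypothetical helper for illustration only; defaults are assumed values.
    """
    if not 0 < plm_probability <= 1:
        raise ValueError("plm_probability must be in (0, 1]")
    rng = random.Random(seed)
    mask = [0] * seq_len
    cur = 0
    while cur < seq_len:
        # Pick a span length between 1 and max_span_length (inclusive).
        span_length = rng.randint(1, max_span_length)
        # Size the context window so that span_length / context_length
        # is approximately plm_probability.
        context_length = int(span_length / plm_probability)
        # Place the span at a random offset inside its context window.
        start = cur + rng.randint(0, context_length - span_length)
        for i in range(start, min(start + span_length, seq_len)):
            mask[i] = 1
        # Move on to the next, non-overlapping context window.
        cur += context_length
    return mask


if __name__ == "__main__":
    example = sample_masked_spans(seq_len=64)
    print("".join(str(bit) for bit in example))
```

Under these assumptions, a smaller `plm_probability` spreads the masked spans over longer context windows, while `max_span_length` bounds how many consecutive tokens a single span can cover.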