    Added data collator for permutation (XLNet) language modeling and related calls (#5522) · 3dcb748e
    Shashank Gupta authored
    * Added data collator for XLNet language modeling and related calls
    
    Added DataCollatorForXLNetLanguageModeling in data/data_collator.py
    to generate necessary inputs for language modeling training with
    XLNetLMHeadModel. Also added related arguments, logic and calls in
    examples/language-modeling/run_language_modeling.py.
    
    Resolves: #4739, #2008 (partially)
    
    * Changed name to `DataCollatorForPermutationLanguageModeling`
    
    Changed the name of `DataCollatorForXLNetLanguageModeling` to the more general `DataCollatorForPermutationLanguageModeling`.
    Removed the `--mlm` flag requirement for the new collator and defined a separate `--plm_probability` flag for its use.
    CTRL uses a CLM loss just like GPT and GPT-2, so it should work out of the box with this script (provided `past` is handled
    similarly to `mems` for XLNet).
    Changed calls and imports appropriately.
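    
    A minimal sketch of the flag logic described above, assuming a plain `argparse` parser rather than the script's real argument handling. The helper names `build_parser` and `choose_objective` and the default values are illustrative assumptions, not taken from `run_language_modeling.py`; only the `--mlm` and `--plm_probability` flags come from this commit message.

```python
import argparse


def build_parser():
    """Hypothetical parser; the flag defaults below are illustrative assumptions."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_type", default="gpt2")
    parser.add_argument("--mlm", action="store_true")
    parser.add_argument("--mlm_probability", type=float, default=0.15)
    parser.add_argument("--plm_probability", type=float, default=1 / 6)
    return parser


def choose_objective(args):
    """Return (objective, probability) for the parsed arguments."""
    # Permutation LM is selected by model type alone, so --mlm is not required.
    if args.model_type == "xlnet":
        return "plm", args.plm_probability
    # Masked LM still has to be requested explicitly with --mlm.
    if args.mlm:
        return "mlm", args.mlm_probability
    # Everything else (GPT, GPT-2, CTRL, ...) falls back to causal LM.
    return "clm", None
```

    With these assumptions, `--model_type xlnet --plm_probability 0.2` selects the permutation objective without `--mlm`, while `--model_type ctrl` falls through to the causal objective.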
    
    * Added detailed comments, changed variable names
    
    Added more detailed comments to `DataCollatorForPermutationLanguageModeling` in `data/data_collator.py` to explain how it works. Also cleaned up variable names to make them more informative.
    
    * Added tests for new data collator
    
    Added tests in `tests/test_trainer.py` for `DataCollatorForPermutationLanguageModeling`, based on those for `DataCollatorForLanguageModeling`. A specific test has been added to check for odd-length sequences.
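    
    A rough sketch of what an odd-length test might look like. It assumes the collator rejects odd-length sequences by raising `ValueError`, which this commit message does not state. `validate_batch_lengths` is a hypothetical stand-in for the collator's length check, so the example runs without the library installed.

```python
import unittest


def validate_batch_lengths(batch):
    """Hypothetical stand-in for the length check inside the collator."""
    seq_len = len(batch[0])
    if any(len(seq) != seq_len for seq in batch):
        raise ValueError("all sequences in the batch must have the same length")
    # Assumption: odd sequence lengths are rejected outright.
    if seq_len % 2 != 0:
        raise ValueError("sequence length must be even")
    return seq_len


class OddLengthTest(unittest.TestCase):
    def test_even_length_is_accepted(self):
        self.assertEqual(validate_batch_lengths([[0] * 10, [1] * 10]), 10)

    def test_odd_length_is_rejected(self):
        with self.assertRaises(ValueError):
            validate_batch_lengths([[0] * 9, [1] * 9])
```

    Saved as a module, the sketch can be run with `python -m unittest`. A test against the real collator would build the batch from tokenized examples instead of plain lists.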
    
    * Fixed styling issues
test_trainer.py 8.17 KB