  1. 01 Jun, 2020 2 commits
  2. 29 May, 2020 2 commits
  3. 19 May, 2020 1 commit
    • Fix nn.DataParallel compatibility in PyTorch 1.5 (#4300) · 4c068936
      Julien Chaumond authored
      * Test case for #3936
      
      * multigpu tests pass on pytorch 1.4.0
      
      * Fixup
      
      * multigpu tests pass on pytorch 1.5.0
      
      * Update src/transformers/modeling_utils.py
      
      * Update src/transformers/modeling_utils.py
      
      * rename multigpu to require_multigpu
      
      * mode doc
  4. 30 Apr, 2020 1 commit
  5. 29 Apr, 2020 1 commit
    • CDN urls (#4030) · 455c6390
      Julien Chaumond authored
      * [file_utils] use_cdn + documentation
      
      * Move to cdn. urls for weights
      
      * [urls] Hotfix for bert-base-japanese
  6. 28 Apr, 2020 1 commit
    • Clean Encoder-Decoder models with Bart/T5-like API and add generate possibility (#3383) · fa49b9af
      Patrick von Platen authored
      * change encoder decoder style to bart & t5 style
      
      * make encoder decoder generation dummy work for bert
      
      * make style
      
      * clean init config in encoder decoder
      
      * add tests for encoder decoder models
      
      * refactor and add last tests
      
      * refactor and add last tests
      
      * fix attn masks for bert encoder decoder
      
      * make style
      
      * refactor prepare inputs for Bert
      
      * refactor
      
      * finish encoder decoder
      
      * correct typo
      
      * add docstring to config
      
      * finish
      
      * add tests
      
      * better naming
      
      * make style
      
      * fix flake8
      
      * clean docstring
      
      * make style
      
      * rename
  7. 23 Apr, 2020 1 commit
  8. 21 Apr, 2020 1 commit
  9. 17 Apr, 2020 1 commit
  10. 16 Apr, 2020 2 commits
  11. 03 Apr, 2020 1 commit
    • ELECTRA (#3257) · d5d7d886
      Lysandre Debut authored
      * Electra wip
      
      * helpers
      
      * Electra wip
      
      * Electra v1
      
      * ELECTRA may be saved/loaded
      
      * Generator & Discriminator
      
      * Embedding size instead of halving the hidden size
      
      * ELECTRA Tokenizer
      
      * Revert BERT helpers
      
      * ELECTRA Conversion script
      
      * Archive maps
      
      * PyTorch tests
      
      * Start fixing tests
      
      * Tests pass
      
      * Same configuration for both models
      
      * Compatible with base + large
      
      * Simplification + weight tying
      
      * Archives
      
      * Auto + Renaming to standard names
      
      * ELECTRA is uncased
      
      * Tests
      
      * Slight API changes
      
      * Update tests
      
      * wip
      
      * ElectraForTokenClassification
      
      * temp
      
      * Simpler arch + tests
      
      Removed ElectraForPreTraining which will be in a script
      
      * Conversion script
      
      * Auto model
      
      * Update links to S3
      
      * Split ElectraForPreTraining and ElectraForTokenClassification
      
      * Actually test PreTraining model
      
      * Remove num_labels from configuration
      
      * wip
      
      * wip
      
      * From discriminator and generator to electra
      
      * Slight API changes
      
      * Better naming
      
      * TensorFlow ELECTRA tests
      
      * Accurate conversion script
      
      * Added to conversion script
      
      * Fast ELECTRA tokenizer
      
      * Style
      
      * Add ELECTRA to README
      
      * Modeling Pytorch Doc + Real style
      
      * TF Docs
      
      * Docs
      
      * Correct links
      
      * Correct model initialized
      
      * random fixes
      
      * style
      
      * Addressing Patrick's and Sam's comments
      
      * Correct links in docs
  12. 01 Apr, 2020 1 commit
  13. 25 Feb, 2020 2 commits
    • Documentation (#2989) · bb7c4685
      Lysandre Debut authored
      * All Tokenizers
      
      BertTokenizer + few fixes
      RobertaTokenizer
      OpenAIGPTTokenizer + Fixes
      GPT2Tokenizer + fixes
      TransfoXLTokenizer
      Correct rst for TransformerXL
      XLMTokenizer + fixes
      XLNet Tokenizer + Style
      DistilBERT + Fix XLNet RST
      CTRLTokenizer
      CamemBERT Tokenizer
      FlaubertTokenizer
      XLMRobertaTokenizer
      cleanup
      
      * cleanup
    • Change masking to direct labeling for TPU support. (#2982) · e8ce63ff
      srush authored
      * change masking to direct labelings
      
      * fix black
      
      * switch to ignore index
      
      * .
      
      * fix black
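
The switch from masking to direct labeling can be illustrated with a small sketch. It is not taken from the commit itself: the tensor shapes, the `keep` mask, and the `IGNORE_INDEX` name are assumptions made for illustration, and -100 is used only because it is the default `ignore_index` of PyTorch's `nn.CrossEntropyLoss`. Boolean indexing produces tensors whose size depends on the data, which is awkward on devices that prefer static shapes, whereas writing an ignore value into the labels keeps every shape fixed.

```python
import torch
from torch import nn

torch.manual_seed(0)

IGNORE_INDEX = -100  # assumption: PyTorch's default ignore_index for CrossEntropyLoss
VOCAB_SIZE = 11      # hypothetical vocabulary size for this sketch

logits = torch.randn(2, 5, VOCAB_SIZE)         # (batch, seq_len, vocab)
labels = torch.randint(0, VOCAB_SIZE, (2, 5))  # (batch, seq_len)
# Hypothetical mask of the positions that should contribute to the loss.
keep = torch.tensor(
    [[True, False, True, False, False],
     [False, True, False, False, True]]
)

# Masking: boolean indexing selects positions, so the result shape depends on the data.
loss_masked = nn.CrossEntropyLoss()(logits[keep], labels[keep])

# Direct labeling: overwrite ignored positions in the labels and keep shapes static.
direct_labels = labels.masked_fill(~keep, IGNORE_INDEX)
loss_direct = nn.CrossEntropyLoss(ignore_index=IGNORE_INDEX)(
    logits.view(-1, VOCAB_SIZE), direct_labels.view(-1)
)
```

Both losses average over the same four positions, so they should agree up to floating-point tolerance.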
  14. 21 Feb, 2020 1 commit
  15. 13 Feb, 2020 1 commit
  16. 11 Feb, 2020 1 commit
    • BERT decoder: Fix causal mask dtype. · ee5de0ba
      Oleksiy Syvokon authored
      PyTorch < 1.3 requires multiplication operands to be of the same type.
      This was violated when using the default attention mask (i.e.,
      attention_mask=None in arguments) with BERT in decoder mode.
      
      In particular, this broke Model2Model and caused the tutorial
      from the quickstart to fail.
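
A minimal sketch of this kind of fix, assuming a 2-D attention mask of shape (batch, seq_len). The function name `build_causal_mask` is hypothetical and the code is not copied from the commit. The point is the explicit cast: the comparison yields a boolean tensor, and converting it to the attention mask's dtype before multiplying keeps both operands the same type.

```python
import torch


def build_causal_mask(attention_mask: torch.Tensor) -> torch.Tensor:
    """Combine a padding mask with a lower-triangular (causal) mask."""
    batch_size, seq_length = attention_mask.shape
    seq_ids = torch.arange(seq_length, device=attention_mask.device)
    # causal[b, i, j] is True when position j is not after position i.
    causal = seq_ids[None, None, :].repeat(batch_size, seq_length, 1) <= seq_ids[None, :, None]
    # Cast the boolean mask to the attention mask's dtype before multiplying,
    # so both operands have the same type.
    causal = causal.to(attention_mask.dtype)
    # Broadcast to (batch, 1, seq_len, seq_len).
    return causal[:, None, :, :] * attention_mask[:, None, None, :]


# Stand-in for the default mask used when attention_mask=None: all ones.
mask = build_causal_mask(torch.ones(2, 4))
```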
  17. 07 Feb, 2020 1 commit
  18. 04 Feb, 2020 1 commit
  19. 03 Feb, 2020 1 commit
    • [Follow up 213] · 239dd23f
      Lysandre authored
      Masked indices should have -1 and not -100. Updating documentation + scripts that were forgotten
  20. 28 Jan, 2020 1 commit
  21. 23 Jan, 2020 7 commits
  22. 15 Jan, 2020 1 commit
  23. 14 Jan, 2020 1 commit
    • Bias should be resized with the weights · 100e3b6f
      Lysandre authored
      Created a link between the linear layer bias and the model attribute bias. This does not change anything for the user nor for the conversion scripts, but allows the `resize_token_embeddings` method to resize the bias as well as the weights of the decoder.
      
      Added a test.
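
The link between the linear layer bias and the module-level bias can be sketched as follows. This is an illustrative toy, not the library's code: `TinyLMHead` and its `resize` method are hypothetical names, and the real `resize_token_embeddings` does more than this. The idea shown is that the decoder is built without its own bias, a separate parameter is created, and the two are pointed at the same object so that one resize step updates both.

```python
import torch
from torch import nn


class TinyLMHead(nn.Module):
    """Toy output head whose decoder bias is shared with a module attribute."""

    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.decoder = nn.Linear(hidden_size, vocab_size, bias=False)
        self.bias = nn.Parameter(torch.zeros(vocab_size))
        # Link the two: the linear layer now uses the module-level bias.
        self.decoder.bias = self.bias

    def resize(self, new_vocab_size: int) -> None:
        """Resize weights and bias together, copying the overlapping rows."""
        old = self.decoder
        new = nn.Linear(old.in_features, new_vocab_size, bias=False)
        new_bias = nn.Parameter(torch.zeros(new_vocab_size))
        n = min(old.out_features, new_vocab_size)
        with torch.no_grad():
            new.weight[:n] = old.weight[:n]
            new_bias[:n] = self.bias[:n]
        self.decoder = new
        self.bias = new_bias
        self.decoder.bias = self.bias  # re-establish the link after resizing

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return self.decoder(hidden_states)


head = TinyLMHead(hidden_size=8, vocab_size=10)
head.resize(12)
out = head(torch.zeros(2, 8))
```

Because the bias is a single shared parameter, the output dimension and the bias length cannot drift apart after a resize.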
  24. 07 Jan, 2020 2 commits
  25. 06 Jan, 2020 3 commits
  26. 22 Dec, 2019 2 commits