1. 26 Aug, 2020 1 commit
  2. 26 Jun, 2020 1 commit
  3. 02 Jun, 2020 1 commit
    • Kill model archive maps (#4636) · d4c2cb40
      Julien Chaumond authored
      * Kill model archive maps
      
      * Fixup
      
      * Also kill model_archive_map for MaskedBertPreTrainedModel
      
      * Unhook config_archive_map
      
      * Tokenizers: align with model id changes
      
      * make style && make quality
      
      * Fix CI
  4. 01 Jun, 2020 3 commits
  5. 29 May, 2020 2 commits
  6. 19 May, 2020 1 commit
    • Fix nn.DataParallel compatibility in PyTorch 1.5 (#4300) · 4c068936
      Julien Chaumond authored
      * Test case for #3936
      
      * multigpu tests pass on pytorch 1.4.0
      
      * Fixup
      
      * multigpu tests pass on pytorch 1.5.0
      
      * Update src/transformers/modeling_utils.py
      
      * Update src/transformers/modeling_utils.py
      
      * rename multigpu to require_multigpu
      
      * mode doc
  7. 30 Apr, 2020 1 commit
  8. 29 Apr, 2020 1 commit
    • CDN urls (#4030) · 455c6390
      Julien Chaumond authored
      * [file_utils] use_cdn + documentation
      
      * Move to cdn. urls for weights
      
      * [urls] Hotfix for bert-base-japanese
  9. 28 Apr, 2020 1 commit
    • Clean Encoder-Decoder models with Bart/T5-like API and add generate possibility (#3383) · fa49b9af
      Patrick von Platen authored
      * change encoder decoder style to bart & t5 style
      
      * make encoder decoder generation dummy work for bert
      
      * make style
      
      * clean init config in encoder decoder
      
      * add tests for encoder decoder models
      
      * refactor and add last tests
      
      * refactor and add last tests
      
      * fix attn masks for bert encoder decoder
      
      * make style
      
      * refactor prepare inputs for Bert
      
      * refactor
      
      * finish encoder decoder
      
      * correct typo
      
      * add docstring to config
      
      * finish
      
      * add tests
      
      * better naming
      
      * make style
      
      * fix flake8
      
      * clean docstring
      
      * make style
      
      * rename
  10. 23 Apr, 2020 1 commit
  11. 21 Apr, 2020 1 commit
  12. 17 Apr, 2020 1 commit
  13. 16 Apr, 2020 2 commits
  14. 03 Apr, 2020 1 commit
    • ELECTRA (#3257) · d5d7d886
      Lysandre Debut authored
      * Electra wip
      
      * helpers
      
      * Electra wip
      
      * Electra v1
      
      * ELECTRA may be saved/loaded
      
      * Generator & Discriminator
      
      * Embedding size instead of halving the hidden size
      
      * ELECTRA Tokenizer
      
      * Revert BERT helpers
      
      * ELECTRA Conversion script
      
      * Archive maps
      
      * PyTorch tests
      
      * Start fixing tests
      
      * Tests pass
      
      * Same configuration for both models
      
      * Compatible with base + large
      
      * Simplification + weight tying
      
      * Archives
      
      * Auto + Renaming to standard names
      
      * ELECTRA is uncased
      
      * Tests
      
      * Slight API changes
      
      * Update tests
      
      * wip
      
      * ElectraForTokenClassification
      
      * temp
      
      * Simpler arch + tests
      
      Removed ElectraForPreTraining which will be in a script
      
      * Conversion script
      
      * Auto model
      
      * Update links to S3
      
      * Split ElectraForPreTraining and ElectraForTokenClassification
      
      * Actually test PreTraining model
      
      * Remove num_labels from configuration
      
      * wip
      
      * wip
      
      * From discriminator and generator to electra
      
      * Slight API changes
      
      * Better naming
      
      * TensorFlow ELECTRA tests
      
      * Accurate conversion script
      
      * Added to conversion script
      
      * Fast ELECTRA tokenizer
      
      * Style
      
      * Add ELECTRA to README
      
      * Modeling Pytorch Doc + Real style
      
      * TF Docs
      
      * Docs
      
      * Correct links
      
      * Correct model initialized
      
      * random fixes
      
      * style
      
      * Addressing Patrick's and Sam's comments
      
      * Correct links in docs
  15. 01 Apr, 2020 1 commit
  16. 25 Feb, 2020 2 commits
    • Documentation (#2989) · bb7c4685
      Lysandre Debut authored
      * All Tokenizers
      
      BertTokenizer + few fixes
      RobertaTokenizer
      OpenAIGPTTokenizer + Fixes
      GPT2Tokenizer + fixes
      TransfoXLTokenizer
      Correct rst for TransformerXL
      XLMTokenizer + fixes
      XLNet Tokenizer + Style
      DistilBERT + Fix XLNet RST
      CTRLTokenizer
      CamemBERT Tokenizer
      FlaubertTokenizer
      XLMRobertaTokenizer
      cleanup
      
      * cleanup
    • Change masking to direct labeling for TPU support. (#2982) · e8ce63ff
      srush authored
      * change masking to direct labelings
      
      * fix black
      
      * switch to ignore index
      
      * .
      
      * fix black
  17. 21 Feb, 2020 1 commit
  18. 13 Feb, 2020 1 commit
  19. 11 Feb, 2020 1 commit
    • BERT decoder: Fix causal mask dtype. · ee5de0ba
      Oleksiy Syvokon authored
      PyTorch < 1.3 requires multiplication operands to be of the same type.
      This was violated when using default attention mask (i.e.,
      attention_mask=None in arguments) given BERT in the decoder mode.
      
      In particular, this broke Model2Model and caused the tutorial
      from the quickstart to fail.
  20. 07 Feb, 2020 1 commit
  21. 04 Feb, 2020 1 commit
  22. 03 Feb, 2020 1 commit
    • [Follow up 213] · 239dd23f
      Lysandre authored
      Masked indices should have -1 and not -100. Updating documentation + scripts that were forgotten
  23. 28 Jan, 2020 1 commit
  24. 23 Jan, 2020 7 commits
  25. 15 Jan, 2020 1 commit
  26. 14 Jan, 2020 1 commit
    • Bias should be resized with the weights · 100e3b6f
      Lysandre authored
      Created a link between the linear layer bias and the model attribute bias. This does not change anything for the user nor for the conversion scripts, but allows the `resize_token_embeddings` method to resize the bias as well as the weights of the decoder.
      
      Added a test.
  27. 07 Jan, 2020 2 commits
  28. 06 Jan, 2020 1 commit