1. 01 Apr, 2020 1 commit
  2. 25 Feb, 2020 2 commits
    • Documentation (#2989) · bb7c4685
      Lysandre Debut authored
      * All Tokenizers
      
      BertTokenizer + few fixes
      RobertaTokenizer
      OpenAIGPTTokenizer + Fixes
      GPT2Tokenizer + fixes
      TransfoXLTokenizer
      Correct rst for TransformerXL
      XLMTokenizer + fixes
      XLNet Tokenizer + Style
      DistilBERT + Fix XLNet RST
      CTRLTokenizer
      CamemBERT Tokenizer
      FlaubertTokenizer
      XLMRobertaTokenizer
      cleanup
      
      * cleanup
    • Change masking to direct labeling for TPU support. (#2982) · e8ce63ff
      srush authored
      * change masking to direct labeling
      
      * fix black
      
      * switch to ignore index
      
      * .
      
      * fix black
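
      The commit above replaces masking with direct labeling plus an ignore
      index, presumably to avoid the data-dependent tensor shapes that TPUs
      handle poorly. A minimal sketch of that scheme, assuming a BERT-style
      masked-LM setup; the function name, the 15% masking probability and the
      [MASK] token id below are illustrative, not taken from the commit.

          import torch

          def build_mlm_labels(input_ids: torch.Tensor, mlm_probability: float = 0.15):
              """Return (masked_inputs, labels) with -100 at unmasked positions.

              Illustrative sketch only; not the library's actual masking code.
              """
              labels = input_ids.clone()
              # Sample which positions to mask.
              probability_matrix = torch.full(labels.shape, mlm_probability)
              masked_indices = torch.bernoulli(probability_matrix).bool()

              # Direct labeling: keep a full-size label tensor and set every
              # position that is NOT masked to the ignore index (-100) instead of
              # slicing out the masked positions, so tensor shapes stay static.
              labels[~masked_indices] = -100

              masked_inputs = input_ids.clone()
              mask_token_id = 103  # placeholder id for [MASK]
              masked_inputs[masked_indices] = mask_token_id
              return masked_inputs, labels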
  3. 21 Feb, 2020 1 commit
  4. 13 Feb, 2020 1 commit
  5. 11 Feb, 2020 1 commit
    • BERT decoder: Fix causal mask dtype. · ee5de0ba
      Oleksiy Syvokon authored
      PyTorch < 1.3 requires multiplication operands to be of the same type.
      This was violated when using the default attention mask (i.e.,
      attention_mask=None in the arguments) with BERT in decoder mode.

      In particular, this was breaking Model2Model and made the tutorial
      from the quickstart fail.
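
      A hedged sketch of the kind of fix described above: on PyTorch < 1.3,
      multiplying a floating-point padding mask by a uint8 causal mask raises a
      type error, so the causal mask is cast to the padding mask's dtype before
      the two are combined. The helper name and shapes are illustrative, not the
      exact code in the BERT model.

          import torch

          def build_extended_attention_mask(attention_mask: torch.Tensor, dtype: torch.dtype) -> torch.Tensor:
              """Combine a padding mask and a causal mask with consistent dtypes."""
              seq_length = attention_mask.shape[1]
              # Lower-triangular causal mask: position i may attend to positions <= i.
              seq_ids = torch.arange(seq_length, device=attention_mask.device)
              causal_mask = seq_ids[None, None, :] <= seq_ids[None, :, None]
              # Cast the comparison result to the floating dtype of the padding
              # mask; PyTorch < 1.3 refuses to multiply tensors of different types.
              causal_mask = causal_mask.to(dtype)
              padding_mask = attention_mask[:, None, None, :].to(dtype)
              extended_mask = causal_mask[:, None, :, :] * padding_mask
              # Turn the {0, 1} mask into an additive mask for attention scores.
              return (1.0 - extended_mask) * -10000.0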
  6. 07 Feb, 2020 1 commit
  7. 04 Feb, 2020 1 commit
  8. 03 Feb, 2020 1 commit
    • [Follow up 213] · 239dd23f
      Lysandre authored
      Masked indices should have -100 and not -1. Updating documentation + scripts that were forgotten.
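
      For context, -100 is the default ignore_index of PyTorch's cross-entropy
      loss, so positions labelled -100 simply do not contribute to the loss. A
      quick illustration (not taken from the commit):

          import torch
          import torch.nn as nn

          loss_fct = nn.CrossEntropyLoss()           # default ignore_index is -100
          logits = torch.randn(4, 10)                # 4 positions, 10-token vocabulary
          labels = torch.tensor([3, -100, 7, -100])  # -100 entries are skipped
          loss = loss_fct(logits, labels)            # averaged over the 2 real labels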
  9. 28 Jan, 2020 1 commit
  10. 23 Jan, 2020 7 commits
  11. 15 Jan, 2020 1 commit
  12. 14 Jan, 2020 1 commit
    • Bias should be resized with the weights · 100e3b6f
      Lysandre authored
      Created a link between the linear layer bias and the model attribute bias. This does not change anything for the user or for the conversion scripts, but it allows the `resize_token_embeddings` method to resize the bias as well as the weights of the decoder.
      
      Added a test.
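
      A minimal sketch of the described link, with a hypothetical class name and
      layout: the model-level bias attribute and the linear decoder's bias end up
      being the same Parameter, which is what lets `resize_token_embeddings` keep
      the bias in step with the decoder weights.

          import torch
          import torch.nn as nn

          class ToyLMPredictionHead(nn.Module):
              """Hypothetical head illustrating the bias link; not the library's code."""

              def __init__(self, hidden_size: int, vocab_size: int):
                  super().__init__()
                  self.decoder = nn.Linear(hidden_size, vocab_size, bias=False)
                  self.bias = nn.Parameter(torch.zeros(vocab_size))
                  # Link the layer's bias to the model attribute: both names now
                  # refer to the same Parameter, so resizing logic that rebuilds one
                  # can keep the other consistent instead of leaving a stale,
                  # wrongly sized bias behind.
                  self.decoder.bias = self.bias

              def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
                  return self.decoder(hidden_states)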
  13. 07 Jan, 2020 2 commits
  14. 06 Jan, 2020 3 commits
  15. 22 Dec, 2019 6 commits
  16. 21 Dec, 2019 1 commit
    • Reformat source code with black. · fa84ae26
      Aymeric Augustin authored
      This is the result of:
      
          $ black --line-length 119 examples templates transformers utils hubconf.py setup.py
      
      There are a lot of fairly long lines in the project. As a consequence, I'm
      picking the longest widely accepted line length, 119 characters.

      This is also Thomas' preference, because it allows for explicit variable
      names, which makes the code easier to understand.
  17. 18 Dec, 2019 3 commits
  18. 11 Dec, 2019 2 commits
  19. 10 Dec, 2019 3 commits
  20. 09 Dec, 2019 1 commit
    • create encoder attention mask from shape of hidden states · 3520be78
      Rémi Louf authored
      We currently create encoder attention masks (when they're not provided)
      based on the shape of the inputs to the decoder. This is obviously
      wrong; the encoder and decoder sequences can be of different lengths. We
      now create the encoder attention mask based on the batch_size and
      sequence_length of the encoder hidden states.
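
      A hedged sketch of the behaviour described above: when no encoder attention
      mask is passed, the default mask is built from the encoder hidden states'
      batch size and sequence length rather than from the decoder's inputs, since
      the two sequences can have different lengths. The function name is
      illustrative.

          import torch

          def default_encoder_attention_mask(encoder_hidden_states: torch.Tensor) -> torch.Tensor:
              """All-ones mask sized to the encoder output, not the decoder input."""
              batch_size, seq_length, _ = encoder_hidden_states.size()
              # One entry per encoder position: the decoder may attend to every
              # position the encoder actually produced.
              return torch.ones(batch_size, seq_length, device=encoder_hidden_states.device)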