1. 18 Oct, 2020 1 commit
    • Thomas Wolf's avatar
      [Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659) · ba8c4d0a
      Thomas Wolf authored
      * splitting fast and slow tokenizers [WIP]
      
      * [WIP] splitting sentencepiece and tokenizers dependencies
      
      * update dummy objects
      
      * add name_or_path to models and tokenizers
      
      * prefix added to file names
      
      * prefix
      
      * styling + quality
      
      * spliting all the tokenizer files - sorting sentencepiece based ones
      
      * update tokenizer version up to 0.9.0
      
      * remove hard dependency on sentencepiece 馃帀
      
      * and removed hard dependency on tokenizers 馃帀
      
      
      
      * update conversion script
      
      * update missing models
      
      * fixing tests
      
      * move test_tokenization_fast to main tokenization tests - fix bugs
      
      * bump up tokenizers
      
      * fix bert_generation
      
      * update ad fix several tokenizers
      
      * keep sentencepiece in deps for now
      
      * fix funnel and deberta tests
      
      * fix fsmt
      
      * fix marian tests
      
      * fix layoutlm
      
      * fix squeezebert and gpt2
      
      * fix T5 tokenization
      
      * fix xlnet tests
      
      * style
      
      * fix mbart
      
      * bump up tokenizers to 0.9.2
      
      * fix model tests
      
      * fix tf models
      
      * fix seq2seq examples
      
      * fix tests without sentencepiece
      
      * fix slow => fast  conversion without sentencepiece
      
      * update auto and bert generation tests
      
      * fix mbart tests
      
      * fix auto and common test without tokenizers
      
      * fix tests without tokenizers
      
      * clean up tests lighten up when tokenizers + sentencepiece are both off
      
      * style quality and tests fixing
      
      * add sentencepiece to doc/examples reqs
      
      * leave sentencepiece on for now
      
      * style quality split hebert and fix pegasus
      
      * WIP Herbert fast
      
      * add sample_text_no_unicode and fix hebert tokenization
      
      * skip FSMT example test for now
      
      * fix style
      
      * fix fsmt in example tests
      
      * update following Lysandre and Sylvain's comments
      
      * Update src/transformers/testing_utils.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/testing_utils.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      ba8c4d0a
  2. 15 Oct, 2020 1 commit
    • Stas Bekman's avatar
      fix DeprecationWarning (#7834) · a5a8eeb7
      Stas Bekman authored
      in `tests/test_utils_check_copies.py` I was getting intermittently:
      ```
      utils/check_copies.py:52
        /mnt/nvme1/code/transformers-comet/utils/check_copies.py:52: DeprecationWarning: invalid escape sequence \s
          while line_index < len(lines) and re.search(f"^{indent}(class|def)\s+{name}", lines[line_index]) is None:
      ```
      So this should fix it.
      a5a8eeb7
  3. 09 Oct, 2020 3 commits
  4. 05 Oct, 2020 2 commits
  5. 30 Sep, 2020 1 commit
  6. 28 Sep, 2020 1 commit
  7. 25 Sep, 2020 1 commit
  8. 24 Sep, 2020 2 commits
  9. 22 Sep, 2020 2 commits
  10. 10 Sep, 2020 1 commit
    • Patrick von Platen's avatar
      Add "Leveraging Pretrained Checkpoints for Generation" Seq2Seq models. (#6594) · 7fd1febf
      Patrick von Platen authored
      * add conversion script
      
      * improve conversion script
      
      * make style
      
      * add tryout files
      
      * fix
      
      * update
      
      * add causal bert
      
      * better names
      
      * add tokenizer file as well
      
      * finish causal_bert
      
      * fix small bugs
      
      * improve generate
      
      * change naming
      
      * renaming
      
      * renaming
      
      * renaming
      
      * remove leftover files
      
      * clean files
      
      * add fix tokenizer
      
      * finalize
      
      * correct slow test
      
      * update docs
      
      * small fixes
      
      * fix link
      
      * adapt check repo
      
      * apply sams and sylvains recommendations
      
      * fix import
      
      * implement Lysandres recommendations
      
      * fix logger warn
      7fd1febf
  11. 08 Sep, 2020 1 commit
  12. 26 Aug, 2020 1 commit
  13. 14 Aug, 2020 1 commit
    • Suraj Patil's avatar
      MBartForConditionalGeneration (#6441) · 680f1337
      Suraj Patil authored
      * add MBartForConditionalGeneration
      
      * style
      
      * rebase and fixes
      
      * add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS
      
      * fix docs
      
      * don't ignore mbart
      
      * doc
      
      * fix mbart fairseq link
      
      * put mbart before bart
      
      * apply doc suggestions
      680f1337
  14. 12 Aug, 2020 1 commit
  15. 10 Aug, 2020 1 commit
    • Lysandre Debut's avatar
      Patch models (#6326) · b99098ab
      Lysandre Debut authored
      * TFAlbertFor{TokenClassification, MultipleChoice}
      
      * Patch models
      
      * BERT and TF BERT info
      
      
      s
      
      * Update check_repo
      b99098ab
  16. 07 Aug, 2020 1 commit
  17. 02 Mar, 2020 1 commit
    • Julien Chaumond's avatar
      TF GPU CI (#3085) · f169957d
      Julien Chaumond authored
      * debug env
      
      * Restrict TF GPU memory
      
      * Fixup
      
      * One more test
      
      * rm debug logs
      
      * Fixup
      f169957d
  18. 06 Jan, 2020 2 commits
  19. 22 Dec, 2019 3 commits
  20. 21 Dec, 2019 1 commit
    • Aymeric Augustin's avatar
      Reformat source code with black. · fa84ae26
      Aymeric Augustin authored
      This is the result of:
      
          $ black --line-length 119 examples templates transformers utils hubconf.py setup.py
      
      There's a lot of fairly long lines in the project. As a consequence, I'm
      picking the longest widely accepted line length, 119 characters.
      
      This is also Thomas' preference, because it allows for explicit variable
      names, to make the code easier to understand.
      fa84ae26
  21. 06 Dec, 2019 1 commit
  22. 03 Dec, 2019 1 commit
  23. 29 Nov, 2019 1 commit
  24. 21 Nov, 2019 1 commit