  1. 07 Dec, 2020 2 commits
  2. 04 Dec, 2020 1 commit
  3. 02 Dec, 2020 1 commit
  4. 01 Dec, 2020 2 commits
  5. 30 Nov, 2020 1 commit
  6. 29 Nov, 2020 1 commit
    • [CI] implement job skipping for doc-only PRs (#8826) · c239dcda
      Stas Bekman authored
      * implement job skipping for doc-only PRs
      
      * silent grep is crucial
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * let's add doc
      
      * let's add code
      
      * revert test commits
      
      * restore
      
      * Better name
      
      * Better name
      
      * Better name
      
      * some more testing
      
      * some more testing
      
      * some more testing
      
      * finish testing
  7. 23 Nov, 2020 1 commit
    • Improve bert-japanese tokenizer handling (#8659) · 0cc5ab13
      Julien Chaumond authored
      
      
      * Make ci fail
      
      * Try to make tests actually run?
      
      * CI finally failing?
      
      * Fix CI
      
      * Revert "Fix CI"
      
      This reverts commit ca7923be7334d4e571b023478ebdd6b33dfd0ebb.
      
      * Ooops wrong one
      
      * one more try
      
      * Ok ok let's move this elsewhere
      
      * Alternative to globals() (#8667)
      
      * Alternative to globals()
      
      * Error is raised later so return None
      
      * Sentencepiece not installed make some tokenizers None
      
      * Apply Lysandre wisdom
      
      * Slightly clearer comment?
      
      cc @sgugger
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  8. 19 Nov, 2020 1 commit
  9. 11 Nov, 2020 2 commits
  10. 04 Nov, 2020 2 commits
  11. 03 Nov, 2020 2 commits
  12. 29 Oct, 2020 1 commit
    • Add a template for examples and apply it for mlm and plm examples (#8153) · 69117628
      Sylvain Gugger authored
      * Add a template for example scripts and apply it to mlm
      
      * Formatting
      
      * Fix test
      
      * Add plm script
      
      * Add a template for example scripts and apply it to mlm
      
      * Formatting
      
      * Fix test
      
      * Add plm script
      
      * Add a template for example scripts and apply it to mlm
      
      * Formatting
      
      * Fix test
      
      * Add plm script
      
      * Styling
  13. 28 Oct, 2020 1 commit
  14. 27 Oct, 2020 2 commits
  15. 26 Oct, 2020 1 commit
    • Doc styling (#8067) · 08f534d2
      Sylvain Gugger authored
      * Important files
      
      * Styling them all
      
      * Revert "Styling them all"
      
      This reverts commit 7d029395fdae8513b8281cbc2a6c239f8093503e.
      
      * Styling them for realsies
      
      * Fix syntax error
      
      * Fix benchmark_utils
      
      * More fixes
      
      * Fix modeling auto and script
      
      * Remove new line
      
      * Fixes
      
      * More fixes
      
      * Fix more files
      
      * Style
      
      * Add FSMT
      
      * More fixes
      
      * More fixes
      
      * More fixes
      
      * More fixes
      
      * Fixes
      
      * More fixes
      
      * More fixes
      
      * Last fixes
      
      * Make sphinx happy
  16. 23 Oct, 2020 1 commit
  17. 20 Oct, 2020 1 commit
  18. 19 Oct, 2020 2 commits
  19. 18 Oct, 2020 1 commit
    • [Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659) · ba8c4d0a
      Thomas Wolf authored
      * splitting fast and slow tokenizers [WIP]
      
      * [WIP] splitting sentencepiece and tokenizers dependencies
      
      * update dummy objects
      
      * add name_or_path to models and tokenizers
      
      * prefix added to file names
      
      * prefix
      
      * styling + quality
      
      * splitting all the tokenizer files - sorting sentencepiece based ones
      
      * update tokenizer version up to 0.9.0
      
      * remove hard dependency on sentencepiece 🎉

      * and removed hard dependency on tokenizers 🎉
      
      
      
      * update conversion script
      
      * update missing models
      
      * fixing tests
      
      * move test_tokenization_fast to main tokenization tests - fix bugs
      
      * bump up tokenizers
      
      * fix bert_generation
      
      * update and fix several tokenizers
      
      * keep sentencepiece in deps for now
      
      * fix funnel and deberta tests
      
      * fix fsmt
      
      * fix marian tests
      
      * fix layoutlm
      
      * fix squeezebert and gpt2
      
      * fix T5 tokenization
      
      * fix xlnet tests
      
      * style
      
      * fix mbart
      
      * bump up tokenizers to 0.9.2
      
      * fix model tests
      
      * fix tf models
      
      * fix seq2seq examples
      
      * fix tests without sentencepiece
      
      * fix slow => fast conversion without sentencepiece
      
      * update auto and bert generation tests
      
      * fix mbart tests
      
      * fix auto and common test without tokenizers
      
      * fix tests without tokenizers
      
      * clean up tests lighten up when tokenizers + sentencepiece are both off
      
      * style quality and tests fixing
      
      * add sentencepiece to doc/examples reqs
      
      * leave sentencepiece on for now
      
      * style quality split herbert and fix pegasus
      
      * WIP Herbert fast
      
      * add sample_text_no_unicode and fix herbert tokenization
      
      * skip FSMT example test for now
      
      * fix style
      
      * fix fsmt in example tests
      
      * update following Lysandre and Sylvain's comments
      
      * Update src/transformers/testing_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/testing_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  20. 05 Oct, 2020 1 commit
  21. 23 Sep, 2020 1 commit
    • [code quality] fix confused flake8 (#7309) · df536438
      Stas Bekman authored
      * fix confused flake
      
      We run `black --target-version py35 ...` but flake8 doesn't know that, so currently with py38 flake8 fails suggesting that black should have reformatted 63 files. Indeed if I run:
      
      ```
      black --line-length 119 --target-version py38 examples templates tests src utils
      ```
      it indeed reformats 63 files.
      
      The only solution I found is to create a black config file as explained at https://github.com/psf/black#configuration-format, which is what this PR adds.
      
      Now flake8 knows that py35 is the standard and no longer gets confused regardless of the user's python version.
      
      * adjust the other files that will now rely on black's config file
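
      As a rough illustration of the config file this commit message describes, the sketch below shows what a black section in `pyproject.toml` could look like. The values are assumptions taken from the commands quoted in the message (line length 119, target py35); the file actually added by the PR may differ.

      ```toml
      # Hypothetical pyproject.toml fragment, not copied from the PR.
      # Assumes the settings previously passed on the command line.
      [tool.black]
      line-length = 119
      target-version = ['py35']
      ```

      With such a file in place, `black examples templates tests src utils` would pick up both settings without repeating them on the command line.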
  22. 22 Sep, 2020 1 commit
  23. 17 Sep, 2020 1 commit
    • remove deprecated flag (#7171) · 79111b77
      Stas Bekman authored
      ```
      /home/circleci/.local/lib/python3.6/site-packages/isort/main.py:915: UserWarning: W0501: The following deprecated CLI flags were used and ignored: --recursive!
        "W0501: The following deprecated CLI flags were used and ignored: "
      ```
  24. 10 Sep, 2020 1 commit
  25. 01 Sep, 2020 1 commit
  26. 25 Aug, 2020 1 commit
  27. 24 Aug, 2020 1 commit
  28. 17 Aug, 2020 1 commit
  29. 12 Aug, 2020 2 commits
  30. 11 Aug, 2020 2 commits
  31. 10 Aug, 2020 1 commit