1. 16 Nov, 2020 1 commit
    • Sylvain Gugger's avatar
      Switch `return_dict` to `True` by default. (#8530) · 1073a2bd
      Sylvain Gugger authored
      * Use the CI to identify failing tests
      
      * Remove from all examples and tests
      
      * More default switch
      
      * Fixes
      
      * More test fixes
      
      * More fixes
      
      * Last fixes hopefully
      
      * Use the CI to identify failing tests
      
      * Remove from all examples and tests
      
      * More default switch
      
      * Fixes
      
      * More test fixes
      
      * More fixes
      
      * Last fixes hopefully
      
      * Run on the real suite
      
      * Fix slow tests
      1073a2bd
  2. 09 Nov, 2020 1 commit
  3. 18 Oct, 2020 1 commit
    • Thomas Wolf's avatar
      [Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659) · ba8c4d0a
      Thomas Wolf authored
      * splitting fast and slow tokenizers [WIP]
      
      * [WIP] splitting sentencepiece and tokenizers dependencies
      
      * update dummy objects
      
      * add name_or_path to models and tokenizers
      
      * prefix added to file names
      
      * prefix
      
      * styling + quality
      
      * spliting all the tokenizer files - sorting sentencepiece based ones
      
      * update tokenizer version up to 0.9.0
      
      * remove hard dependency on sentencepiece 🎉
      
      * and removed hard dependency on tokenizers 🎉
      
      
      
      * update conversion script
      
      * update missing models
      
      * fixing tests
      
      * move test_tokenization_fast to main tokenization tests - fix bugs
      
      * bump up tokenizers
      
      * fix bert_generation
      
      * update ad fix several tokenizers
      
      * keep sentencepiece in deps for now
      
      * fix funnel and deberta tests
      
      * fix fsmt
      
      * fix marian tests
      
      * fix layoutlm
      
      * fix squeezebert and gpt2
      
      * fix T5 tokenization
      
      * fix xlnet tests
      
      * style
      
      * fix mbart
      
      * bump up tokenizers to 0.9.2
      
      * fix model tests
      
      * fix tf models
      
      * fix seq2seq examples
      
      * fix tests without sentencepiece
      
      * fix slow => fast  conversion without sentencepiece
      
      * update auto and bert generation tests
      
      * fix mbart tests
      
      * fix auto and common test without tokenizers
      
      * fix tests without tokenizers
      
      * clean up tests lighten up when tokenizers + sentencepiece are both off
      
      * style quality and tests fixing
      
      * add sentencepiece to doc/examples reqs
      
      * leave sentencepiece on for now
      
      * style quality split hebert and fix pegasus
      
      * WIP Herbert fast
      
      * add sample_text_no_unicode and fix hebert tokenization
      
      * skip FSMT example test for now
      
      * fix style
      
      * fix fsmt in example tests
      
      * update following Lysandre and Sylvain's comments
      
      * Update src/transformers/testing_utils.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/testing_utils.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      ba8c4d0a
  4. 26 Aug, 2020 1 commit
  5. 24 Aug, 2020 1 commit
  6. 20 Aug, 2020 1 commit
  7. 04 Aug, 2020 1 commit
  8. 31 Jul, 2020 1 commit
  9. 01 Jul, 2020 1 commit
  10. 23 Jun, 2020 2 commits
  11. 19 Jun, 2020 1 commit
    • Vasily Shamporov's avatar
      Add MobileBert (#4901) · 9a3f9108
      Vasily Shamporov authored
      
      
      * Add MobileBert
      
      * Quality + Conversion script
      
      * style
      
      * Update src/transformers/modeling_mobilebert.py
      
      * Links to S3
      
      * Style
      
      * TFMobileBert
      
      Slight fixes to the pytorch MobileBert
      Style
      
      * MobileBertForMaskedLM (PT + TF)
      
      * MobileBertForNextSentencePrediction (PT + TF)
      
      * MobileFor{MultipleChoice, TokenClassification} (PT + TF)
      
      
      ss
      
      * Tests + Auto
      
      * Doc
      
      * Tests
      
      * Addressing @sgugger's comments
      
      * Adressing @patrickvonplaten's comments
      
      * Style
      
      * Style
      
      * Integration test
      
      * style
      
      * Model card
      Co-authored-by: default avatarLysandre <lysandre.debut@reseau.eseo.fr>
      Co-authored-by: default avatarLysandre Debut <lysandre@huggingface.co>
      9a3f9108
  12. 12 Jun, 2020 1 commit
  13. 10 Jun, 2020 2 commits
  14. 05 Jun, 2020 1 commit
  15. 02 Jun, 2020 1 commit
    • Julien Chaumond's avatar
      Kill model archive maps (#4636) · d4c2cb40
      Julien Chaumond authored
      * Kill model archive maps
      
      * Fixup
      
      * Also kill model_archive_map for MaskedBertPreTrainedModel
      
      * Unhook config_archive_map
      
      * Tokenizers: align with model id changes
      
      * make style && make quality
      
      * Fix CI
      d4c2cb40
  16. 04 May, 2020 1 commit
  17. 01 May, 2020 1 commit
    • Julien Chaumond's avatar
      [ci] Load pretrained models into the default (long-lived) cache · f54dc3f4
      Julien Chaumond authored
      There's an inconsistency right now where:
      - we load some models into CACHE_DIR
      - and some models in the default cache
      - and often, in both for the same models
      
      When running the RUN_SLOW tests, this takes a lot of disk space, time, and bandwidth.
      
      I'd rather always use the default cache
      f54dc3f4
  18. 11 Feb, 2020 1 commit
    • Oleksiy Syvokon's avatar
      BERT decoder: Fix causal mask dtype. · ee5de0ba
      Oleksiy Syvokon authored
      PyTorch < 1.3 requires multiplication operands to be of the same type.
      This was violated when using default attention mask (i.e.,
      attention_mask=None in arguments) given BERT in the decoder mode.
      
      In particular, this was breaking Model2Model and made tutorial
      from the quickstart failing.
      ee5de0ba
  19. 06 Jan, 2020 2 commits
  20. 22 Dec, 2019 6 commits
  21. 21 Dec, 2019 2 commits
    • Aymeric Augustin's avatar
      Reformat source code with black. · fa84ae26
      Aymeric Augustin authored
      This is the result of:
      
          $ black --line-length 119 examples templates transformers utils hubconf.py setup.py
      
      There's a lot of fairly long lines in the project. As a consequence, I'm
      picking the longest widely accepted line length, 119 characters.
      
      This is also Thomas' preference, because it allows for explicit variable
      names, to make the code easier to understand.
      fa84ae26
    • Aymeric Augustin's avatar
      Take advantage of the cache when running tests. · b670c266
      Aymeric Augustin authored
      Caching models across test cases and across runs of the test suite makes
      slow tests somewhat more bearable.
      
      Use gettempdir() instead of /tmp in tests. This makes it easier to
      change the location of the cache with semi-standard TMPDIR/TEMP/TMP
      environment variables.
      
      Fix #2222.
      b670c266
  22. 13 Dec, 2019 1 commit
  23. 06 Dec, 2019 1 commit
    • Aymeric Augustin's avatar
      Remove dependency on pytest for running tests (#2055) · 35401fe5
      Aymeric Augustin authored
      * Switch to plain unittest for skipping slow tests.
      
      Add a RUN_SLOW environment variable for running them.
      
      * Switch to plain unittest for PyTorch dependency.
      
      * Switch to plain unittest for TensorFlow dependency.
      
      * Avoid leaking open files in the test suite.
      
      This prevents spurious warnings when running tests.
      
      * Fix unicode warning on Python 2 when running tests.
      
      The warning was:
      
          UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
      
      * Support running PyTorch tests on a GPU.
      
      Reverts 27e015bd.
      
      * Tests no longer require pytest.
      
      * Make tests pass on cuda
      35401fe5
  24. 06 Nov, 2019 1 commit
  25. 30 Oct, 2019 1 commit
  26. 16 Oct, 2019 1 commit
  27. 10 Oct, 2019 1 commit
  28. 08 Oct, 2019 3 commits
  29. 26 Sep, 2019 1 commit