1. 26 Oct, 2020 4 commits
  2. 23 Oct, 2020 3 commits
  3. 22 Oct, 2020 7 commits
  4. 21 Oct, 2020 5 commits
  5. 20 Oct, 2020 1 commit
  6. 19 Oct, 2020 7 commits
  7. 18 Oct, 2020 1 commit
    • [Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659) · ba8c4d0a
      Thomas Wolf authored
      * splitting fast and slow tokenizers [WIP]
      
      * [WIP] splitting sentencepiece and tokenizers dependencies
      
      * update dummy objects
      
      * add name_or_path to models and tokenizers
      
      * prefix added to file names
      
      * prefix
      
      * styling + quality
      
      * splitting all the tokenizer files - sorting sentencepiece based ones
      
      * update tokenizer version up to 0.9.0
      
      * remove hard dependency on sentencepiece 🎉
      
      * and removed hard dependency on tokenizers 🎉
      
      
      
      * update conversion script
      
      * update missing models
      
      * fixing tests
      
      * move test_tokenization_fast to main tokenization tests - fix bugs
      
      * bump up tokenizers
      
      * fix bert_generation
      
      * update and fix several tokenizers
      
      * keep sentencepiece in deps for now
      
      * fix funnel and deberta tests
      
      * fix fsmt
      
      * fix marian tests
      
      * fix layoutlm
      
      * fix squeezebert and gpt2
      
      * fix T5 tokenization
      
      * fix xlnet tests
      
      * style
      
      * fix mbart
      
      * bump up tokenizers to 0.9.2
      
      * fix model tests
      
      * fix tf models
      
      * fix seq2seq examples
      
      * fix tests without sentencepiece
      
      * fix slow => fast  conversion without sentencepiece
      
      * update auto and bert generation tests
      
      * fix mbart tests
      
      * fix auto and common test without tokenizers
      
      * fix tests without tokenizers
      
      * clean up tests, lighten up when tokenizers + sentencepiece are both off
      
      * style quality and tests fixing
      
      * add sentencepiece to doc/examples reqs
      
      * leave sentencepiece on for now
      
      * style quality split herbert and fix pegasus
      
      * WIP Herbert fast
      
      * add sample_text_no_unicode and fix herbert tokenization
      
      * skip FSMT example test for now
      
      * fix style
      
      * fix fsmt in example tests
      
      * update following Lysandre and Sylvain's comments
      
      * Update src/transformers/testing_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/testing_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  8. 16 Oct, 2020 4 commits
    • fix/hide warnings (#7837) · d8ca57d2
      Stas Bekman authored
      s
    • [cleanup] assign todos, faster bart-cnn test (#7835) · 96e47d92
      Sam Shleifer authored
      * 2 beam output
      
      * unassign/remove TODOs
      
      * remove one more
    • Herbert polish model (#7798) · 7b13bd01
      rmroczkowski authored
      
      
      * HerBERT transformer model for Polish language understanding.
      
      * HerbertTokenizerFast generated with HerbertConverter
      
      * Herbert base and large model cards
      
      * Herbert model cards with tags
      
      * Herbert tensorflow models
      
      * Herbert model tests based on Bert test suite
      
      * src/transformers/tokenization_herbert.py edited online with Bitbucket
      
      * src/transformers/tokenization_herbert.py edited online with Bitbucket
      
      * docs/source/model_doc/herbert.rst edited online with Bitbucket
      
      * Herbert tokenizer tests and bug fixes
      
      * src/transformers/configuration_herbert.py edited online with Bitbucket
      
      * Copyrights and tests for TFHerbertModel
      
      * model_cards/allegro/herbert-base-cased/README.md edited online with Bitbucket
      
      * model_cards/allegro/herbert-large-cased/README.md edited online with Bitbucket
      
      * Bug fixes after testing
      
      * Reformat modified_only_fixup
      
      * Proper order of configuration
      
      * Herbert proper documentation formatting
      
      * Formatting with make modified_only_fixup
      
      * Dummies fixed
      
      * Adding missing models to documentation
      
      * Removing HerBERT model as it is a simple extension of BERT
      
      * Update model_cards/allegro/herbert-base-cased/README.md
      Co-authored-by: Julien Chaumond <chaumond@gmail.com>
      
      * Update model_cards/allegro/herbert-large-cased/README.md
      Co-authored-by: Julien Chaumond <chaumond@gmail.com>
      
      * HerbertTokenizer deprecated configuration removed
      Co-authored-by: Julien Chaumond <chaumond@gmail.com>
    • Fix DeBERTa integration tests (#7729) · 52c9e842
      Lysandre Debut authored
  9. 15 Oct, 2020 1 commit
  10. 14 Oct, 2020 2 commits
  11. 13 Oct, 2020 4 commits
  12. 10 Oct, 2020 1 commit