1. 02 Feb, 2021 1 commit
  2. 27 Jan, 2021 1 commit
  3. 18 Jan, 2021 1 commit
  4. 14 Jan, 2021 1 commit
  5. 13 Jan, 2021 3 commits
  6. 12 Jan, 2021 2 commits
  7. 06 Jan, 2021 1 commit
    • Fast transformers import part 1 (#9441) · 0c96262f
      Sylvain Gugger authored
      * Don't import libs to check they are available
      
      * Don't import integrations at init
      
      * Add importlib_metadata to deps
      
      * Remove old vars references
      
      * Avoid syntax error
      
      * Adapt testing utils
      
      * Try to appease torchhub
      
      * Add dependency
      
      * Remove more private variables
      
      * Fix typo
      
      * Another typo
      
      * Refine the tf availability test
      0c96262f
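The idea behind "Don't import libs to check they are available" can be sketched as follows. This is a minimal illustration, not the code from the PR: `is_available` and its parameters are hypothetical names, and it uses the standard-library `importlib.metadata` (Python 3.8+) where the commit adds the `importlib_metadata` backport as a dependency.

```python
import importlib.util
from importlib import metadata  # stdlib counterpart of the importlib_metadata backport


def is_available(module_name, dist_name=None):
    """Report whether a top-level module can be found, without importing it.

    Returns a (found, version) tuple. `version` is None when the module has
    no installed distribution metadata (e.g. standard-library modules).
    Hypothetical helper, for illustration only.
    """
    # find_spec only locates a top-level module; it does not execute it.
    if importlib.util.find_spec(module_name) is None:
        return False, None
    try:
        return True, metadata.version(dist_name or module_name)
    except metadata.PackageNotFoundError:
        return True, None


# A name that should not exist on any system (assumption).
print(is_available("surely_not_an_installed_module_xyz"))  # (False, None)
print(is_available("json")[0])  # True
```

Locating a module this way avoids paying the import cost of heavy optional libraries just to learn whether they are installed.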
  8. 21 Dec, 2020 1 commit
  9. 18 Dec, 2020 1 commit
    • [setup] correct transformers version format (#9176) · 84d5879e
      Stas Bekman authored
      setuptools has a pretty fixed expectation of version numbers.
      
      This PR fixes the dev version number and adds a comment listing the correct formats for future editors.
      
      The fix removes the following warning, which appears on `make fixup|style|etc` or whenever `setup.py` is run:
      ```
      setuptools/dist.py:452: UserWarning: Normalizing '4.2.0dev0' to '4.2.0.dev0'
        warnings.warn(tmpl.format(**locals()))
      ```
      and the alternative:
      ```
      /setuptools/dist.py:452: UserWarning: Normalizing '4.0.0-rc-1' to '4.0.0rc1'
      ```
      
      Fixes: #8749
      
      @LysandreJik, @sgugger
      84d5879e
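The two normalizations quoted in the warnings above follow the PEP 440 rules that setuptools applies. The sketch below reproduces them for a deliberately small subset of the format (release numbers, an optional `a`/`b`/`rc` pre-release, an optional `dev` release). `normalize_version` is a hypothetical helper written for this illustration; it is not what setuptools runs internally.

```python
import re

# Simplified subset of PEP 440: release, optional pre-release, optional dev release.
_VERSION_RE = re.compile(
    r"""
    ^(?P<release>\d+(?:\.\d+)*)                       # e.g. 4.2.0
    (?:[-_.]?(?P<pre_l>a|b|rc)[-_.]?(?P<pre_n>\d+))?  # e.g. -rc-1, rc1
    (?:[-_.]?dev[-_.]?(?P<dev_n>\d+))?                # e.g. dev0, .dev0
    $
    """,
    re.VERBOSE,
)


def normalize_version(version):
    """Return the canonical spelling of a version in the supported subset."""
    match = _VERSION_RE.match(version.strip().lower())
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    normalized = match.group("release")
    if match.group("pre_l"):
        # Canonical form has no separators around the pre-release tag.
        normalized += f"{match.group('pre_l')}{int(match.group('pre_n'))}"
    if match.group("dev_n") is not None:
        # Canonical form separates the dev segment with a dot.
        normalized += f".dev{int(match.group('dev_n'))}"
    return normalized


print(normalize_version("4.2.0dev0"))   # 4.2.0.dev0
print(normalize_version("4.0.0-rc-1"))  # 4.0.0rc1
print(normalize_version("4.2.0.dev0"))  # 4.2.0.dev0 (already canonical)
```

Writing the version in its canonical form from the start means the normalized string equals the input, so there is nothing for setuptools to warn about.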
  10. 17 Dec, 2020 3 commits
  11. 16 Dec, 2020 1 commit
    • [Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054) · 640e6fe1
      Patrick von Platen authored
      
      
      * save intermediate
      
      * save intermediate
      
      * save intermediate
      
      * correct flax bert model file
      
      * new module / model naming
      
      * make style
      
      * almost finish BERT
      
      * finish roberta
      
      * make fix-copies
      
      * delete keys file
      
      * last refactor
      
      * fixes in run_mlm_flax.py
      
      * remove pooled from run_mlm_flax.py
      
      * fix gelu | gelu_new
      
      * remove Module from inits
      
      * splits
      
      * dirty print
      
      * preventing warmup_steps == 0
      
      * smaller splits
      
      * make fix-copies
      
      * dirty print
      
      * dirty print
      
      * initial_evaluation argument
      
      * declaration order fix
      
      * proper model initialization/loading
      
      * proper initialization
      
      * run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug
      
      * removed tokenizers warning hack, fixed model re-initialization
      
      * reverted training_args.py changes
      
      * fix flax from pretrained
      
      * improve test in flax
      
      * apply Sylvain's tips
      
      * update init
      
      * make 0.3.0 compatible
      
      * revert Teven's changes
      
      * revert Teven's changes 2
      
      * finalize revert
      
      * fix bug
      
      * add docs
      
      * add pretrained to init
      
      * Update src/transformers/modeling_flax_utils.py
      
      * fix copies
      
      * final improvements
      Co-authored-by: TevenLeScao <teven.lescao@gmail.com>
      640e6fe1
  12. 15 Dec, 2020 1 commit
  13. 14 Dec, 2020 2 commits
  14. 07 Dec, 2020 1 commit
  15. 30 Nov, 2020 2 commits
  16. 27 Nov, 2020 1 commit
  17. 24 Nov, 2020 1 commit
  18. 19 Nov, 2020 1 commit
  19. 16 Nov, 2020 1 commit
  20. 15 Nov, 2020 1 commit
    • [breaking|pipelines|tokenizers] Adding slow-fast tokenizers equivalence tests... · f4e04cd2
      Thomas Wolf authored
      
      [breaking|pipelines|tokenizers] Adding slow-fast tokenizers equivalence tests pipelines - Removing sentencepiece as a required dependency (#8073)
      
      * Fixing roberta for slow-fast tests
      
      * WIP getting equivalence on pipelines
      
      * slow-to-fast equivalence - working on question-answering pipeline
      
      * optional FAISS tests
      
      * Pipeline Q&A
      
      * Move pipeline tests to their own test job again
      
      * update tokenizer to add sequence id methods
      
      * update to tokenizers 0.9.4
      
      * set sentencepiece as optional
      
      * clean up squad
      
      * clean up pipelines to use sequence_ids
      
      * style/quality
      
      * wording
      
      * Switch to use_fast = True by default
      
      * update tests for use_fast at True by default
      
      * fix rag tokenizer test
      
      * removing protobuf from required dependencies
      
      * fix NER test for use_fast = True by default
      
      * fixing example tests (Q&A examples use slow tokenizers for now)
      
      * protobuf in main deps extras["sentencepiece"] and example deps
      
      * fix protobuf install test
      
      * try to fix seq2seq by switching to slow tokenizers for now
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      f4e04cd2
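The commit above moves `sentencepiece` out of the required dependencies and places `protobuf` under `extras["sentencepiece"]`. A rough sketch of how such an optional group can sit next to the core requirements is shown below. The package lists and the `resolve_requirements` helper are assumptions made for illustration, not the contents of the real `setup.py`.

```python
# Illustrative only: the package lists are assumptions, not the real setup.py.
install_requires = ["tokenizers"]  # always installed

extras = {}
extras["sentencepiece"] = ["sentencepiece", "protobuf"]  # opt-in group


def resolve_requirements(*extra_names):
    """Return the core requirements plus those of the selected extras."""
    requirements = list(install_requires)
    for name in extra_names:
        requirements.extend(extras[name])
    return requirements


print(resolve_requirements())                 # core only
print(resolve_requirements("sentencepiece"))  # core + optional group
```

With a layout like this, a user who needs the optional group would request it explicitly at install time, for example with `pip install transformers[sentencepiece]`.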
  21. 13 Nov, 2020 1 commit
    • Model templates encoder only (#8509) · 826f0457
      Lysandre Debut authored
      
      
      * Model templates
      
      * TensorFlow
      
      * Remove pooler
      
      * CI
      
      * Tokenizer + Refactoring
      
      * Encoder-Decoder
      
      * Let's go testing
      
      * Encoder-Decoder in TF
      
      * Let's go testing in TF
      
      * Documentation
      
      * README
      
      * Fixes
      
      * Better names
      
      * Style
      
      * Update docs
      
      * Choose to skip either TF or PT
      
      * Code quality fixes
      
      * Add to testing suite
      
      * Update file path
      
      * Cookiecutter path
      
      * Update `transformers` path
      
      * Handle rebasing
      
      * Remove seq2seq from model templates
      
      * Remove s2s config
      
      * Apply Sylvain and Patrick comments
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Last fixes from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      826f0457
  22. 10 Nov, 2020 1 commit
  23. 09 Nov, 2020 1 commit
  24. 04 Nov, 2020 1 commit
  25. 27 Oct, 2020 3 commits
  26. 20 Oct, 2020 1 commit
  27. 19 Oct, 2020 1 commit
  28. 18 Oct, 2020 1 commit
    • [Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659) · ba8c4d0a
      Thomas Wolf authored
      * splitting fast and slow tokenizers [WIP]
      
      * [WIP] splitting sentencepiece and tokenizers dependencies
      
      * update dummy objects
      
      * add name_or_path to models and tokenizers
      
      * prefix added to file names
      
      * prefix
      
      * styling + quality
      
      * splitting all the tokenizer files - sorting sentencepiece based ones
      
      * update tokenizer version up to 0.9.0
      
      * remove hard dependency on sentencepiece 🎉
      
      * and removed hard dependency on tokenizers 🎉
      
      
      
      * update conversion script
      
      * update missing models
      
      * fixing tests
      
      * move test_tokenization_fast to main tokenization tests - fix bugs
      
      * bump up tokenizers
      
      * fix bert_generation
      
      * update and fix several tokenizers
      
      * keep sentencepiece in deps for now
      
      * fix funnel and deberta tests
      
      * fix fsmt
      
      * fix marian tests
      
      * fix layoutlm
      
      * fix squeezebert and gpt2
      
      * fix T5 tokenization
      
      * fix xlnet tests
      
      * style
      
      * fix mbart
      
      * bump up tokenizers to 0.9.2
      
      * fix model tests
      
      * fix tf models
      
      * fix seq2seq examples
      
      * fix tests without sentencepiece
      
      * fix slow => fast conversion without sentencepiece
      
      * update auto and bert generation tests
      
      * fix mbart tests
      
      * fix auto and common test without tokenizers
      
      * fix tests without tokenizers
      
      * clean up tests lighten up when tokenizers + sentencepiece are both off
      
      * style quality and tests fixing
      
      * add sentencepiece to doc/examples reqs
      
      * leave sentencepiece on for now
      
      * style quality split herbert and fix pegasus
      
      * WIP Herbert fast
      
      * add sample_text_no_unicode and fix herbert tokenization
      
      * skip FSMT example test for now
      
      * fix style
      
      * fix fsmt in example tests
      
      * update following Lysandre and Sylvain's comments
      
      * Update src/transformers/testing_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/testing_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      ba8c4d0a
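The "update dummy objects" step above refers to placeholders that stand in for classes whose optional backend is not installed. One way such a placeholder can work is sketched below. Every name here (`requires_backend`, `DummyTokenizer`, the backend string) is hypothetical and chosen for the illustration; the actual helpers in the repository may differ.

```python
import importlib.util


def requires_backend(obj, backend):
    """Raise a descriptive ImportError if `backend` cannot be found."""
    if importlib.util.find_spec(backend) is None:
        name = obj.__name__ if isinstance(obj, type) else type(obj).__name__
        raise ImportError(
            f"{name} requires the '{backend}' library, which was not found. "
            f"Install it to use this class."
        )


class DummyTokenizer:
    """Placeholder exported in place of a class whose backend is missing."""

    # Hypothetical backend name that should not exist on any system.
    backend = "surely_not_an_installed_backend_xyz"

    def __init__(self, *args, **kwargs):
        requires_backend(self, self.backend)


try:
    DummyTokenizer()
except ImportError as err:
    print(f"ImportError: {err}")
```

The benefit of this design is that importing the package always succeeds, and the user only sees an error, with an actionable message, when they actually try to use a class that needs the missing dependency.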
  29. 09 Oct, 2020 3 commits