"tests/models/xlm/test_tokenization_xlm.py" did not exist on "32dbb2d954d646f3307f66c889c6c418a40acf88"
  1. 13 Jan, 2021 1 commit
  2. 17 Dec, 2020 2 commits
  3. 15 Dec, 2020 1 commit
    • [WIP] Tapas v4 (tres) (#9117) · 1551e2dc
      NielsRogge authored
      
      
      * First commit: adding all files from tapas_v3
      
      * Fix multiple bugs including soft dependency and new structure of the library
      
      * Improve testing by adding torch_device to inputs and adding dependency on scatter
      
      * Use Python 3 inheritance rather than Python 2
      
      * First draft model cards of base sized models
      
      * Remove model cards as they are already on the hub
      
      * Fix multiple bugs with integration tests
      
      * All model integration tests pass
      
      * Remove print statement
      
      * Add test for convert_logits_to_predictions method of TapasTokenizer
      
      * Incorporate suggestions by Google authors
      
      * Fix remaining tests
      
      * Change position embeddings sizes to 512 instead of 1024
      
      * Comment out positional embedding sizes
      
      * Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES
      
      * Added more model names
      
      * Fix truncation when no max length is specified
      
      * Disable torchscript test
      
      * Make style & make quality
      
      * Quality
      
      * Address CI needs
      
      * Test the Masked LM model
      
      * Fix the masked LM model
      
      * Truncate when overflowing
      
      * More much needed docs improvements
      
      * Fix some URLs
      
      * Some more docs improvements
      
      * Test PyTorch scatter
      
      * Set to slow + minify
      
      * Calm flake8 down
      
      * First commit: adding all files from tapas_v3
      
      * Fix multiple bugs including soft dependency and new structure of the library
      
      * Improve testing by adding torch_device to inputs and adding dependency on scatter
      
      * Use Python 3 inheritance rather than Python 2
      
      * First draft model cards of base sized models
      
      * Remove model cards as they are already on the hub
      
      * Fix multiple bugs with integration tests
      
      * All model integration tests pass
      
      * Remove print statement
      
      * Add test for convert_logits_to_predictions method of TapasTokenizer
      
      * Incorporate suggestions by Google authors
      
      * Fix remaining tests
      
      * Change position embeddings sizes to 512 instead of 1024
      
      * Comment out positional embedding sizes
      
      * Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES
      
      * Added more model names
      
      * Fix truncation when no max length is specified
      
      * Disable torchscript test
      
      * Make style & make quality
      
      * Quality
      
      * Address CI needs
      
      * Test the Masked LM model
      
      * Fix the masked LM model
      
      * Truncate when overflowing
      
      * More much needed docs improvements
      
      * Fix some URLs
      
      * Some more docs improvements
      
      * Add add_pooling_layer argument to TapasModel
      
      Fix comments by @sgugger and @patrickvonplaten
      
      * Fix issue in docs + fix style and quality
      
      * Clean up conversion script and add task parameter to TapasConfig
      
      * Revert the task parameter of TapasConfig
      
      Some minor fixes
      
      * Improve conversion script and add test for absolute position embeddings
      
      * Improve conversion script and add test for absolute position embeddings
      
      * Fix bug with reset_position_index_per_cell arg of the conversion cli
      
      * Add notebooks to the examples directory and fix style and quality
      
      * Apply suggestions from code review
      
      * Move from `nielsr/` to `google/` namespace
      
      * Apply Sylvain's comments
      Co-authored-by: sgugger <sylvain.gugger@gmail.com>
      Co-authored-by: Rogge Niels <niels.rogge@howest.be>
      Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: sgugger <sylvain.gugger@gmail.com>
  4. 11 Dec, 2020 2 commits
  5. 09 Dec, 2020 1 commit
    • [wip] [ci] doc-job-skip take #4 dry-run (#8980) · 5e637e6c
      Stas Bekman authored
      * ci-doc-job-skip-take-4
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * skip yaml
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * ready to test
      
      * yet another way
      
      * trying with HEAD
      
      * trying with head.sha
      
      * trying with head.sha fix
      
      * trying with head.sha fix wip
      
      * undo
      
      * try to switch to sha
      
      * current branch
      
      * current branch
      
      * PR number check
      
      * joy ride
      
      * joy ride
      
      * joy ride
      
      * joy ride
      
      * joy ride
      
      * joy ride
      
      * joy ride
      
      * joy ride
      
      * joy ride
      
      * joy ride
      
      * joy ride
      
      * joy ride
  6. 08 Dec, 2020 1 commit
  7. 07 Dec, 2020 2 commits
  8. 04 Dec, 2020 1 commit
  9. 02 Dec, 2020 1 commit
  10. 01 Dec, 2020 2 commits
  11. 30 Nov, 2020 2 commits
  12. 29 Nov, 2020 1 commit
    • [CI] implement job skipping for doc-only PRs (#8826) · c239dcda
      Stas Bekman authored
      * implement job skipping for doc-only PRs
      
      * silent grep is crucial
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * wip
      
      * let's add doc
      
      * let's add code
      
      * revert test commits
      
      * restore
      
      * Better name
      
      * Better name
      
      * Better name
      
      * some more testing
      
      * some more testing
      
      * some more testing
      
      * finish testing
  13. 23 Nov, 2020 1 commit
    • Improve bert-japanese tokenizer handling (#8659) · 0cc5ab13
      Julien Chaumond authored
      
      
      * Make ci fail
      
      * Try to make tests actually run?
      
      * CI finally failing?
      
      * Fix CI
      
      * Revert "Fix CI"
      
      This reverts commit ca7923be7334d4e571b023478ebdd6b33dfd0ebb.
      
      * Ooops wrong one
      
      * one more try
      
      * Ok ok let's move this elsewhere
      
      * Alternative to globals() (#8667)
      
      * Alternative to globals()
      
      * Error is raised later so return None
      
      * Sentencepiece not installed make some tokenizers None
      
      * Apply Lysandre wisdom
      
      * Slightly clearer comment?
      
      cc @sgugger
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  14. 19 Nov, 2020 1 commit
  15. 13 Nov, 2020 1 commit
  16. 11 Nov, 2020 2 commits
  17. 10 Nov, 2020 1 commit
  18. 04 Nov, 2020 2 commits
  19. 03 Nov, 2020 2 commits
  20. 29 Oct, 2020 1 commit
    • Add a template for examples and apply it for mlm and plm examples (#8153) · 69117628
      Sylvain Gugger authored
      * Add a template for example scripts and apply it to mlm
      
      * Formatting
      
      * Fix test
      
      * Add plm script
      
      * Add a template for example scripts and apply it to mlm
      
      * Formatting
      
      * Fix test
      
      * Add plm script
      
      * Add a template for example scripts and apply it to mlm
      
      * Formatting
      
      * Fix test
      
      * Add plm script
      
      * Styling
  21. 28 Oct, 2020 1 commit
  22. 27 Oct, 2020 2 commits
  23. 26 Oct, 2020 1 commit
    • Doc styling (#8067) · 08f534d2
      Sylvain Gugger authored
      * Important files
      
      * Styling them all
      
      * Revert "Styling them all"
      
      This reverts commit 7d029395fdae8513b8281cbc2a6c239f8093503e.
      
      * Styling them for realsies
      
      * Fix syntax error
      
      * Fix benchmark_utils
      
      * More fixes
      
      * Fix modeling auto and script
      
      * Remove new line
      
      * Fixes
      
      * More fixes
      
      * Fix more files
      
      * Style
      
      * Add FSMT
      
      * More fixes
      
      * More fixes
      
      * More fixes
      
      * More fixes
      
      * Fixes
      
      * More fixes
      
      * More fixes
      
      * Last fixes
      
      * Make sphinx happy
  24. 23 Oct, 2020 1 commit
  25. 22 Oct, 2020 1 commit
  26. 20 Oct, 2020 2 commits
  27. 19 Oct, 2020 2 commits
  28. 18 Oct, 2020 1 commit
    • [Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659) · ba8c4d0a
      Thomas Wolf authored
      * splitting fast and slow tokenizers [WIP]
      
      * [WIP] splitting sentencepiece and tokenizers dependencies
      
      * update dummy objects
      
      * add name_or_path to models and tokenizers
      
      * prefix added to file names
      
      * prefix
      
      * styling + quality
      
      * splitting all the tokenizer files - sorting sentencepiece based ones
      
      * update tokenizer version up to 0.9.0
      
      * remove hard dependency on sentencepiece 🎉
      
      * and removed hard dependency on tokenizers 🎉
      
      
      
      * update conversion script
      
      * update missing models
      
      * fixing tests
      
      * move test_tokenization_fast to main tokenization tests - fix bugs
      
      * bump up tokenizers
      
      * fix bert_generation
      
      * update and fix several tokenizers
      
      * keep sentencepiece in deps for now
      
      * fix funnel and deberta tests
      
      * fix fsmt
      
      * fix marian tests
      
      * fix layoutlm
      
      * fix squeezebert and gpt2
      
      * fix T5 tokenization
      
      * fix xlnet tests
      
      * style
      
      * fix mbart
      
      * bump up tokenizers to 0.9.2
      
      * fix model tests
      
      * fix tf models
      
      * fix seq2seq examples
      
      * fix tests without sentencepiece
      
      * fix slow => fast  conversion without sentencepiece
      
      * update auto and bert generation tests
      
      * fix mbart tests
      
      * fix auto and common test without tokenizers
      
      * fix tests without tokenizers
      
      * clean up tests lighten up when tokenizers + sentencepiece are both off
      
      * style quality and tests fixing
      
      * add sentencepiece to doc/examples reqs
      
      * leave sentencepiece on for now
      
      * style quality split hebert and fix pegasus
      
      * WIP Herbert fast
      
      * add sample_text_no_unicode and fix hebert tokenization
      
      * skip FSMT example test for now
      
      * fix style
      
      * fix fsmt in example tests
      
      * update following Lysandre and Sylvain's comments
      
      * Update src/transformers/testing_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/testing_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  29. 05 Oct, 2020 1 commit