1. 27 Nov, 2020 4 commits
  2. 25 Nov, 2020 3 commits
  3. 24 Nov, 2020 5 commits
    • New TF model inputs (#8602) · 29d49924
      Julien Plu authored
      * Apply on BERT and ALBERT
      
      * Update TF Bart
      
      * Add input processing to TF BART
      
      * Add input processing for TF CTRL
      
      * Add input processing to TF Distilbert
      
      * Add input processing to TF DPR
      
      * Add input processing to TF Electra
      
      * Add input processing for TF Flaubert
      
      * Add deprecated arguments
      
      * Add input processing to TF XLM
      
      * remove unused imports
      
      * Add input processing to TF Funnel
      
      * Add input processing to TF GPT2
      
      * Add input processing to TF Longformer
      
      * Add input processing to TF Lxmert
      
      * Apply style
      
      * Add input processing to TF Mobilebert
      
      * Add input processing to TF GPT
      
      * Add input processing to TF Roberta
      
      * Add input processing to TF T5
      
      * Add input processing to TF TransfoXL
      
      * Apply style
      
      * Rebase on master
      
      * Bug fix
      
      * Retry to bugfix
      
      * Retry bug fix
      
      * Fix wrong model name
      
      * Try another fix
      
      * Fix BART
      
* Fix input processing
      
      * Apply style
      
      * Put the deprecated warnings in the input processing function
      
      * Remove the unused imports
      
      * Raise an error when len(kwargs)>0
      
      * test ModelOutput instead of TFBaseModelOutput
      
      * Bug fix
      
      * Address Patrick's comments
      
      * Address Patrick's comments
      
      * Address Sylvain's comments
      
      * Add the new inputs in new Longformer models
      
      * Update the template with the new input processing
      
      * Remove useless assert
      
      * Apply style
      
      * Trigger CI
      29d49924
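      A minimal sketch of what the unified input processing enables, assuming a standard BERT checkpoint (names illustrative): TF models now accept their inputs as a dict, a positional tensor, or keyword arguments interchangeably.

        from transformers import BertTokenizer, TFBertModel

        tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
        model = TFBertModel.from_pretrained("bert-base-uncased")
        encoded = tokenizer("Hello world", return_tensors="tf")

        # All three call styles are normalized by the shared input processing:
        out_dict = model(encoded)                        # dict / BatchEncoding
        out_pos = model(encoded["input_ids"])            # positional tensor
        out_kw = model(input_ids=encoded["input_ids"])   # keyword arguments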
    • [core] implement support for run-time dependency version checking (#8645) · 82d443a7
      Stas Bekman authored
      
      
      * implement support for run-time dependency version checking
      
      * try not escaping !
      
      * use findall that works on py36
      
      * small tweaks
      
      * autoformatter worship
      
      * simplify
      
      * shorter names
      
      * add support for non-versioned checks
      
      * add deps
      
      * revert
      
      * tokenizers not required, check version only if installed
      
      * make a proper distutils cmd and add make target
      
      * tqdm must be checked before tokenizers
      
      * workaround the DistributionNotFound peculiar setup
      
      * handle the rest of packages in setup.py
      
      * fully sync setup.py's install_requires - to check them all
      
      * nit
      
      * make install_requires more readable
      
      * typo
      
      * Update setup.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * restyle
      
      * add types
      
      * simplify
      
      * simplify2
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      82d443a7
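      A minimal sketch of the helper this adds, assuming it is importable from transformers.utils.versions as introduced in the PR: it checks the installed version of a dependency against a pip-style requirement at run time.

        from transformers.utils.versions import require_version

        # Raises an error if the installed package does not satisfy the
        # requirement; per the commit messages, unversioned checks and
        # not-installed optional deps (e.g. tokenizers) are also handled.
        require_version("tqdm>=4.27")
        require_version("tokenizers")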
    • MT5 should have an autotokenizer (#8743) · e09e54fd
      Lysandre Debut authored
      * MT5 should have an autotokenizer
      
      * Different configurations should be able to point to same tokenizers
      e09e54fd
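      A one-line sketch of the fix (checkpoint name illustrative): AutoTokenizer can now resolve MT5 checkpoints even though MT5 reuses the T5 tokenizer class.

        from transformers import AutoTokenizer

        # Resolves to the T5 tokenizer, which the MT5 configuration points to.
        tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")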
    • Fix slow tests v2 (#8746) · 6fdd0bb2
      Lysandre Debut authored
      * Fix BART test
      
      * Fix MBART tests
      
      * Remove erroneous line from yaml
      
      * Update tests/test_modeling_bart.py
      
      * Quality
      6fdd0bb2
    • Support various BERT relative position embeddings (2nd) (#8276) · 2c83b3c3
      zhiheng-huang authored
      
      
      * Support BERT relative position embeddings
      
      * Fix typo in README.md
      
      * Address review comment
      
      * Fix failing tests
      
      * [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py
      
      * make fix copies
      
      * fix configs of electra and albert and fix longformer
      
      * remove copy statement from longformer
      
      * fix albert
      
      * fix electra
      
      * Add bert variants forward tests for various position embeddings
      
      * [tiny] Fix style for test_modeling_bert.py
      
      * improve docstring
      
      * [tiny] improve docstring and remove unnecessary dependency
      
      * [tiny] Remove unused import
      
      * re-add to ALBERT
      
      * make embeddings work for ALBERT
      
      * add test for albert
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      2c83b3c3
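      A minimal sketch of the new option, assuming the config argument landed as position_embedding_type with the variant names used in the PR:

        from transformers import BertConfig, BertModel

        # "absolute" stays the default; "relative_key" and
        # "relative_key_query" select the relative variants.
        config = BertConfig(position_embedding_type="relative_key_query")
        model = BertModel(config)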
  4. 23 Nov, 2020 7 commits
    • TF BERT test update · 7f2c0091
      LysandreJik authored
      7f2c0091
    • Update TF BERT test · e1b7e10d
      LysandreJik authored
      e1b7e10d
    • Add early stopping callback to pytorch trainer (#8581) · 8ffc01a7
      Colin Brochtrup authored
* Add early stopping patience and a minimum threshold the metric must improve by, to the pytorch trainer
      
      * Add early stopping test
      
      * Set patience counter to 0 if best metric not defined yet
      
      * Make early stopping a callback. Add callback event for updating the best metric for early stopping callback to trigger on.
      
      * Run make style
      
* make function name sensible
      
* Improve new argument docstring wording and hope that flaky CI test passes.
      
      * Use on_evaluation callback instead of custom. Remove some debug printing
      
      * Move early stopping arguments and state into early stopping callback
      
      * Run make style
      
      * Remove old code
      
      * Fix docs formatting. make style went rogue on me.
      
      * Remove copied attributes and fix variable
      
      * Add assertions on training arguments instead of mutating them. Move comment out of public docs.
      
      * Make separate test for early stopping callback. Add test of invalid arguments.
      
      * Run make style... I remembered before CI this time!
      
      * appease flake8
      
      * Add EarlyStoppingCallback to callback docs
      
* Make EarlyStoppingCallback docstring match other callbacks.
      
      * Fix typo in docs
      8ffc01a7
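      A minimal usage sketch (model and datasets are placeholders): the callback relies on load_best_model_at_end and metric_for_best_model so it knows what "improving" means.

        from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

        args = TrainingArguments(
            output_dir="out",
            evaluation_strategy="epoch",        # evaluate so the metric updates
            load_best_model_at_end=True,        # required by the callback
            metric_for_best_model="eval_loss",
            greater_is_better=False,
        )
        trainer = Trainer(
            model=model,                        # placeholder model
            args=args,
            train_dataset=train_ds,             # placeholder datasets
            eval_dataset=eval_ds,
            callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
        )
        trainer.train()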
    • consistent ignore keys + make private (#8737) · e84786aa
      Stas Bekman authored
      * consistent ignore keys + make private
      
      * style
      
      * - authorized_missing_keys    => _keys_to_ignore_on_load_missing
        - authorized_unexpected_keys => _keys_to_ignore_on_load_unexpected
      
      * move public doc of private attributes to private comment
      e84786aa
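      A minimal sketch of the renamed attributes on a model class (the class and regex patterns are illustrative, not from the PR):

        from transformers import BertModel, BertPreTrainedModel

        class MyBertHead(BertPreTrainedModel):
            # formerly authorized_missing_keys / authorized_unexpected_keys
            _keys_to_ignore_on_load_missing = [r"position_ids"]
            _keys_to_ignore_on_load_unexpected = [r"pooler"]

            def __init__(self, config):
                super().__init__(config)
                self.bert = BertModel(config)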
    • gpt2 and t5 parallel modeling (#8696) · 1cd9be2a
      alexorona authored
      
      
      * gpt2 and t5 parallel modeling
      
      * model_parallel utils update
      
      * adding missing model_parallel_utils
      
      Adds missing model_parallel_utils and reverses the changes to code in modeling_gpt2 and modeling_t5
      
      * training_args reformat
      
      Reformatted training_args
      
      * style formatting
      
      Style formatting doc string length on training_args and model_parallel_utils
      
      * style changes
      
      make style && make quality for training_args and model_parallel_utils.
      
      * adding tests
      
      * minor change in trainer
      
      reverts loss calculation
      
      * Update training_args.py
      
      * Update training_args.py
      
      added back docstring language for adam_beta1 and adam_beta2
      
      * Update trainer.py
      
      * Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Fix style & rebase
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
      1cd9be2a
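      A minimal usage sketch of the new API, assuming two CUDA devices (the layer split is illustrative; gpt2-xl has 48 blocks):

        from transformers import GPT2LMHeadModel

        model = GPT2LMHeadModel.from_pretrained("gpt2-xl")
        # Map each device index to the transformer blocks it should host.
        device_map = {0: list(range(0, 24)), 1: list(range(24, 48))}
        model.parallelize(device_map)
        # ... train or generate as usual ...
        model.deparallelize()  # move everything back onto one device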
    • Improve bert-japanese tokenizer handling (#8659) · 0cc5ab13
      Julien Chaumond authored
      
      
      * Make ci fail
      
      * Try to make tests actually run?
      
      * CI finally failing?
      
      * Fix CI
      
      * Revert "Fix CI"
      
      This reverts commit ca7923be7334d4e571b023478ebdd6b33dfd0ebb.
      
      * Ooops wrong one
      
      * one more try
      
      * Ok ok let's move this elsewhere
      
      * Alternative to globals() (#8667)
      
      * Alternative to globals()
      
      * Error is raised later so return None
      
* Sentencepiece not installed makes some tokenizers None
      
      * Apply Lysandre wisdom
      
      * Slightly clearer comment?
      
      cc @sgugger
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      0cc5ab13
  5. 20 Nov, 2020 1 commit
  6. 19 Nov, 2020 6 commits
  7. 18 Nov, 2020 2 commits
  8. 17 Nov, 2020 4 commits
  9. 16 Nov, 2020 3 commits
    • Switch `return_dict` to `True` by default. (#8530) · 1073a2bd
      Sylvain Gugger authored
      * Use the CI to identify failing tests
      
      * Remove from all examples and tests
      
      * More default switch
      
      * Fixes
      
      * More test fixes
      
      * More fixes
      
      * Last fixes hopefully
      
      * Use the CI to identify failing tests
      
      * Remove from all examples and tests
      
      * More default switch
      
      * Fixes
      
      * More test fixes
      
      * More fixes
      
      * Last fixes hopefully
      
      * Run on the real suite
      
      * Fix slow tests
      1073a2bd
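      A minimal sketch of the behaviour change (checkpoint illustrative): model outputs are now ModelOutput objects with named fields by default, with the old tuples still available on request.

        import torch
        from transformers import BertForSequenceClassification, BertTokenizer

        tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
        model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
        inputs = tokenizer("A short example", return_tensors="pt")

        outputs = model(**inputs, labels=torch.tensor([1]))
        loss, logits = outputs.loss, outputs.logits     # named access (new default)

        loss_t, logits_t = model(**inputs, labels=torch.tensor([1]),
                                 return_dict=False)     # legacy tuple output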
    • Fix GPT2DoubleHeadsModel to work with model.generate() (#6601) · afb50c66
      LSinev authored
      * Fix passing token_type_ids during GPT2DoubleHeadsModel.generate() if used
      
      and for GPT2LMHeadModel too
      
      * Update tests to check token_type_ids usage in GPT2 models
      afb50c66
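      A minimal sketch of the fixed behaviour (segment-id values illustrative): token_type_ids passed to generate() are now carried along correctly as the sequence grows.

        import torch
        from transformers import GPT2LMHeadModel, GPT2Tokenizer

        tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
        model = GPT2LMHeadModel.from_pretrained("gpt2")

        input_ids = tokenizer.encode("Hello, my dog", return_tensors="pt")
        token_type_ids = torch.zeros_like(input_ids)  # illustrative segment ids
        output = model.generate(input_ids, token_type_ids=token_type_ids,
                                max_length=20,
                                pad_token_id=tokenizer.eos_token_id)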
    • Adding the prepare_seq2seq_batch function to ProphetNet (#8515) · 04d8136b
      Yusuke Mori authored
      * Simply insert T5Tokenizer's prepare_seq2seq_batch
      
      * Update/Add some 'import'
      
* fix RuntimeError caused by '.view'
      
      * Moves .view related error avoidance from seq2seq_trainer to inside prophetnet
      
      * Update test_tokenization_prophetnet.py
      
      * Format the test code with black
      
      * Re-format the test code
      
      * Update test_tokenization_prophetnet.py
      
      * Add importing require_torch in the test code
      
      * Add importing BatchEncoding in the test code
      
      * Re-format the test code on Colab
      04d8136b
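      A minimal usage sketch (checkpoint and texts illustrative): the method tokenizes source and target texts in one call.

        from transformers import ProphetNetTokenizer

        tokenizer = ProphetNetTokenizer.from_pretrained(
            "microsoft/prophetnet-large-uncased"
        )
        batch = tokenizer.prepare_seq2seq_batch(
            src_texts=["An article to summarize."],
            tgt_texts=["A short summary."],
            return_tensors="pt",
        )  # BatchEncoding with input_ids, attention_mask and labels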
  10. 15 Nov, 2020 1 commit
    • [breaking|pipelines|tokenizers] Adding slow-fast tokenizers equivalence tests pipelines - Removing sentencepiece as a required dependency (#8073) · f4e04cd2
      Thomas Wolf authored
      
      * Fixing roberta for slow-fast tests
      
      * WIP getting equivalence on pipelines
      
      * slow-to-fast equivalence - working on question-answering pipeline
      
      * optional FAISS tests
      
      * Pipeline Q&A
      
      * Move pipeline tests to their own test job again
      
      * update tokenizer to add sequence id methods
      
      * update to tokenizers 0.9.4
      
* set sentencepiece as optional
      
      * clean up squad
      
      * clean up pipelines to use sequence_ids
      
      * style/quality
      
      * wording
      
      * Switch to use_fast = True by default
      
      * update tests for use_fast at True by default
      
      * fix rag tokenizer test
      
      * removing protobuf from required dependencies
      
      * fix NER test for use_fast = True by default
      
      * fixing example tests (Q&A examples use slow tokenizers for now)
      
      * protobuf in main deps extras["sentencepiece"] and example deps
      
* fix protobuf install test
      
      * try to fix seq2seq by switching to slow tokenizers for now
      
      * Update src/transformers/tokenization_utils_base.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Update src/transformers/tokenization_utils_base.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      f4e04cd2
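      A minimal sketch of the new default (checkpoint illustrative): AutoTokenizer now returns the fast (Rust) tokenizer when one exists, with the Python implementation still available via use_fast=False.

        from transformers import AutoTokenizer

        tok_fast = AutoTokenizer.from_pretrained("roberta-base")
        print(tok_fast.is_fast)  # True under the new default

        tok_slow = AutoTokenizer.from_pretrained("roberta-base", use_fast=False)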
  11. 13 Nov, 2020 2 commits
  12. 12 Nov, 2020 1 commit
  13. 11 Nov, 2020 1 commit