1. 13 Mar, 2024 1 commit
  2. 06 Feb, 2023 1 commit
    • Update quality tooling for formatting (#21480) · 6f79d264
      Sylvain Gugger authored
      * Result of black 23.1
      
      * Update target to Python 3.7
      
      * Switch flake8 to ruff
      
      * Configure isort
      
      * Configure isort
      
      * Apply isort with line limit
      
      * Put the right black version
      
      * adapt black in check copies
      
      * Fix copies
  3. 03 May, 2022 1 commit
    • Move test model folders (#17034) · 19420fd9
      Yih-Dar authored
      
      
      * move test model folders (TODO: fix imports and others)
      
      * fix (potentially partially) imports (in model test modules)
      
      * fix (potentially partially) imports (in tokenization test modules)
      
      * fix (potentially partially) imports (in feature extraction test modules)
      
      * fix import utils.test_modeling_tf_core
      
      * fix path ../fixtures/
      
      * fix imports about generation.test_generation_flax_utils
      
      * fix more imports
      
      * fix fixture path
      
      * fix get_test_dir
      
      * update module_to_test_file
      
      * fix get_tests_dir from wrong transformers.utils
      
      * update config.yml (CircleCI)
      
      * fix style
      
      * remove missing imports
      
      * update new model script
      
      * update check_repo
      
      * update SPECIAL_MODULE_TO_TEST_MAP
      
      * fix style
      
      * add __init__
      
      * update self-scheduled
      
      * fix add_new_model scripts
      
      * check one way to get location back
      
      * python setup.py build install
      
      * fix import in test auto
      
      * update self-scheduled.yml
      
      * update slack notification script
      
      * Add comments about artifact names
      
      * fix for yolos
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
  4. 23 Feb, 2022 1 commit
  5. 18 Oct, 2020 1 commit
    • [Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659) · ba8c4d0a
      Thomas Wolf authored
      * splitting fast and slow tokenizers [WIP]
      
      * [WIP] splitting sentencepiece and tokenizers dependencies
      
      * update dummy objects
      
      * add name_or_path to models and tokenizers
      
      * prefix added to file names
      
      * prefix
      
      * styling + quality
      
      * splitting all the tokenizer files - sorting sentencepiece based ones
      
      * update tokenizer version up to 0.9.0
      
      * remove hard dependency on sentencepiece 🎉
      
      * and removed hard dependency on tokenizers 🎉
      
      
      
      * update conversion script
      
      * update missing models
      
      * fixing tests
      
      * move test_tokenization_fast to main tokenization tests - fix bugs
      
      * bump up tokenizers
      
      * fix bert_generation
      
      * update and fix several tokenizers
      
      * keep sentencepiece in deps for now
      
      * fix funnel and deberta tests
      
      * fix fsmt
      
      * fix marian tests
      
      * fix layoutlm
      
      * fix squeezebert and gpt2
      
      * fix T5 tokenization
      
      * fix xlnet tests
      
      * style
      
      * fix mbart
      
      * bump up tokenizers to 0.9.2
      
      * fix model tests
      
      * fix tf models
      
      * fix seq2seq examples
      
      * fix tests without sentencepiece
      
      * fix slow => fast conversion without sentencepiece
      
      * update auto and bert generation tests
      
      * fix mbart tests
      
      * fix auto and common test without tokenizers
      
      * fix tests without tokenizers
      
      * clean up tests lighten up when tokenizers + sentencepiece are both off
      
      * style quality and tests fixing
      
      * add sentencepiece to doc/examples reqs
      
      * leave sentencepiece on for now
      
      * style quality split hebert and fix pegasus
      
      * WIP Herbert fast
      
      * add sample_text_no_unicode and fix hebert tokenization
      
      * skip FSMT example test for now
      
      * fix style
      
      * fix fsmt in example tests
      
      * update following Lysandre and Sylvain's comments
      
      * Update src/transformers/testing_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/testing_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
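
      The "optional dependency" and "dummy objects" items in this commit can be
      illustrated with a minimal sketch. It is not the code from the commit: the
      helper names (is_package_available, make_dummy, optional_class) and the
      error wording are assumptions made for this example. The idea is that a
      class backed by a missing package is replaced by a placeholder that only
      fails when someone tries to instantiate it.

```python
import importlib.util


def is_package_available(name):
    """Return True if a top-level package can be imported (hypothetical helper)."""
    return importlib.util.find_spec(name) is not None


def make_dummy(class_name, package):
    """Build a placeholder class that raises ImportError on instantiation."""

    def __init__(self, *args, **kwargs):
        raise ImportError(
            f"{class_name} requires the {package} library, which is not installed."
        )

    return type(class_name, (), {"__init__": __init__})


def optional_class(class_name, package, factory):
    """Return factory() when the package is present, otherwise a dummy class.

    `factory` is a zero-argument callable so that the real import only
    happens when the package is known to be available.
    """
    if is_package_available(package):
        return factory()
    return make_dummy(class_name, package)
```

      With this arrangement, importing the library never fails because of a
      missing optional package; the ImportError surfaces only at the point of use.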
  6. 05 Oct, 2020 1 commit
    • SqueezeBERT architecture (#7083) · 02ef825b
      Forrest Iandola authored
      * configuration_squeezebert.py
      
      thin wrapper around bert tokenizer
      
      fix typos
      
      wip sb model code
      
      wip modeling_squeezebert.py. Next step is to get the multi-layer-output interface working
      
      set up squeezebert to use BertModelOutput when returning results.
      
      squeezebert documentation
      
      formatting
      
      allow head mask that is an array of [None, ..., None]
      
      docs
      
      docs cont'd
      
      path to vocab
      
      docs and pointers to cloud files (WIP)
      
      line length and indentation
      
      squeezebert model cards
      
      formatting of model cards
      
      untrack modeling_squeezebert_scratchpad.py
      
      update aws paths to vocab and config files
      
      get rid of stub of NSP code, and advise users to pretrain with mlm only
      
      fix rebase issues
      
      redo rebase of modeling_auto.py
      
      fix issues with code formatting
      
      more code format auto-fixes
      
      move squeezebert before bert in tokenization_auto.py and modeling_auto.py because squeezebert inherits from bert
      
      tests for squeezebert modeling and tokenization
      
      fix typo
      
      move squeezebert before bert in modeling_auto.py to fix inheritance problem
      
      disable test_head_masking, since squeezebert doesn't yet implement head masking
      
      fix issues exposed by the test_modeling_squeezebert.py
      
      fix an issue exposed by test_tokenization_squeezebert.py
      
      fix issue exposed by test_modeling_squeezebert.py
      
      auto generated code style improvement
      
      issue that we inherited from modeling_xxx.py: SqueezeBertForMaskedLM.forward() calls self.cls(), but there is no self.cls, and I think the goal was actually to call self.lm_head()
      
      update copyright
      
      resolve failing 'test_hidden_states_output' and remove unused encoder_hidden_states and encoder_attention_mask
      
      docs
      
      add integration test. rename squeezebert-mnli --> squeezebert/squeezebert-mnli
      
      autogenerated formatting tweaks
      
      integrate feedback from patrickvonplaten and sgugger to programming style and documentation strings
      
      * tiny change to order of imports
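
      The two "move squeezebert before bert" items above describe an ordering
      problem that a short sketch can make concrete. The classes below are
      stand-ins invented for this example (ParentConfig, ChildConfig and the two
      model classes are hypothetical), and the lookup is assumed to walk an
      ordered mapping and return the first entry whose class matches via
      isinstance. Under that assumption, a subclass listed after its parent is
      never reached.

```python
from collections import OrderedDict


class ParentConfig:
    """Stand-in for the base configuration class."""


class ChildConfig(ParentConfig):
    """Stand-in for a configuration that inherits from the base one."""


class ParentModel:
    """Stand-in model paired with ParentConfig."""


class ChildModel:
    """Stand-in model paired with ChildConfig."""


# Parent first: every ChildConfig instance also matches ParentConfig.
PARENT_FIRST = OrderedDict([(ParentConfig, ParentModel), (ChildConfig, ChildModel)])

# Child first: the more specific class gets a chance to match.
CHILD_FIRST = OrderedDict([(ChildConfig, ChildModel), (ParentConfig, ParentModel)])


def resolve(config, mapping):
    """Return the model class of the first mapping entry matching `config`."""
    for config_class, model_class in mapping.items():
        if isinstance(config, config_class):
            return model_class
    raise ValueError(f"Unrecognized configuration class: {type(config).__name__}")
```

      Here resolve(ChildConfig(), PARENT_FIRST) yields ParentModel, while
      resolve(ChildConfig(), CHILD_FIRST) yields ChildModel, which is why the
      subclass entry has to come first in such a mapping.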
  7. 01 Jul, 2020 1 commit
  8. 19 May, 2020 1 commit
  9. 08 Apr, 2020 1 commit
  10. 06 Jan, 2020 2 commits
  11. 22 Dec, 2019 7 commits
  12. 21 Dec, 2019 1 commit
    • Reformat source code with black. · fa84ae26
      Aymeric Augustin authored
      This is the result of:
      
          $ black --line-length 119 examples templates transformers utils hubconf.py setup.py
      
      There's a lot of fairly long lines in the project. As a consequence, I'm
      picking the longest widely accepted line length, 119 characters.
      
      This is also Thomas' preference, because it allows for explicit variable
      names, to make the code easier to understand.
  13. 06 Dec, 2019 1 commit
    • Remove dependency on pytest for running tests (#2055) · 35401fe5
      Aymeric Augustin authored
      * Switch to plain unittest for skipping slow tests.
      
      Add a RUN_SLOW environment variable for running them.
      
      * Switch to plain unittest for PyTorch dependency.
      
      * Switch to plain unittest for TensorFlow dependency.
      
      * Avoid leaking open files in the test suite.
      
      This prevents spurious warnings when running tests.
      
      * Fix unicode warning on Python 2 when running tests.
      
      The warning was:
      
          UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
      
      * Support running PyTorch tests on a GPU.
      
      Reverts 27e015bd.
      
      * Tests no longer require pytest.
      
      * Make tests pass on cuda
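
      The first item of this commit (plain unittest skipping driven by a RUN_SLOW
      environment variable) might look roughly like the sketch below. Only the
      RUN_SLOW variable name comes from the commit message; the env_flag helper,
      the accepted truthy values, and the skip message are assumptions made for
      illustration.

```python
import os
import unittest


def env_flag(key, default=False):
    """Interpret an environment variable as a boolean (hypothetical helper)."""
    value = os.environ.get(key)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "y", "on"}


def slow(test_case):
    """Skip the decorated test unless RUN_SLOW is set to a truthy value.

    The environment is read when the decorator is applied, i.e. when the
    test module is imported.
    """
    if not env_flag("RUN_SLOW"):
        return unittest.skip("test is slow; set RUN_SLOW=1 to run it")(test_case)
    return test_case


class ExampleTest(unittest.TestCase):
    def test_fast(self):
        self.assertEqual(1 + 1, 2)

    @slow
    def test_slow(self):
        self.assertEqual(sum(range(10)), 45)
```

      Because only the standard library is involved, such tests can be run with
      python -m unittest, and slow ones are enabled by exporting RUN_SLOW=1 first.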
  14. 04 Nov, 2019 1 commit
  15. 22 Oct, 2019 1 commit
  16. 04 Oct, 2019 1 commit
  17. 26 Sep, 2019 1 commit
  18. 24 Sep, 2019 1 commit
  19. 19 Sep, 2019 2 commits
  20. 30 Aug, 2019 1 commit
  21. 28 Aug, 2019 2 commits