  1. 03 May, 2022 1 commit
    • Move test model folders (#17034) · 19420fd9
      Yih-Dar authored
      
      
      * move test model folders (TODO: fix imports and others)
      
      * fix (potentially partially) imports (in model test modules)
      
      * fix (potentially partially) imports (in tokenization test modules)
      
      * fix (potentially partially) imports (in feature extraction test modules)
      
      * fix import utils.test_modeling_tf_core
      
      * fix path ../fixtures/
      
      * fix imports about generation.test_generation_flax_utils
      
      * fix more imports
      
      * fix fixture path
      
      * fix get_test_dir
      
      * update module_to_test_file
      
      * fix get_tests_dir from wrong transformers.utils
      
      * update config.yml (CircleCI)
      
      * fix style
      
      * remove missing imports
      
      * update new model script
      
      * update check_repo
      
      * update SPECIAL_MODULE_TO_TEST_MAP
      
      * fix style
      
      * add __init__
      
      * update self-scheduled
      
      * fix add_new_model scripts
      
      * check one way to get location back
      
      * python setup.py build install
      
      * fix import in test auto
      
      * update self-scheduled.yml
      
      * update slack notification script
      
      * Add comments about artifact names
      
      * fix for yolos
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
  2. 23 Feb, 2022 1 commit
  3. 18 Oct, 2021 1 commit
  4. 14 Jun, 2021 1 commit
  5. 31 Mar, 2021 1 commit
  6. 17 Nov, 2020 1 commit
    • Reorganize repo (#8580) · c89bdfbe
      Sylvain Gugger authored
      * Put models in subfolders
      
      * Styling
      
      * Fix imports in tests
      
      * More fixes in test imports
      
      * Sneaky hidden imports
      
      * Fix imports in doc files
      
      * More sneaky imports
      
      * Finish fixing tests
      
      * Fix examples
      
      * Fix path for copies
      
      * More fixes for examples
      
      * Fix dummy files
      
      * More fixes for example
      
      * More model import fixes
      
      * Is this why you're unhappy GitHub?
      
      * Fix imports in convert command
  7. 18 Sep, 2020 1 commit
    • Add new pre-trained models BERTweet and PhoBERT (#6129) · af2322c7
      Dat Quoc Nguyen authored
      * Add BERTweet and PhoBERT models
      
      * Update modeling_auto.py
      
      Re-add `bart` to LM_MAPPING
      
      * Update tokenization_auto.py
      
      Re-add `from .configuration_mobilebert import MobileBertConfig`
      not sure why it's replaced by `from transformers.configuration_mobilebert import MobileBertConfig`
      
      * Add BERTweet and PhoBERT to pretrained_models.rst
      
      * Update tokenization_auto.py
      
      Remove BertweetTokenizer and PhobertTokenizer from tokenization_auto.py (they are currently not supported by AutoTokenizer).
      
      * Update BertweetTokenizer - without nltk
      
      * Update model card for BERTweet
      
      * PhoBERT - with Auto mode - without import fastBPE
      
      * PhoBERT - with Auto mode - without import fastBPE
      
      * BERTweet - with Auto mode - without import fastBPE
      
      * Add PhoBERT and BERTweet to TF modeling auto
      
      * Improve Docstrings for PhobertTokenizer and BertweetTokenizer
      
      * Update PhoBERT and BERTweet model cards
      
      * Fixed a merge conflict in tokenization_auto
      
      * Used black to reformat BERTweet- and PhoBERT-related files
      
      * Used isort to reformat BERTweet- and PhoBERT-related files
      
      * Reformatted BERTweet- and PhoBERT-related files based on flake8
      
      * Updated test files
      
      * Updated test files
      
      * Updated tf test files
      
      * Updated tf test files
      
      * Updated tf test files
      
      * Updated tf test files
      
      * Update commits from huggingface
      
      * Delete unnecessary files
      
      * Add tokenizers to auto and init files
      
      * Add test files for tokenizers
      
      * Revised model cards
      
      * Update save_vocabulary function in BertweetTokenizer and PhobertTokenizer and test files
      
      * Revised test files
      
      * Update orders of Phobert and Bertweet tokenizers in auto tokenization file
  8. 15 Jun, 2020 1 commit
    • [HUGE] Refactoring tokenizers backend - padding - truncation - pre-tokenized pipeline - fast tokenizers - tests (#4510) · 36434220
      Anthony MOI authored
      * Use tokenizers pre-tokenized pipeline
      
      * failing pretokenized test
      
      * Fix is_pretokenized in python
      
      * add pretokenized tests
      
      * style and quality
      
      * better tests for batched pretokenized inputs
      
      * tokenizers clean up - new padding_strategy - split the files
      
      * [HUGE] refactoring tokenizers - padding - truncation - tests
      
      * style and quality
      
      * bump up required tokenizers version to 0.8.0-rc1
      
      * switched padding/truncation API - simpler better backward compat
      
      * updating tests for custom tokenizers
      
      * style and quality - tests on pad
      
      * fix QA pipeline
      
      * fix backward compatibility for max_length only
      
      * style and quality
      
      * Various cleanups - add verbose
      
      * fix tests
      
      * update docstrings
      
      * Fix tests
      
      * Docs reformatted
      
      * __call__ method documented
      Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
  9. 15 Jan, 2020 1 commit
  10. 06 Jan, 2020 2 commits
  11. 22 Dec, 2019 8 commits
  12. 21 Dec, 2019 1 commit
    • Reformat source code with black. · fa84ae26
      Aymeric Augustin authored
      This is the result of:
      
          $ black --line-length 119 examples templates transformers utils hubconf.py setup.py
      
      There are a lot of fairly long lines in the project. As a consequence, I'm
      picking the longest widely accepted line length, 119 characters.
      
      This is also Thomas' preference, because it allows for explicit variable
      names, to make the code easier to understand.
  13. 08 Oct, 2019 1 commit
  14. 04 Oct, 2019 1 commit
    • Adding CTRL (squashed commit) · dbed1c5d
      keskarnitish authored
      adding conversion script
      
      adding first draft of modeling & tokenization
      
      adding placeholder for test files
      
      bunch of changes
      
      registering the tokenizer/model/etc
      
      tests
      
      change link; something is very VERY wrong here
      
      weird end-of-word thingy going on
      
      i think the tokenization works now ; wrote the unit tests
      
      overall structure works;load w next
      
      the monster is alive!
      
      works after some cleanup as well
      
      adding emacs autosave to gitignore
      
      currently only supporting the 48 layer one; seems to infer fine on my macbook
      
      cleanup
      
      fixing some documentation
      
      fixing some documentation
      
      tests passing?
      
      now works on CUDA also
      
      adding greedy?
      
      adding greedy sampling
      
      works well
  15. 26 Sep, 2019 2 commits
  16. 30 Aug, 2019 5 commits
  17. 05 Aug, 2019 1 commit
  18. 15 Jul, 2019 1 commit
  19. 09 Jul, 2019 2 commits
  20. 05 Jul, 2019 3 commits
  21. 02 Jul, 2019 1 commit
  22. 17 Apr, 2019 3 commits