1. 18 Oct, 2023 1 commit
    • Arthur's avatar
      [`Tokenizer`] Fix slow and fast serialization (#26570) · ef7e9369
      Arthur authored
      * fix
      
      * last attempt
      
      * current work
      
      * fix forward compatibility
      
      * save all special tokens
      
      * current state
      
      * revert additional changes
      
      * updates
      
      * remove tokenizer.model
      
      * add a test and the fix
      
      * nit
      
      * revert one more break
      
      * fix typefield issue
      
      * quality
      
      * more tests
      
      * fix fields for FC
      
      * more nits?
      
      * new additional changes
      
      * how
      
      * some updates
      
      * simplify all
      
      * more nits
      
      * revert some things to original
      
      * nice
      
      * nits
      
      * a small hack
      
      * more nits
      
      * ahhaha
      
      * fixup
      
      * update
      
      * make test run on ci
      
      * use subtesting
      
      * update
      
      * Update .circleci/create_circleci_config.py
      
      * updates
      
      * fixup
      
      * nits
      
      * replace typo
      
      * fix the test
      
      * nits
      
      * update
      
      * None max dif pls
      
      * a partial fix
      
      * had to revert one thing
      
      * test the fast
      
      * updates
      
      * fixup
      
      * and more nits
      
      * more fixes
      
      * update
      
      * Oupsy 馃憗
      
      
      
      * nits
      
      * fix marian
      
      * on our way to heaven
      
      * Update src/transformers/models/t5/tokenization_t5.py
      Co-authored-by: default avatarLysandre Debut <hi@lysand.re>
      
      * fixup
      
      * Update src/transformers/tokenization_utils_fast.py
      Co-authored-by: default avatarLeo Tronchon <leo.tronchon@gmail.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: default avatarLeo Tronchon <leo.tronchon@gmail.com>
      
      * fix phobert
      
      * skip some things, test more
      
      * nits
      
      * fixup
      
      * fix deberta
      
      * update
      
      * update
      
      * more updates
      
      * skip one test
      
      * more updates
      
      * fix camembert
      
      * can't test this one
      
      * more good fixes
      
      * kind of a major update
      
      - seperate what is only done in fast in fast init and refactor
      - add_token(AddedToken(..., speicla = True)) ignores it in fast
      - better loading
      
      * fixup
      
      * more fixups
      
      * fix pegasus and mpnet
      
      * remove skipped tests
      
      * fix phoneme tokenizer if self.verbose
      
      * fix individual models
      
      * update common tests
      
      * update testing files
      
      * all over again
      
      * nits
      
      * skip test for markup lm
      
      * fixups
      
      * fix order of addition in fast by sorting the added tokens decoder
      
      * proper defaults for deberta
      
      * correct default for fnet
      
      * nits on add tokens, string initialized to special if special
      
      * skip irrelevant herbert tests
      
      * main fixes
      
      * update test added_tokens_serialization
      
      * the fix for bart like models and class instanciating
      
      * update bart
      
      * nit!
      
      * update idefix test
      
      * fix whisper!
      
      * some fixup
      
      * fixups
      
      * revert some of the wrong chanegs
      
      * fixup
      
      * fixup
      
      * skip marian
      
      * skip the correct tests
      
      * skip for tf and flax as well
      
      ---------
      Co-authored-by: default avatarLysandre Debut <hi@lysand.re>
      Co-authored-by: default avatarLeo Tronchon <leo.tronchon@gmail.com>
      ef7e9369
  2. 06 Feb, 2023 1 commit
    • Sylvain Gugger's avatar
      Update quality tooling for formatting (#21480) · 6f79d264
      Sylvain Gugger authored
      * Result of black 23.1
      
      * Update target to Python 3.7
      
      * Switch flake8 to ruff
      
      * Configure isort
      
      * Configure isort
      
      * Apply isort with line limit
      
      * Put the right black version
      
      * adapt black in check copies
      
      * Fix copies
      6f79d264
  3. 03 May, 2022 1 commit
    • Yih-Dar's avatar
      Move test model folders (#17034) · 19420fd9
      Yih-Dar authored
      
      
      * move test model folders (TODO: fix imports and others)
      
      * fix (potentially partially) imports (in model test modules)
      
      * fix (potentially partially) imports (in tokenization test modules)
      
      * fix (potentially partially) imports (in feature extraction test modules)
      
      * fix import utils.test_modeling_tf_core
      
      * fix path ../fixtures/
      
      * fix imports about generation.test_generation_flax_utils
      
      * fix more imports
      
      * fix fixture path
      
      * fix get_test_dir
      
      * update module_to_test_file
      
      * fix get_tests_dir from wrong transformers.utils
      
      * update config.yml (CircleCI)
      
      * fix style
      
      * remove missing imports
      
      * update new model script
      
      * update check_repo
      
      * update SPECIAL_MODULE_TO_TEST_MAP
      
      * fix style
      
      * add __init__
      
      * update self-scheduled
      
      * fix add_new_model scripts
      
      * check one way to get location back
      
      * python setup.py build install
      
      * fix import in test auto
      
      * update self-scheduled.yml
      
      * update slack notification script
      
      * Add comments about artifact names
      
      * fix for yolos
      Co-authored-by: default avatarydshieh <ydshieh@users.noreply.github.com>
      19420fd9