    fix the `tokenizer_config.json` file for the slow tokenizer when a fast... · 7b8bdd86
    SaulLu authored
    fix the `tokenizer_config.json` file for the slow tokenizer when a fast version is available (#15319)
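
    The gist of the fix, as a hedged sketch (not the real transformers source): slow tokenizers resolve a fixed set of companion files at load/save time, and `tokenizer_file` (the fast tokenizer's serialized `tokenizer.json`) no longer belongs in that set, so a slow tokenizer can no longer leak it into its saved `tokenizer_config.json`. The dictionary keys and filenames below mirror the library's conventions but are reproduced from memory, not copied from the source.

    ```python
    # Hedged sketch of the file maps a slow tokenizer resolves.
    # Names are assumptions modeled on the library's conventions.
    VOCAB_FILES_NAMES = {"vocab_file": "vocab.txt"}  # per-model slow vocab files

    ADDITIONAL_FILES_NAMES = {
        "added_tokens_file": "added_tokens.json",
        "special_tokens_map_file": "special_tokens_map.json",
        "tokenizer_config_file": "tokenizer_config.json",
        # "tokenizer_file": "tokenizer.json",  # removed by this fix: the slow
        # tokenizer must not pick up the fast tokenizer's serialized file,
        # otherwise its path ends up recorded in tokenizer_config.json
    }

    # Union of everything a slow tokenizer will try to locate.
    files_to_resolve = {**VOCAB_FILES_NAMES, **ADDITIONAL_FILES_NAMES}
    ```

    With `tokenizer_file` dropped from the additional files, only fast tokenizers (which list it in their own `VOCAB_FILES_NAMES`) resolve and re-save `tokenizer.json`.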
    
    * add new test
    
    * update test
    
    * remove `tokenizer_file` from `additional_files_names` in `tokenization_utils_base.py`
    
    * add `tokenizer_file` for the fast only tokenizer
    
    * change global variables layoutxml
    
    * remove `"tokenizer_file"` from DPR tokenizer's Global variables
    
    * remove `tokenizer_file` from herbert slow tokenizer init
    
    * remove `"tokenizer_file"` from LED tokenizer's Global variables
    
    * remove `tokenizer_file` from mbart slow tokenizer init
    
    * remove `tokenizer_file` from slow tokenizer template
    
    * adapt to versioning
    
    * adapt the `test_tokenizer_mismatch_warning` test
    
    * clean test
    
    * clarify `VOCAB_FILES_NAMES` in tokenization_utils_fast.py
    
    * Revert "remove `tokenizer_file` from mbart slow tokenizer init"
    
    This reverts commit 0dbb723fa9c7599d4640fe30b3647a74eb4a64e1.
    
    * Revert "`"tokenizer_file"` from LED tokenizer's Global variables"
    
    This reverts commit 5a3f879bdd651233f3d74a3d1146c34cde82b0c2.
    
    * Revert "remove `tokenizer_file` from herbert slow tokenizer init"
    
    This reverts commit f5e10007b7b0ec5345e015b9de7ffec72c5407fd.
    
    * Revert "remove `"tokenizer_file"` from DPR tokenizer's Global variables"
    
    This reverts commit da0895330bedfafc81ae3073470a9348c669f032.
    
    * set `tokenizer_file` in super `__init__` of mbart
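
    The last bullet can be sketched as follows; this is a simplified, hypothetical pair of classes (not the real `MBartTokenizer` or `PreTrainedTokenizer`), assuming the base class records its init kwargs so `save_pretrained` can round-trip them:

    ```python
    # Hedged sketch: forwarding tokenizer_file through the base __init__.
    class BaseTokenizer:
        def __init__(self, **kwargs):
            # the base class keeps every init kwarg so it can be
            # written back out when the tokenizer is saved
            self.init_kwargs = dict(kwargs)

    class MBartLikeTokenizer(BaseTokenizer):
        def __init__(self, vocab_file, tokenizer_file=None, **kwargs):
            # passing tokenizer_file up to super() (instead of swallowing it)
            # keeps it in init_kwargs, so versioned loading still sees it
            super().__init__(tokenizer_file=tokenizer_file, **kwargs)
            self.vocab_file = vocab_file

    tok = MBartLikeTokenizer("vocab.txt", tokenizer_file="tokenizer.json")
    ```

    Accepting the argument but forwarding it, rather than removing it from the slow init entirely (the approach reverted above), keeps backward compatibility with checkpoints that already pass `tokenizer_file`.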
test_tokenization_common.py 194 KB