    [`split_special_tokens`] Add support for `split_special_tokens` argument to encode (#25081) · 30b3c46f
    Arthur authored
    * draft changes
    
    * update and add tests
    
    * styling for now
    
    * move test
    
    * path to usable model
    
    * update test
    
    * small update
    
    * update bert-based tokenizers
    
    * don't use kwargs for _tokenize
    
    * don't use kwargs for _tokenize
    
    * fix copies
    
    * update
    
    * update test for special tokenizers
    
    * fixup
    
    * skip two tests
    
    * remove pdb breakpoint()
    
    * wowo
    
    * rewrite custom tests
    
    * nits
    
    * revert change in target keys
    
    * fix MarkupLM
    
    * update documentation of the argument
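
For context, the `split_special_tokens` argument added here controls whether added special tokens are protected from being split during tokenization. A minimal usage sketch, assuming `split_special_tokens` is accepted when loading the tokenizer (the exact sub-tokens shown in the comments are illustrative, not guaranteed):

```python
from transformers import AutoTokenizer

# Default behaviour: added special tokens are kept intact during tokenization.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
print(tokenizer.tokenize("[CLS] hello world [SEP]"))
# -> ['[CLS]', 'hello', 'world', '[SEP]']

# With split_special_tokens=True, the special-token strings are tokenized
# like ordinary text instead of being treated as single, unsplittable tokens.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", split_special_tokens=True)
print(tokenizer.tokenize("[CLS] hello world [SEP]"))
# -> something like ['[', 'cl', '##s', ']', 'hello', 'world', '[', 'sep', ']']
```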
test_tokenization_layoutlmv3.py