    [Whisper Tokenizer] Encode timestamps (#26054) · ac957f69
    Sanchit Gandhi authored
    * [Whisper Tokenizer] Fix tests after adding timestamps
    
    * fix s2t tokenizer tests
    
    * fix vocab test
    
    * backwards comp
    
    * fix tests
    
    * comment
    
    * style
    
    * fix last test
    
    * fix fast
    
    * make faster
    
    * move logic to decode
    
    * remove skip test
    
    * fix decode with offsets
    
    * fix special tokens
    
    * empty commit to re-trigger ci
    
    * use lru cache
test_tokenization_whisper.py 21.8 KB