• [whisper] static kv cache (#31166) · a9701953
    Sanchit Gandhi authored
    
    
    * make work with cache abstraction
    
    * correct for static cache
    
    * hacks for compile
    
    * make fast
    
    * fix
    
    * fix pos ids
    
    * generate
    
    * fix sdpa
    
    * fix sdpa cache pos
    
    * fix fa2
    
    * clean fa2
    
    * integrate cache into generate
    
    * make style
    
    * copies
    
    * more copies
    
    * update eager
    
    * update sdpa
    
    * update fa2
    
    * simplify
    
    * use cache pos
    
    * always compute cross-cache for debug
    
    * avoid recompiles
    Co-authored-by: Arthur Zucker <arthur@huggingface.co>
    
    * fix fix
    
    * fix fix fix
    
    * more fix
    
    * try encoder-decoder cache (too messy)
    
    * revert encoder-decoder cache
    
    * check cross-attn cache
    
    * use enc-dec dataclass
    
    * use richer enc-dec dataclass
    
    * clean-up
    
    * revert static cache changes
    
    * small fixes
    
    * revert to cpu flag
    
    * fix copies
    
    * add static slow test
    
    * past k/v docstring
    
    * more docstrings
    
    * cache_position docstrings
    
    * add to docs
    
    * add enc-dec cache to docs
    
    * make style
    
    * fix after rebase
    
    * fix beam
    
    * style
    
    * fix generation strategies
    
    * fix most decoder-only tests
    
    * style
    
    * skip test
    
    * more clean up
    
    * small docstrings
    
    * Apply suggestions from code review
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * add todo
    
    * only crop self-attn
    
    * check cache in mixin
    
    * style
    
    * fix re-compile after rebase
    
    * move `is_updated` logic to enc-dec wrapper
    
    * revert back
    
    * revert cache back
    
    * finalise design
    
    * fix
    
    * fix fix
    
    * style
    
    * Update src/transformers/cache_utils.py
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * deprecate
    
    * updates
    
    * final updates
    
    * style
    
    * style
    
    ---------
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
generation_utils.md 9.19 KB