"...git@developer.sourcefind.cn:chenpangpang/transformers.git" did not exist on "74a3cebfa51b539bfcfa79b33686cc090b7074e8"
Offloaded KV Cache (#31325)
* Initial implementation of OffloadedCache
* Enable usage via cache_implementation
* Address feedback, add tests, remove legacy methods
* Remove flash-attn, discover synchronization bugs, fix bugs
* Prevent usage in CPU only mode
* Add a section about offloaded KV cache to the docs
* Fix typos in docs
* Clarifications and better explanation of streams
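
The idea behind an offloaded KV cache can be sketched without any GPU. The toy below is an illustration only, not the code from this commit. `ToyOffloadedCache` and its methods are hypothetical names, and plain strings stand in for tensors. It assumes the scheme works roughly as follows: only the layer being computed and the next layer are resident on the accelerator, and the previous layer is sent back to CPU memory. The real implementation overlaps these transfers with compute using CUDA streams, which this sketch replaces with simple location labels.

```python
from typing import List, Tuple


class ToyOffloadedCache:
    """Toy model of layer-wise KV offloading (hypothetical, not the transformers class)."""

    def __init__(self, num_layers: int) -> None:
        self.num_layers = num_layers
        # One list of (key, value) entries per layer; strings stand in for tensors.
        self.entries: List[List[Tuple[str, str]]] = [[] for _ in range(num_layers)]
        # Where each layer's cache currently lives: "cpu" or "gpu".
        self.location: List[str] = ["cpu"] * num_layers

    def resident_layers(self) -> List[int]:
        """Return the indices of layers whose cache is currently on the 'gpu'."""
        return [i for i, loc in enumerate(self.location) if loc == "gpu"]

    def update(self, layer: int, key: str, value: str) -> List[Tuple[str, str]]:
        """Append a new (key, value) pair for `layer` and return that layer's cache."""
        # The current layer must be resident before it is used.
        self.location[layer] = "gpu"
        # Send the previous layer back to the CPU (skipped for a single-layer model).
        prev_layer = (layer - 1) % self.num_layers
        if prev_layer != layer:
            self.location[prev_layer] = "cpu"
        # Prefetch the next layer; this wraps to layer 0 for the next token.
        next_layer = (layer + 1) % self.num_layers
        self.location[next_layer] = "gpu"
        self.entries[layer].append((key, value))
        return self.entries[layer]


cache = ToyOffloadedCache(num_layers=4)
for layer_idx in range(4):
    cache.update(layer_idx, f"k{layer_idx}", f"v{layer_idx}")
    # At most two layers are resident at any point in the forward pass.
    assert len(cache.resident_layers()) <= 2
```

In the library itself, the second bullet suggests the cache is selected through the `cache_implementation` option rather than constructed by hand. The exact value to pass is not stated in this commit message, so check the documentation section added by this change.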