Offloaded KV Cache (#31325)
* Initial implementation of OffloadedCache
* Enable usage via cache_implementation
* Address feedback, add tests, remove legacy methods
* Remove flash-attn, discover synchronization bugs, fix bugs
* Prevent usage in CPU only mode
* Add a section about offloaded KV cache to the docs
* Fix typos in docs
* Clarifications and better explanation of streams
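For readers unfamiliar with the idea, here is a minimal toy sketch of layer-wise KV cache offloading. It is not the `OffloadedCache` code from this commit: the class `ToyOffloadedCache`, its methods, and the string location tags are all hypothetical, and there are no real device transfers or streams. It assumes a scheme in which only the layer currently being computed and the next layer (the prefetch target) stay resident on the accelerator, while all other layers live in CPU memory.

```python
class ToyOffloadedCache:
    """Hypothetical toy model: at most two layers are 'on device' at once."""

    def __init__(self, num_layers):
        self.num_layers = num_layers
        # One list of (key, value) entries per layer.
        self.store = [[] for _ in range(num_layers)]
        # Location tag per layer; a real cache would move tensors instead.
        self.location = ["cpu"] * num_layers

    def update(self, key, value, layer_idx):
        # Keep the current layer and the next one (prefetch target) resident.
        # The index wraps around so layer 0 is ready for the next decoding step.
        next_idx = (layer_idx + 1) % self.num_layers
        resident = {layer_idx, next_idx}
        for i in range(self.num_layers):
            self.location[i] = "device" if i in resident else "cpu"
        # Append the new key/value entry to the current layer's cache.
        self.store[layer_idx].append((key, value))
        return self.store[layer_idx]

    def on_device(self):
        # Indices of layers currently tagged as resident on the device.
        return [i for i, loc in enumerate(self.location) if loc == "device"]


# Simulate three decoding steps over a four-layer model.
cache = ToyOffloadedCache(num_layers=4)
for step in range(3):
    for layer in range(4):
        cache.update(f"k{step}", f"v{step}", layer)
        # Device residency never exceeds two layers.
        assert len(cache.on_device()) <= 2
```

In a real implementation the prefetch and eviction would be asynchronous copies that have to be synchronized with the compute that reads them, which is presumably where the synchronization bugs and the stream clarifications listed above come in.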