Fix incorrect cache allocation with multi-query (#2203)
Previously, no cache memory was allocated for multi-query models (1 KV head). Fixes Starcoder et al.
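The commit message does not state the root cause. One way a zero-sized cache can arise, offered here only as an assumption, is floor-dividing the KV head count across tensor-parallel shards: with 1 KV head and 2 or more shards, the result is 0. The sketch below uses hypothetical helper names (`kv_heads_per_shard`, `kv_cache_elements`) that are not taken from the codebase; it only illustrates clamping the per-shard head count to at least 1.

```python
def kv_heads_per_shard(num_kv_heads: int, world_size: int) -> int:
    """Hypothetical helper: KV heads held by each tensor-parallel shard."""
    # Plain floor division yields 0 when num_kv_heads < world_size
    # (e.g. multi-query attention with 1 KV head on 2 shards).
    # Clamping to 1 assumes the single KV head is replicated on every shard.
    return max(1, num_kv_heads // world_size)


def kv_cache_elements(
    num_blocks: int,
    block_size: int,
    num_kv_heads: int,
    head_size: int,
    world_size: int,
) -> int:
    """Hypothetical helper: elements in one layer's KV cache on one shard."""
    heads = kv_heads_per_shard(num_kv_heads, world_size)
    # Factor of 2 covers the key tensor and the value tensor.
    return 2 * num_blocks * block_size * heads * head_size
```

Without the clamp, the head count for a multi-query model on two shards would be `1 // 2 == 0`, and every cache tensor sized from it would be empty.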