[Attention Backend] TurboQuant: 2-bit KV cache compression with 4x capacity (#38479)
Signed-off-by: vibhavagarwal5 <vibhavagarwal5@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Xinyu Chen <xinyu1.chen@intel.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>