Add llmcompressor fp8 kv-cache quant (per-tensor and per-attn_head) (#30141)
Signed-off-by:Eldar Kurtic <8884008+eldarkurtic@users.noreply.github.com> Signed-off-by:
eldarkurtic <8884008+eldarkurtic@users.noreply.github.com>
Showing
Please register or sign in to comment