[Feature] KV cache per-token-head INT8/FP8 quantization (#38378)
Signed-off-by: JartX <sagformas@epdcenter.es>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: yangyang4991 <yangyang4991@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>