add warning when FP8 KV cache misses prefill query quantization (#39752)
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Albert Cheng (Engrg-Hardware 1) <albecheng@login-lyris02.lyris.clusters.nvidia.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>