[Quantization][FP8] Add support for FP8 models with input_scale for output...
[Quantization][FP8] Add support for FP8 models with input_scale for output projection and QK quantization (#15734) Signed-off-by:Randall Smith <Randall.Smith@amd.com> Signed-off-by:
Luka Govedič <lgovedic@redhat.com> Co-authored-by:
Luka Govedič <lgovedic@redhat.com>
Showing
Please register or sign in to comment