Fix the fp8 gemm for large tensors on MI300. (#1011)
* Fix the fp8 conversion
* Try clipping value before conversion
* Fix return
* Simplify with a const
* Reduce the GEMM input tensor values to limit round-off error
* Replace if-else with lambda
* Fix syntax
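
The clipping step listed above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this change: `kFp8Max` and `clip_for_fp8` are hypothetical names, and the value 240 is an assumption (the largest finite magnitude of the E4M3 FNUZ encoding); the commit does not state which constant it uses. The actual FP8 conversion is left out.

```cpp
#include <algorithm>

// Hypothetical constant: assumed largest finite magnitude of the target
// FP8 format. 240 matches E4M3 FNUZ; the commit does not give the value.
constexpr float kFp8Max = 240.0f;

// Clip a float into the assumed FP8 range before it is converted, so that
// out-of-range inputs saturate instead of overflowing in the conversion.
inline float clip_for_fp8(float x)
{
    // A lambda in place of an if-else chain, mirroring the refactor
    // mentioned in the list above (requires C++17 for std::clamp).
    auto clip = [](float v) { return std::clamp(v, -kFp8Max, kFp8Max); };
    return clip(x);
}
```

Under these assumptions, `clip_for_fp8(1000.0f)` saturates to `kFp8Max`, while values already inside the range pass through unchanged.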
---------
Co-authored-by: Rostyslav Geyyer <rosty.geyyer@amd.com>