Ref implementation of FP8 (#2438)
Handles all four FP8 dtypes listed here: https://onnx.ai/onnx/technical/float8.html
Follows the saturation/clipping logic from the Cast table there as well: https://onnx.ai/onnx/technical/float8.html#cast
Only fp8e4m3fnuz is added to the MIGraphX IR for now.
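To illustrate the saturating cast the message refers to, here is a minimal sketch for fp8e4m3fnuz only. It is not the code from this commit. The function names `decode_e4m3fnuz` and `encode_e4m3fnuz_saturating` are hypothetical, the encoder uses a brute-force nearest-value search in place of bit manipulation, and mapping infinities to NaN is an assumption based on one reading of the ONNX Cast table. The layout (1 sign bit, 4 exponent bits, 3 mantissa bits, bias 8, `0x80` as the only NaN, maximum finite value 240) follows the ONNX float8 page.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Decode an fp8e4m3fnuz bit pattern to float.
// Layout: 1 sign bit, 4 exponent bits, 3 mantissa bits, exponent bias 8.
// 0x80 is the only NaN; there is no infinity and no negative zero.
inline float decode_e4m3fnuz(std::uint8_t bits)
{
    if(bits == 0x80)
        return std::numeric_limits<float>::quiet_NaN();
    const int sign     = (bits >> 7) & 0x1;
    const int exponent = (bits >> 3) & 0xF;
    const int mantissa = bits & 0x7;
    float magnitude    = 0.0f;
    if(exponent == 0)
        // Subnormal: (mantissa / 8) * 2^(1 - 8) == mantissa * 2^-10
        magnitude = std::ldexp(static_cast<float>(mantissa), -10);
    else
        // Normal: (1 + mantissa / 8) * 2^(exponent - 8)
        magnitude = std::ldexp(1.0f + static_cast<float>(mantissa) / 8.0f, exponent - 8);
    return sign != 0 ? -magnitude : magnitude;
}

// Saturating float -> fp8e4m3fnuz cast (hypothetical helper, brute-force search).
// Finite out-of-range inputs clamp to +/-240; in-range inputs round to the
// nearest representable value, with ties going to the even bit pattern.
inline std::uint8_t encode_e4m3fnuz_saturating(float x)
{
    // Assumption of this sketch: NaN and +/-Inf both map to the NaN pattern.
    if(not std::isfinite(x))
        return 0x80;

    const float max_value = 240.0f; // decode_e4m3fnuz(0x7F)
    if(x > max_value)
        x = max_value;
    if(x < -max_value)
        x = -max_value;

    std::uint8_t best = 0x00; // start from +0
    float best_error  = std::fabs(x);
    for(int code = 1; code < 256; ++code)
    {
        if(code == 0x80)
            continue; // skip the NaN pattern
        const float error = std::fabs(decode_e4m3fnuz(static_cast<std::uint8_t>(code)) - x);
        const bool closer = error < best_error;
        // Tie: prefer the candidate whose lowest mantissa bit is 0.
        const bool even_tie = error == best_error and (code & 1) == 0 and (best & 1) != 0;
        if(closer or even_tie)
        {
            best       = static_cast<std::uint8_t>(code);
            best_error = error;
        }
    }
    return best;
}
```

With this sketch, `encode_e4m3fnuz_saturating(1000.0f)` yields `0x7F` (the pattern for 240) rather than NaN, and `-0.0f` encodes to `0x00` because the format has no negative zero. A production implementation would round by inspecting the float's bits directly instead of scanning all 256 codes.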
Showing
test/fp8e4m3fn.cpp (new file, mode 100644)
test/fp8e4m3fnuz.cpp (new file, mode 100644)
test/fp8e5m2.cpp (new file, mode 100644)
test/fp8e5m2fnuz.cpp (new file, mode 100644)