src/targets/gpu/target.cpp · 0b5f33b629ff2435fdfb53279791fe9a92ecdb50 · gaoqiong / MIGraphX

Optimize Q/DQ Format Pass (#889) · 0b5f33b6

turneram authored Aug 18, 2021

* Add operators, refactor parsers, add rewrite passes, add tests

* Add ref implementations

* Move broadcasting of scales and zero points to onnx parser

* Allow for x and zero_point to have different types in quantizelinear; fix zero_point default type

* Switch certain variables to int64_t

* Fix overflow in implicit constant conversion

* Remove operators.hpp from includes in tf_test.cpp

* Add conversion for int32 input to quantizelinear and add test case; remove operators.hpp from onnx_test.cpp includes

* Switch dequantizelinear math from int32 to float

* Remove changes to operators.hpp

* Simplify apply_quantizelinear

* Add verify test for int32 data

* Add rewrite_quantization back to CMakeLists

* Add passes to insert qdq after add_bias is applied, replace quant_ops, and remove remaining qdq pairs

* Renaming, refactoring, cleaning up code, adding formal test, and adding passes to targets

* Renaming, review comments, begin adding more specific tests

* Add more specific unit tests

* Fix failing test on CI

* Correct matcher and update qop rewriting, update tests and add more tests

* Update matcher, clean up simplify_qdq, tweak tests

* Add tests, remove pass from CPU target, update dot parameters, clean up simplify_qdq

* Fix correctness bug in ref q/dq implementations; edit gemm parser to make beta always 0.0

* Remove unused variables in onnx gemm tests
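
For orientation, the quantizelinear / dequantizelinear arithmetic that several of the items above touch can be sketched as below. This is a minimal sketch that assumes the standard ONNX definitions (quantize: round(x / scale) + zero_point, saturated to the output type; dequantize: (q - zero_point) * scale, computed in float). The function names and the int8-only signatures are hypothetical and are not taken from the MIGraphX sources.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not MIGraphX code): quantize one float to int8
// using the ONNX QuantizeLinear formula.
inline std::int8_t quantize_linear(float x, float scale, std::int8_t zero_point)
{
    // std::nearbyint rounds half to even under the default rounding mode.
    const float shifted = std::nearbyint(x / scale) + static_cast<float>(zero_point);
    const float lo      = static_cast<float>(std::numeric_limits<std::int8_t>::min());
    const float hi      = static_cast<float>(std::numeric_limits<std::int8_t>::max());
    // Saturate to the int8 range before narrowing.
    return static_cast<std::int8_t>(std::max(lo, std::min(hi, shifted)));
}

// Hypothetical helper (not MIGraphX code): dequantize one int8 value.
// The subtraction is done in float rather than in an integer type.
inline float dequantize_linear(std::int8_t q, float scale, std::int8_t zero_point)
{
    return (static_cast<float>(q) - static_cast<float>(zero_point)) * scale;
}
```

With scale 0.5 and zero_point 0, an input of 1.0 quantizes to 2 and dequantizes back to 1.0; inputs far outside the representable range saturate at -128 or 127.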


target.cpp 4.64 KB
