src/quantize_int8.cpp · b45f72396189fab6e26c65b4669f0ce82194122a · gaoqiong / MIGraphX

qdq for quantization and include subgraph (#891) · b45f7239

Shucai Xiao authored Sep 07, 2021



Add operators, refactor parsers, add rewrite passes, add tests
Add ref implementations
Move broadcasting of scales and zero points to onnx parser
Allow for x and zero_point to have different types in quantizelinear; fix zero_point default type
fp16 and int8 quantization to include subgraph and parameters
fix unit test to use qdq operators for int8 quantization
Co-authored-by: turneram <alturner@amd.com>
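
For readers unfamiliar with the qdq (quantize/dequantize) operators this commit refers to, here is a minimal sketch of the arithmetic, following the ONNX QuantizeLinear/DequantizeLinear definitions for the per-tensor int8 case. The function names `quantize_linear` and `dequantize_linear` are hypothetical helpers chosen for illustration; this is not the MIGraphX implementation, and it assumes a scalar scale and zero point that have already been broadcast.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper (not MIGraphX code): y = saturate(round(x / scale) + zero_point).
std::vector<std::int8_t> quantize_linear(const std::vector<float>& x,
                                         float scale,
                                         std::int8_t zero_point = 0)
{
    std::vector<std::int8_t> y(x.size());
    std::transform(x.begin(), x.end(), y.begin(), [&](float v) {
        // nearbyint rounds half to even under the default rounding mode
        float q = std::nearbyint(v / scale) + static_cast<float>(zero_point);
        // saturate to the int8 range before narrowing
        q = std::min(127.0f, std::max(-128.0f, q));
        return static_cast<std::int8_t>(q);
    });
    return y;
}

// Hypothetical helper (not MIGraphX code): x = (y - zero_point) * scale.
std::vector<float> dequantize_linear(const std::vector<std::int8_t>& y,
                                     float scale,
                                     std::int8_t zero_point = 0)
{
    std::vector<float> x(y.size());
    std::transform(y.begin(), y.end(), x.begin(), [&](std::int8_t q) {
        // widen to int first so the subtraction cannot overflow int8
        return static_cast<float>(static_cast<int>(q) - static_cast<int>(zero_point)) * scale;
    });
    return x;
}
```

Values outside the representable range saturate rather than wrap, so a quantize followed by a dequantize returns the input only for values that are multiples of the scale and lie within range.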


quantize_int8.cpp 2.83 KB
