1. 07 Dec, 2023 1 commit
  2. 15 Nov, 2023 1 commit
    • Support per-axis quantization (#2390) · 0039b11a
      shivadbhavsar authored
      Reworked the simplify_qdq pass to support:
      
      * Per-axis quantization (i.e., 1D scales and zero points)
      * Broadcast and transpose ops between dq and quant_op
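
As a rough illustration of what per-axis quantization means, the sketch below quantizes a 2-D input with one scale and one zero point per index along a chosen axis, instead of a single scalar pair for the whole tensor. It is plain Python, not MIGraphX code; the function names `quantize_per_axis` and `dequantize_per_axis`, the int8 range, and the sample values are all assumptions made for this example.

```python
def quantize_per_axis(x, scales, zero_points, axis, qmin=-128, qmax=127):
    """Quantize a 2-D list of floats using 1D scales/zero points along `axis`."""
    out = []
    for i, row in enumerate(x):
        q_row = []
        for j, value in enumerate(row):
            k = i if axis == 0 else j  # pick the parameter for this slice
            q = round(value / scales[k]) + zero_points[k]
            q_row.append(max(qmin, min(qmax, q)))  # saturate to the int8 range
        out.append(q_row)
    return out


def dequantize_per_axis(q, scales, zero_points, axis):
    """Inverse mapping: (q - zero_point) * scale, per slice along `axis`."""
    out = []
    for i, row in enumerate(q):
        out.append([
            (value - zero_points[i if axis == 0 else j]) * scales[i if axis == 0 else j]
            for j, value in enumerate(row)
        ])
    return out


# Hypothetical data: the two rows have very different ranges, so each row
# gets its own scale and zero point (axis=0).
x = [[0.5, -1.0, 2.0], [10.0, 20.0, -30.0]]
scales = [0.5, 10.0]
zero_points = [0, 5]

q = quantize_per_axis(x, scales, zero_points, axis=0)
restored = dequantize_per_axis(q, scales, zero_points, axis=0)
```

With a single per-tensor scale, the small values in the first row would share the coarse step size needed by the second row; per-axis parameters avoid that.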
  3. 22 Jun, 2022 1 commit
  4. 11 May, 2022 1 commit
  5. 08 Oct, 2021 1 commit
    • Remove alpha and beta from `dot` and `quant_dot` (#961) · 21193e87
      Umang Yadav authored
      Previously, the dot operator was defined as C = alpha * A . B + beta * C, where * is scalar multiplication and . is the dot product or matrix multiplication, depending on the dimensions of the inputs.
      
      The aim is to define the dot operator as C = A . B, without alpha or beta.
      
      To achieve the same effect as alpha and beta: (1) multiply one of the inputs to the dot operator by alpha; (2) if beta is present, multiply C by beta and add the result to the output of step 1.
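
The two steps above can be sketched in plain Python to check that the decomposition gives the same result as the old definition. This is not MIGraphX code: the helpers `matmul`, `scale`, `add`, `dot_with_alpha_beta`, and `dot_decomposed` are hypothetical names for this example, and the matrices are arbitrary small integers so that the comparison is exact.

```python
def matmul(a, b):
    """Plain matrix multiplication, i.e. the new C = A . B definition."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def scale(m, s):
    """Multiply every element of m by the scalar s."""
    return [[s * v for v in row] for row in m]


def add(m, n):
    """Element-wise sum of two matrices of the same shape."""
    return [[u + v for u, v in zip(r1, r2)] for r1, r2 in zip(m, n)]


def dot_with_alpha_beta(a, b, c, alpha, beta):
    """Old definition: C = alpha * A . B + beta * C."""
    return add(scale(matmul(a, b), alpha), scale(c, beta))


def dot_decomposed(a, b, alpha, c=None, beta=None):
    """New form: scale one input by alpha, use the plain dot, then add beta * C."""
    out = matmul(scale(a, alpha), b)        # step (1)
    if c is not None and beta is not None:  # step (2), only if beta is present
        out = add(out, scale(c, beta))
    return out


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]

old = dot_with_alpha_beta(a, b, c, alpha=2, beta=3)
new = dot_decomposed(a, b, alpha=2, c=c, beta=3)
```

Both paths agree because scalar multiplication commutes with the matrix product: (alpha * A) . B = alpha * (A . B).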
  6. 17 Sep, 2021 2 commits
    • Paul Fultz II's avatar
      985f58b0
    • Remove alpha and beta attributes from dot operator (#945) · 9e43cb8b
      Umang Yadav authored
      This PR aims to remove the alpha and beta attributes from the dot operator completely.
      
      Previously, the dot operator was defined as C = alpha * A . B + beta * C, where * is scalar multiplication and . is the dot product or matrix multiplication, depending on the dimensions of the inputs.
      
      The aim is to define the dot operator as C = A . B, without alpha or beta.
      
      To achieve the same effect as alpha and beta: (1) multiply one of the inputs to the dot operator by alpha; (2) if beta is present, multiply C by beta and add the result to the output of step 1.
  7. 07 Sep, 2021 1 commit
    • qdq for quantization and include subgraph (#891) · b45f7239
      Shucai Xiao authored
      
      
      Add operators, refactor parsers, add rewrite passes, add tests
      Add ref implementations
      Move broadcasting of scales and zero points to onnx parser
      Allow for x and zero_point to have different types in quantizelinear; fix zero_point default type
      fp16 and fp8 quantization to include subgraph and parameters
      fix unit test to use qdq operators for int8 quantization
      Co-authored-by: turneram <alturner@amd.com>
  8. 24 Aug, 2021 1 commit
    • Change attributes names to be more consistent and reflect better meaning (#916) · 0d2606bb
      Umang Yadav authored
      * rename broadcast and multibroadcast output_lens attribute to out_lens attribute, and change tests and source code to reflect the same
      
      * change the reshape attribute from dims to out_lens
      
      * change transpose attribute's name from dims to perm to reflect better meaning
      
      * use permutation instead of perm for transpose
      
      clang formatting
      
      * use dims instead of out_lens for reshape
      
      clang formatting
  9. 18 Aug, 2021 1 commit
    • Optimize Q/DQ Format Pass (#889) · 0b5f33b6
      turneram authored
      * Add operators, refactor parsers, add rewrite passes, add tests
      
      * Add ref implementations
      
      * Move broadcasting of scales and zero points to onnx parser
      
      * Allow for x and zero_point to have different types in quantizelinear; fix zero_point default type
      
      * Switch certain variables to int64_t
      
      * Fix overflow in implicit constant conversion
      
      * Remove operators.hpp from includes in tf_test.cpp
      
      * Add conversion for int32 input to quantizelinear and add test case; remove operators.hpp from onnx_test.cpp includes
      
      * Switch dequantizelinear math from int32 to float
      
      * Remove changes to operators.hpp
      
      * Simplify apply_quantizelinear
      
      * Add verify test for int32 data
      
      * Add rewrite_quantization back to CMakeLists
      
      * Add passes to insert qdq after add_bias is applied, replace quant_ops, and remove remaining qdq pairs
      
      * Renaming, refactoring, cleaning up code, adding formal test, and adding passes to targets
      
      * Renaming, review comments, begin adding more specific tests
      
      * Add more specific unit tests
      
      * Fix failing test on CI
      
      * Correct matcher and update qop rewriting, update tests and add more tests
      
      * Update matcher, clean up simplify_qdq, tweak tests
      
      * Add tests, remove pass from CPU target, update dot parameters, clean up simplify_qdq
      
      * Fix correctness bug in ref q/dq implementations; edit gemm parser to make beta always 0.0
      
      * Remove unused variables in onnx gemm tests
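
One way to see why leftover q/dq pairs can be removed: when a dequantize step feeds a quantize step that uses the same scale and zero point, the pair maps every quantized value back to itself. The sketch below checks this over the full int8 range. It is plain Python, not the pass itself; the function names, the int8 range, and the parameter values are assumptions for this example, and the exact pattern matched by the pass is not stated in the commit message.

```python
def dequantize(q, scale, zero_point):
    """Map a quantized integer back to a float: (q - zero_point) * scale."""
    return (q - zero_point) * scale


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Round x / scale, shift by the zero point, and saturate to [qmin, qmax]."""
    return max(qmin, min(qmax, round(x / scale) + zero_point))


SCALE, ZERO_POINT = 0.25, 3  # hypothetical parameters shared by both ops

# dequantize followed by quantize with identical parameters is the identity
# on int8 values, so such a pair contributes nothing to the graph.
pair_is_identity = all(
    quantize(dequantize(q, SCALE, ZERO_POINT), SCALE, ZERO_POINT) == q
    for q in range(-128, 128)
)

# With mismatched scales the pair is not an identity and must be kept.
mismatched = quantize(dequantize(10, SCALE, ZERO_POINT), 0.5, ZERO_POINT)
```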