1. 02 Oct, 2023 1 commit
  2. 21 Sep, 2023 1 commit
  3. 07 Sep, 2023 1 commit
  4. 19 Jul, 2023 1 commit
    • Simplify nested reshapes (#1932) · abb3efc1
      Paul Fultz II authored
      find_reshaper is supposed to do this, but it doesn't work and there were no tests. So I updated it to work and added unit tests for it.
      abb3efc1
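
As a rough illustration of why nested reshapes can be simplified: a reshape of packed data only changes the dimensions, so a chain of reshapes is equivalent to the last one alone. The sketch below assumes packed row-major data; `collapse_reshapes` is a hypothetical helper for illustration, not the `find_reshaper` matcher.

```python
from math import prod


def collapse_reshapes(input_lens, reshape_chain):
    """Return the dims of the single reshape equivalent to the whole chain.

    input_lens: dimensions of the tensor entering the chain.
    reshape_chain: list of target dims, applied in order.
    """
    elements = prod(input_lens)
    for dims in reshape_chain:
        # Every reshape in the chain must preserve the element count.
        if prod(dims) != elements:
            raise ValueError(f"reshape to {dims} changes the element count")
    # Intermediate dims do not affect the result; only the last reshape matters.
    return list(reshape_chain[-1]) if reshape_chain else list(input_lens)


collapsed = collapse_reshapes([2, 3, 4], [[6, 4], [24], [4, 6]])
```

Here `collapsed` is `[4, 6]`, the same dims a single reshape from `[2, 3, 4]` would use.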
  5. 04 Apr, 2023 1 commit
    • fix bug in transpose_slice simplification (#1660) · 30af1697
      shivadbhavsar authored
      Bug found due to failing torch benchmark. Added test case to reproduce issue causing the model to error out on compile.
      Original logic results in the following error:
      AMDMIGraphX/src/include/migraphx/op/unsqueeze.hpp:128: normalize_compute_shape: UNSQUEEZE: Axis dimenstion is not divisible by step
      30af1697
  6. 13 Jan, 2023 1 commit
  7. 13 Sep, 2022 1 commit
    • Use rocblas_gemm_ex for batched gemms with broadcasted B (#1354) · a10a8ef1
      turneram authored
      Improves performance for 4 of the 6 GEMMs used by huggingface BERT models with batch_size>1 by using a non-batched rocBLAS call for GEMMs where the B input has a broadcasted batch dimension.
      The four verify tests added reflect the actual configurations used by bert-base-cased, with varied batch sizes.
      
      Also adds a matcher to simplify_reshapes to move multibroadcasts after concats.
      a10a8ef1
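
The commit message does not say how the non-batched call is formed. One common approach, assumed here, is to fold the batch dimension of A into its row dimension, since every batch entry multiplies the same B. The sketch below checks that equivalence in plain Python; all function names are hypothetical and no rocBLAS call is involved.

```python
def matmul(a, b):
    """Matrix product of a [M, K] and b [K, N], both nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def batched_gemm_broadcast_b(a_batch, b):
    """Reference: one GEMM per batch entry, each using the same B."""
    return [matmul(a, b) for a in a_batch]


def folded_gemm(a_batch, b):
    """Fold the batch into the rows of A, run one GEMM, then unfold."""
    m = len(a_batch[0])
    stacked = [row for a in a_batch for row in a]  # shape [batch * M, K]
    out = matmul(stacked, b)                       # shape [batch * M, N]
    return [out[i:i + m] for i in range(0, len(out), m)]


a_batch = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]  # shape [2, 2, 2]
b = [[1, 0, 2], [0, 1, 3]]                      # shape [2, 3], shared by the batch

reference = batched_gemm_broadcast_b(a_batch, b)
folded = folded_gemm(a_batch, b)
```

Both paths produce the same `[2, 2, 3]` result, which is what would let a single larger GEMM replace the batched one under this assumption.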
  8. 06 Sep, 2022 1 commit
  9. 17 Aug, 2022 1 commit
  10. 07 Jul, 2022 1 commit
    • Add a step to unsqueeze axis (#1242) · bd503d89
      Paul Fultz II authored
      Instead of just unsqueezing to an axis of 1, a step can be set to use instead. So instead of unsqueezing {3, 12} to {3, 1, 12}, a step of 2 will unsqueeze to {3, 2, 6}.
      bd503d89
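
The shape arithmetic described above can be sketched as follows. This is a minimal illustration for a single axis, not the MIGraphX implementation; `unsqueeze_shape` is a hypothetical helper and the error text is only modelled on the divisibility requirement.

```python
def unsqueeze_shape(lens, axis, step=1):
    """Output dims of unsqueezing `lens` at `axis` with an optional step.

    With step 1 a new axis of size 1 is inserted. With a larger step the
    new axis has size `step` and the dimension that follows it is divided
    by `step`.
    """
    lens = list(lens)
    if not 0 <= axis <= len(lens):
        raise ValueError("axis out of range")
    if step == 1:
        return lens[:axis] + [1] + lens[axis:]
    if axis == len(lens):
        raise ValueError("no dimension to split at this axis")
    if lens[axis] % step != 0:
        raise ValueError("axis dimension is not divisible by step")
    return lens[:axis] + [step, lens[axis] // step] + lens[axis + 1:]


plain = unsqueeze_shape([3, 12], axis=1)            # [3, 1, 12]
stepped = unsqueeze_shape([3, 12], axis=1, step=2)  # [3, 2, 6]
```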
  11. 05 Jul, 2022 1 commit
  12. 22 Jun, 2022 1 commit
  13. 17 May, 2022 1 commit
  14. 11 May, 2022 1 commit
  15. 02 Mar, 2022 1 commit
  16. 03 Nov, 2021 1 commit
    • Add tests for the DepthToSpace+Binary pointwise operations fusion (#987) · eb6abd27
      Umang Yadav authored
      In migraphx, DepthToSpace (d2s) is implemented as reshape --> transpose --> contiguous --> reshape.
      
      If a binary pointwise operator follows DepthToSpace, migraphx can move the binary operator before the contiguous and reshape of the DepthToSpace.

      So it becomes reshape --> transpose --> binary_op --> contiguous --> reshape.

      An explicit contiguous is then not required, since binary_op outputs a standard shape. So it becomes reshape --> transpose --> binary_op --> reshape.

      simplify_reshapes already has a matcher that can do this transformation. This PR adds tests for cases like DepthToSpace + binary op.

      Solves #905
      eb6abd27
  17. 28 Oct, 2021 1 commit
    • DepthToSpace and pointwise unary operations fusion (#986) · cf0b6d6d
      Umang Yadav authored
      In migraphx, DepthToSpace (d2s) is implemented as reshape --> transpose --> contiguous --> reshape.
      
      This PR adds a matcher to find d2s + unary pointwise ops.

      Applying the matcher moves the pointwise unary operation before the contiguous and reshape of the d2s.
      So it becomes
      reshape --> transpose --> unary --> contiguous --> reshape.

      The motivation is that a pointwise module would later be created out of unary --> contiguous --> reshape. Codegen for this pointwise module can write out the buffer such that an explicit contiguous and reshape wouldn't be required.

      This transformation is not guaranteed to improve performance, since the unary op will operate on a non-standard shape. So we would need some tuning mechanism to make the decision.
      
      #905 pending PR for binary operations.
      cf0b6d6d
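
To make the reshape --> transpose --> contiguous --> reshape decomposition concrete, here is a small pure-Python sketch on a flat row-major buffer. The intermediate dims and the permutation `[0, 3, 4, 1, 5, 2]` follow the ONNX DepthToSpace DCR definition, which is an assumption here since the commit message does not state them. The squaring op is a hypothetical stand-in for any pointwise unary op. The check at the end shows the property the matcher relies on: a pointwise op gives the same result whether it runs before or after the data movement.

```python
from itertools import product


def row_major_strides(lens):
    """Element strides of a packed row-major tensor with dimensions lens."""
    strides = [1] * len(lens)
    for i in range(len(lens) - 2, -1, -1):
        strides[i] = strides[i + 1] * lens[i + 1]
    return strides


def transpose_contiguous(data, lens, perm):
    """Transpose flat row-major data by perm and repack it (transpose + contiguous)."""
    in_strides = row_major_strides(lens)
    out_lens = [lens[p] for p in perm]
    out = []
    for idx in product(*(range(n) for n in out_lens)):
        # Output axis k walks input axis perm[k].
        out.append(data[sum(i * in_strides[p] for i, p in zip(idx, perm))])
    return out, out_lens


def depth_to_space(data, lens, bs):
    """DepthToSpace as reshape -> transpose -> contiguous -> reshape (DCR assumed)."""
    b, c, h, w = lens
    # First reshape: packed data is unchanged, only the dims differ.
    split_lens = [b, bs, bs, c // (bs * bs), h, w]
    moved, _ = transpose_contiguous(data, split_lens, [0, 3, 4, 1, 5, 2])
    # Final reshape: again only the dims change.
    return moved, [b, c // (bs * bs), h * bs, w * bs]


def unary(value):
    """Stand-in pointwise unary op (hypothetical choice: squaring)."""
    return value * value


x = list(range(8))   # tensor of shape [1, 4, 1, 2]
lens = [1, 4, 1, 2]

d2s, out_lens = depth_to_space(x, lens, 2)
op_last = [unary(v) for v in d2s]
op_first, _ = depth_to_space([unary(v) for v in x], lens, 2)
```

`op_first` and `op_last` hold the same values, so moving the unary op ahead of the contiguous and reshape does not change the result. Whether it is faster is a separate question, as the commit message notes.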
  18. 24 Aug, 2021 1 commit
    • Change attributes names to be more consistent and reflect better meaning (#916) · 0d2606bb
      Umang Yadav authored
      * rename broadcast and multibroadcast output_lens attribute to out_lens attribute, and change tests and source code to reflect the same
      
      * change the reshape attribute from dims to out_lens
      
      * change transpose attribute's name from dims to perm to reflect better meaning
      
      * use permutation instead of perm for transpose
      
      clang formatting
      
      * use dims instead of out_lens for reshape
      
      clang formatting
      0d2606bb
  19. 23 May, 2021 1 commit
  20. 23 Apr, 2021 1 commit
    • Optimize resize and where operators (#784) · 17485202
      Shucai Xiao authored
      
      
      * code backup
      
      * clang format
      
      * add a matcher related to the special resize case for optimization
      
      * clang format
      
      * code backup
      
      * clang format
      
      * code backup
      
      * remove unnecessary code
      
      * add optimization for the where op
      
      * clang format
      
      * fix cppcheck error
      
      * add a unit test for optimize resize
      
      * clang format
      
      * remove unnecessary header include
      
      * code backup
      
      * clang format
      
      * add unit tests for optimizing resize
      
      * clang format
      
      * add more unit tests for optimizing where op
      
      * clang format
      
      * remove unnecessary code
      
      * add one more optimization to remove contiguous
      
      * clang format
      
      * add a pointwise requirement
      
      * clang format
      
      * fix cppcheck error
      
      * add one more unit test
      
      * fixed a bug
      
      * clang format
      
      * remove unnecessary code
      
      * clang format
      
      * fix a build error
      
      * fix review comments
      
      * clang format
      
      * fix a review comment
      
      * clang format
      
      * code refinement
      
      * clang format
      
      * refine more code
      
      * refine more code
      
      * fix a bug related to reshape_cont optimization
      
      * clang format
      
      * fix a review comment
      
      * removed an unnecessary comment
      
      * refine code according to comments
      
      * clang format
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
      17485202
  21. 08 Feb, 2021 1 commit
    • Add a pass to remove unsupported data types (#738) · 3d24a21c
      Paul Fultz II authored
      
      
      * Add eliminate_data_type pass
      
      * Formatting
      
      * Auto convert quant ops
      
      * Formatting
      
      * Flip the order of decompose
      
      * Compute max size differently
      
      * Formatting
      
      * Clamp values in convert
      
      * Formatting
      
      * Fix loss of precision in reduce
      
      * Formatting
      
      * Fix bugs in reduction
      
      * Fix accumulator type in reference softmax implementation
      
      * Formatting
      
      * Update convert test
      
      * Remove unused variables
      
      * Remove unnecessary quant_dot check
      
      * Formatting
      
      * Add tests
      
      * Formatting
      
      * Remove unused code
      
      * Remove duplicate ops
      
      * Remove blaze dependency
      
      * Use set since shape::type_t is not hashable on gcc 5
      
      * Formatting
      Co-authored-by: Shucai Xiao <shucai@gmail.com>
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
      3d24a21c
  22. 18 Jan, 2021 1 commit
    • Refactor to use tune_axis function (#713) · 651ea160
      kahmed10 authored
      * initial testing
      
      * initial testing
      
      * add dequantize
      
      * formatting
      
      * add tests
      
      * formatting
      
      * revert file
      
      * add parse files
      
      * formatting
      
      * add axis tuning and fix tests
      
      * formatting
      
      * add tests and fix int8
      
      * formatting
      
      * fix tidy
      
      * test with int32
      
      * add default name and change string to upper
      
      * formatting
      
      * remove boost call
      
      * refactor to use tune_axis
      
      * formatting
      651ea160
  23. 08 Dec, 2020 1 commit
    • Refactor to use make_op almost everywhere (#696) · 8d21fdc9
      Paul Fultz II authored
      * Load op when serializing
      
      * Formatting
      
      * Add missing clip field
      
      * Use make_op almost everywhere
      
      * Formatting
      
      * More make ops for rnns
      
      * Get rid of spaces
      
      * Formatting
      
      * Remove operators headers
      
      * Formatting
      
      * Remove unused op headers
      
      * Increase line threshold
      8d21fdc9
  24. 11 Nov, 2020 1 commit
  25. 28 Oct, 2020 1 commit
  26. 21 Sep, 2020 1 commit
  27. 10 Jul, 2020 1 commit
  28. 16 Oct, 2019 1 commit
  29. 15 Oct, 2019 1 commit
  30. 03 Oct, 2019 1 commit
    • Improve contiguous and concat performance (#368) · 9b55685c
      Paul Fultz II authored
      * Add env to trace nary device functions
      
      * Formatting
      
      * Improve contiguous and concat performance
      
      * Formatting
      
      * Remove unused variable
      
      * Formatting
      
      * Fix gpu tests
      
      * Formatting
      
      * Add more tests for transposed concat
      
      * Formatting
      
      * Compute offset and not index
      
      * Compute multi-index once
      
      * Formatting
      
      * Fix transposed inputs
      
      * Formatting
      
      * Use product order for comparisons of hip_array
      
      * Formatting
      
      * Add missing s parameter
      
      * Formatting
      
      * Don't invert permutation
      
      * Fix tidy warnings
      
      * Formatting
      
      * Remove incorrect license
      
      * Use a single integer for stride
      
      * Formatting
      
      * Fix tidy issue
      9b55685c
  31. 26 Sep, 2019 1 commit
  32. 15 Aug, 2019 2 commits
  33. 06 Jul, 2019 1 commit
  34. 02 Jul, 2019 6 commits