Commits · 210ea72d6b7f1ef16458f8a1087b783a97f84571 · gaoqiong / MIGraphX

09 Aug, 2023 1 commit

Add matcher for reshape_alias · 210ea72d

Ted Themistokleous authored Aug 03, 2023

Use this to find a reshape->contiguous pair and determine whether aliasing can be done,
so that the proper reshape or reshape lazy operator is used.

210ea72d

04 Apr, 2023 1 commit

fix bug in transpose_slice simplification (#1660) · 30af1697

shivadbhavsar authored Apr 04, 2023

Bug found through a failing torch benchmark. Added a test case to reproduce the issue, which caused the model to error out on compile.
The original logic results in the following error:
AMDMIGraphX/src/include/migraphx/op/unsqueeze.hpp:128: normalize_compute_shape: UNSQUEEZE: Axis dimenstion is not divisible by step

30af1697

13 Jan, 2023 1 commit
- Transpose slice fix (#1499) · 2c8149f6
  shivadbhavsar authored Jan 13, 2023
```
This PR resolves the bug addressed in #1496. 
```
  2c8149f6
13 Sep, 2022 1 commit

Use rocblas_gemm_ex for batched gemms with broadcasted B (#1354) · a10a8ef1

turneram authored Sep 13, 2022

Improves performance for 4 of the 6 GEMMs used by Hugging Face BERT models with batch_size>1 by using a non-batched rocBLAS call for GEMMs where the B input has a broadcasted batch dimension.
The four verify tests added reflect the actual configurations used by bert-base-cased, with varied batch sizes.

Also adds a matcher to simplify_reshapes to move multibroadcasts after concats.
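
One way to see why a non-batched call can serve the broadcasted-B case described above: when every batch multiplies by the same B, the batches of A can be stacked along the row dimension and multiplied once. The sketch below uses plain nested lists and is only an illustration of that equivalence. The function names are hypothetical, and it makes no claim about how the rocBLAS call is actually set up in this commit.

```python
def matmul(a, b):
    """Plain 2-D matrix product on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def batched_gemm_broadcast_b(a_batches, b):
    """Reference: one GEMM per batch, each using the same B."""
    return [matmul(a, b) for a in a_batches]


def single_gemm_broadcast_b(a_batches, b):
    """Stack all batches of A along the rows, multiply once, then split again."""
    m = len(a_batches[0])  # rows per batch
    stacked = [row for a in a_batches for row in a]
    flat = matmul(stacked, b)
    return [flat[i:i + m] for i in range(0, len(flat), m)]


# Two batches of 2x3 matrices sharing one 3x2 B (hypothetical data)
a_batches = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [1, 0, 1]],
]
b = [[1, 0], [0, 1], [2, 2]]

assert batched_gemm_broadcast_b(a_batches, b) == single_gemm_broadcast_b(a_batches, b)
```

The stacking only works because B carries no real batch dimension. If B differed per batch, the per-batch products could not share a single call this way.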

a10a8ef1

06 Sep, 2022 1 commit
- Enable cppcheck rule for 'not', 'or' keywords (#1361) · d37a4df9
  Paul Fultz II authored Sep 06, 2022
```
Using not and or improves readability. The cppcheck rule will help ensure we are doing it consistently.
```
  d37a4df9
17 Aug, 2022 1 commit
- Improve horizontal fusion of contiguous (#1292) · 18e4a2c6
  Paul Fultz II authored Aug 16, 2022
```
* Horizontally fuse contiguous
```
  18e4a2c6
07 Jul, 2022 1 commit

Add a step to unsqueeze axis (#1242) · bd503d89

Paul Fultz II authored Jul 07, 2022

Instead of always unsqueezing to an axis of size 1, a step can be set. For example, rather than unsqueezing {3, 12} to {3, 1, 12}, a step of 2 will unsqueeze to {3, 2, 6}.
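
The shape arithmetic in this message can be sketched as follows. This is a minimal illustration, not MIGraphX code: `unsqueeze_with_step` is a hypothetical helper, it handles a single axis, and its semantics are inferred only from the {3, 12} example above.

```python
def unsqueeze_with_step(lens, axis, step=None):
    """Return the output lens of an unsqueeze at `axis`.

    Without a step, a new axis of size 1 is inserted.
    With a step, the dimension at `axis` is split into (step, dim // step).
    Assumption: this mirrors the {3, 12} -> {3, 2, 6} example only.
    """
    lens = list(lens)
    if step is None:
        if not 0 <= axis <= len(lens):
            raise ValueError("axis out of range")
        return lens[:axis] + [1] + lens[axis:]
    if not 0 <= axis < len(lens):
        raise ValueError("axis out of range")
    if lens[axis] % step != 0:
        raise ValueError("axis dimension is not divisible by step")
    return lens[:axis] + [step, lens[axis] // step] + lens[axis + 1:]


print(unsqueeze_with_step([3, 12], 1))     # [3, 1, 12]
print(unsqueeze_with_step([3, 12], 1, 2))  # [3, 2, 6]
```

A step that does not divide the axis dimension is rejected in this sketch.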

bd503d89

05 Jul, 2022 1 commit

Horizontally fuse contiguous operators (#1232) · 27e980c4

Paul Fultz II authored Jul 05, 2022

This reorders the transposes across slice to improve horizontal fusion for contiguous. This also improves eliminate_contiguous to remove contiguous better across splits.

27e980c4

22 Jun, 2022 1 commit
- Update license files (#1248) · e44cecbc
  Ted Themistokleous authored Jun 22, 2022
```
Updated each source file in the repo with the existing license.
```
  e44cecbc
17 May, 2022 1 commit
- renamed variables for module from p to m (#1204) · a27dd28c
  shivadbhavsar authored May 17, 2022
```
Updated variable names according to #1193
```
  a27dd28c
11 May, 2022 1 commit

Prefuse layernorm for gpu (#1190) · 671f24be

Paul Fultz II authored May 11, 2022

Fuse layernorm and add triadd_layernorm fusion. This is a preparatory performance improvement.

671f24be

02 Mar, 2022 1 commit
- Clang format ver10 (#1106) · 9852aaef
  bpickrel authored Mar 02, 2022
```
Update the base version of clang-format from 5.0 to 10.0
```
  9852aaef
03 Nov, 2021 1 commit

Add tests for the DepthToSpace+Binary pointwise operations fusion (#987) · eb6abd27

Umang Yadav authored Nov 03, 2021

In migraphx, DepthToSpace (d2s) is implemented as reshape --> transpose --> contiguous --> reshape.

If there is a trailing binary pointwise operator after DepthToSpace, migraphx can move the binary operator before the contiguous and reshape of the DepthToSpace.

So, it becomes reshape-->transpose-->binary_op-->contiguous-->reshape.

Explicit contiguous wouldn't be required since binary_op outputs standard shape. So, it becomes reshape-->transpose-->binary-->reshape.

simplify_reshapes already has a matcher that can do this transformation. This PR adds tests for cases like DepthToSpace + binary op.
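
The two rewrite steps described above can be sketched on a flat list of operator names. This is a toy model only: the function names and the set of binary operators are hypothetical, a real graph pass works on instructions with inputs and shapes, and the second operand of the binary op is ignored here.

```python
BINARY_OPS = ("add", "mul", "sub")  # hypothetical set of binary pointwise ops


def move_binary_before_contiguous(ops):
    """transpose, contiguous, reshape, binary -> transpose, binary, contiguous, reshape."""
    ops = list(ops)
    for i in range(len(ops) - 3):
        window = ops[i:i + 4]
        if window[:3] == ["transpose", "contiguous", "reshape"] and window[3] in BINARY_OPS:
            return ops[:i + 1] + [window[3], "contiguous", "reshape"] + ops[i + 4:]
    return ops


def drop_contiguous_after_binary(ops):
    """Remove a contiguous that directly follows a binary op (its output is already standard)."""
    ops = list(ops)
    for i in range(len(ops) - 1):
        if ops[i] in BINARY_OPS and ops[i + 1] == "contiguous":
            return ops[:i + 1] + ops[i + 2:]
    return ops


d2s_add = ["reshape", "transpose", "contiguous", "reshape", "add"]
step1 = move_binary_before_contiguous(d2s_add)
step2 = drop_contiguous_after_binary(step1)
print(step1)  # ['reshape', 'transpose', 'add', 'contiguous', 'reshape']
print(step2)  # ['reshape', 'transpose', 'add', 'reshape']
```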

solves #905

eb6abd27

28 Oct, 2021 1 commit

DepthToSpace and pointwise unary operations fusion (#986) · cf0b6d6d

Umang Yadav authored Oct 28, 2021

In migraphx, DepthToSpace (d2s) is implemented as reshape --> transpose --> contiguous --> reshape.

This PR adds a matcher to find d2s + unary pointwise ops.

Application of the matcher moves the pointwise unary operation before the contiguous and reshape of the d2s.
So it becomes
reshape --> transpose --> unary --> contiguous --> reshape.

The motivation is that a pointwise module would later be created out of unary --> contiguous --> reshape. Codegen for this pointwise module can write out the buffer such that explicit contiguous and reshape wouldn't be required.

This transformation is not guaranteed to improve performance, since the unary op will operate on a non-standard shape. So we would need some tuning mechanism to make the decision.
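
The reason the unary op can be moved is that reshape, transpose, and contiguous only rearrange elements, so an elementwise function gives the same values whether it runs before or after them. The sketch below checks this in plain Python on a flat row-major buffer. It is an illustration under assumptions: the helper names are hypothetical, and the reshape and permutation follow the ONNX DepthToSpace DCR layout, which may differ from what a given model uses.

```python
from itertools import product


def row_major_strides(lens):
    """Strides (in elements) of a standard row-major layout."""
    strides = [1] * len(lens)
    for i in range(len(lens) - 2, -1, -1):
        strides[i] = strides[i + 1] * lens[i + 1]
    return strides


def transpose_contiguous(data, lens, perm):
    """Permute the axes of a flat row-major buffer and materialize the result."""
    strides = row_major_strides(lens)
    out_lens = [lens[p] for p in perm]
    out_strides = [strides[p] for p in perm]
    return [
        data[sum(i * s for i, s in zip(idx, out_strides))]
        for idx in product(*(range(n) for n in out_lens))
    ]


def depth_to_space(data, n, c, h, w, block):
    """DCR-mode DepthToSpace on a flat NCHW buffer with c * block * block input channels."""
    # Both reshapes leave a row-major buffer untouched; only the transpose moves data.
    lens = [n, block, block, c, h, w]
    return transpose_contiguous(data, lens, [0, 3, 4, 1, 5, 2])


def relu(values):
    return [max(v, 0) for v in values]


# n=1, c=2, block=2, h=2, w=3, so the input has 8 channels (hypothetical data)
x = [v - 20 for v in range(1 * 8 * 2 * 3)]
after = relu(depth_to_space(x, 1, 2, 2, 3, 2))
before = depth_to_space(relu(x), 1, 2, 2, 3, 2)
assert after == before
```

The values match either way. What changes is the memory access pattern of the unary op, which is why the performance effect is not guaranteed.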

#905 pending PR for binary operations.

cf0b6d6d

24 Aug, 2021 1 commit

Change attributes names to be more consistent and reflect better meaning (#916) · 0d2606bb

Umang Yadav authored Aug 24, 2021

* rename broadcast and multibroadcast output_lens attribute to out_lens attribute, and change tests and source code to reflect the same

* change the reshape attribute from dims to out_lens

* change transpose attribute's name from dims to perm to reflect better meaning

* use permutation instead of perm for transpose

clang formatting

* use dims instead of out_lens for reshape

clang formatting

0d2606bb

23 May, 2021 1 commit
- Remove vector of indices and use a lazy iota range instead (#838) · e8738144
  Paul Fultz II authored May 23, 2021
```
* Create lazy range

* Formatting

* Use lazy iota
```
  e8738144
23 Apr, 2021 1 commit

Optimize resize and where operators (#784) · 17485202

Shucai Xiao authored Apr 23, 2021



* code backup

* clang format

* add a matcher related to the special resize case for optimization

* clang format

* code backup

* clang format

* code backup

* remove unnecessary code

* add optimization for the where op

* clang format

* fix cppcheck error

* add a unit test for optimize resize

* clang format

* remove unnecessary header include

* code backup

* clang format

* add unit tests for optimizing resize

* clang format

* add more unit test for optimizing where op

* clang format

* remove unnecessary code

* add one more optimization to remove contiguous

* clang format

* add a pointwise requirement

* clang format

* fix cppcheck error

* add one more unit test

* fixed a bug

* clang format

* remove unnecessary code

* clang format

* fix a build error

* fix review comments

* clang format

* fix a review comments

* clang format

* code refinement

* clang format

* refine more code

* refine more code

* fix a bug related to reshape_cont optimization

* clang format

* fix a review comment

* removed an unnecessary comment

* refine code according to comments

* clang format
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

17485202

08 Feb, 2021 1 commit

Add a pass to remove unsupported data types (#738) · 3d24a21c

Paul Fultz II authored Feb 07, 2021



* Add eliminate_data_type pass

* Formatting

* Auto convert quant ops

* Formatting

* Flip the order of decompose

* Compute max size differently

* Formatting

* Clamp values in convert

* Formatting

* Fix loss of precision in reduce

* Formatting

* Fix bugs in reduction

* Fix accumulator type in reference softmax implementation

* Formatting

* Update convert test

* Remove unused variables

* Remove unnecessary quant_dot check

* Formatting

* Add tests

* Formatting

* Remove unused code

* Remove duplicate ops

* Remove blaze dependency

* Use set since shape::type_t is not hashable on gcc 5

* Formatting
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

3d24a21c

18 Jan, 2021 1 commit

Refactor to use tune_axis function (#713) · 651ea160

kahmed10 authored Jan 18, 2021

* initial testing

* initial testing

* add dequantize

* formatting

* add tests

* formatting

* revert file

* add parse files

* formatting

* add axis tuning and fix tests

* formatting

* add tests and fix int8

* formatting

* fix tidy

* test with int32

* add default name and change string to upper

* formatting

* remove boost call

* refactor to use tune_axis

* formatting

651ea160

08 Dec, 2020 1 commit

Refactor to use make_op almost everywhere (#696) · 8d21fdc9

Paul Fultz II authored Dec 08, 2020

* Load op when serializing

* Formatting

* Add missing clip field

* Use make_op almost everywhere

* Formatting

* More make ops for rnns

* Get rid of spaces

* Formatting

* Remove operators headers

* Formatting

* Remove unused op headers

* Increase line threshold

8d21fdc9

11 Nov, 2020 1 commit

Refactor program to module (#684) · 2466dd6f

Shucai Xiao authored Nov 11, 2020



* code backup

* clang format

* change corresponding tool files

* clang format
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

2466dd6f

28 Oct, 2020 1 commit

Fix bert fusions (#666) · 2ea40daa

Paul Fultz II authored Oct 28, 2020



* Fix fusions in bert model

* Formatting

* Add unit tests

* Formatting

* Fix one_half matcher

* Workaround ICE on gcc

* Formatting

* Tidy fixes
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

2ea40daa

21 Sep, 2020 1 commit

Concat transpose bug (#638) · 16a03b39

Shucai Xiao authored Sep 21, 2020



* fix a bug related to concat transpose.

* clang format

* use return instruction to replace the fake instruction
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

16a03b39

10 Jul, 2020 1 commit

Optimize multiply across slices (#568) · e66968a2

Paul Fultz II authored Jul 10, 2020



* Add initial optimization when using a mul over a sliced convolution

* Formatting

* Add more tests

* Formatting

* Convert to an assert

* Check if used once

* Formatting

* Add test with horiz fusion

* Formatting

* Optimize nested slice

* Formatting

* Fix test

* Add const refs

* Remove unnecessary assert
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

e66968a2

16 Oct, 2019 1 commit
- Flatten nested concats (#391) · 50e6d5eb
  Paul Fultz II authored Oct 16, 2019
```
* Flatten nested concats

* Formatting

* Rename tests
```
  50e6d5eb
15 Oct, 2019 1 commit
- Add more shape operators that can be nops (#390) · 5f2767aa
  Paul Fultz II authored Oct 15, 2019
```
* Add more shape operators that can be nops

* Dont remove pooling
```
  5f2767aa
03 Oct, 2019 1 commit

Improve contiguous and concat performance (#368) · 9b55685c

Paul Fultz II authored Oct 03, 2019

* Add env to trace nary device functions

* Formatting

* Improve contiguous and concat performance

* Formatting

* Remove unused variable

* Formatting

* Fix gpu tests

* Formatting

* Add more test for transposed concat

* Formatting

* Compute offset and not index

* Compute multi-index once

* Formatting

* Fix transposed inputs

* Formatting

* Use product order for comparisons of hip_array

* Formatting

* Add missing s parameter

* Formatting

* Dont invert permutation

* Fix tidy warnings

* Formatting

* Remove incorrect license

* Use a single integer for stride

* Formatting

* Fix tidy issue

9b55685c

26 Sep, 2019 1 commit
- Fix exception thrown when compiling inceptionv4 (#367) · 7cc3243c
  Paul Fultz II authored Sep 26, 2019
```
* Fix compiler crash in TF inceptionv4

* Formatting

* Remove else
```
  7cc3243c
15 Aug, 2019 2 commits
- formatting · d5c85636
  Khalique authored Aug 15, 2019
  
  d5c85636
- remove comment · 924a01e8
  Khalique authored Aug 15, 2019
  
  924a01e8
06 Jul, 2019 1 commit
- use const ref for parameters · 02c28d6a
  Paul authored Jul 05, 2019
  
  02c28d6a
02 Jul, 2019 6 commits
- formatting · cb9f01ae
  Khalique authored Jul 02, 2019
  
  cb9f01ae
- added fixes to reading strided slice · 523eabeb
  Khalique authored Jul 02, 2019
  
  523eabeb
- Formatting · 8c8e5fb0
  Paul authored Jul 02, 2019
  
  8c8e5fb0
- Fix permutation inversion · 139eaeed
  Paul authored Jul 02, 2019
  
  139eaeed
- Formatting · c87f5621
  Paul authored Jul 01, 2019
  
  c87f5621
- Use lazy match operators so it will still short-circuit · fa485ae6
  Paul authored Jul 01, 2019
  
  fa485ae6
01 Jul, 2019 3 commits
- Formatting · 6d56671b
  Paul authored Jul 01, 2019
  
  6d56671b
- Fix tf test · 3b64f602
  Paul authored Jul 01, 2019
  
  3b64f602
- Remove unused predicate · ea2f0cf4
  Paul authored Jun 30, 2019
  
  ea2f0cf4