Commits · 47dbf16479ae7ad1bf8869c2ca08ddb692764d09 · gaoqiong / MIGraphX

05 Jul, 2022 6 commits
- Replace skip_broadcasts with skip_broadcasts_converts · 47dbf164
  Ted Themistokleous authored Jul 05, 2022
```
Use this call to also skip converts when running a simplify_algebra pass over a program.
```
  47dbf164
- fixup! fixup! Add skip_broadcasts to simplify algebra value matchers · cd8c2f8e
  Ted Themistokleous authored Jul 05, 2022
  
  cd8c2f8e
- fixup! Add skip_broadcasts to simplify algebra value matchers · f715ad3f
  Ted Themistokleous authored Jul 05, 2022
  
  f715ad3f
- Remove variables for find_zero_div_const · 16e6cd5a
  Ted Themistokleous authored Jul 05, 2022
```
Avoids compiler warnings without having to use the [[maybe_unused]] attribute.
```
  16e6cd5a
- Add skip_broadcasts to simplify algebra value matchers · 48492136
  Ted Themistokleous authored Jul 05, 2022
```
Adds this to handle broadcasted values instead of just scalars
```
  48492136
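As a rough illustration of the idea behind skip_broadcasts and skip_broadcasts_converts, the sketch below looks through wrapper operations to find the literal underneath before comparing its value. The `node` type, `skip_ops` and `matches_literal` are hypothetical stand-ins written for this example; they are not the MIGraphX matcher API, and the op names are plain illustrative strings.

```cpp
#include <algorithm>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR instruction; not the MIGraphX instruction type.
struct node
{
    std::string op;                            // e.g. "literal", "broadcast", "convert"
    double value;                              // meaningful only when op == "literal"
    std::vector<std::shared_ptr<node>> inputs; // operands of this node
};

// Walk through single-input wrapper ops whose name is listed in `skipped`
// and return the first node that is not such a wrapper.
const node* skip_ops(const node* n, const std::vector<std::string>& skipped)
{
    while(n->inputs.size() == 1 &&
          std::find(skipped.begin(), skipped.end(), n->op) != skipped.end())
    {
        n = n->inputs.front().get();
    }
    return n;
}

// True when the node, after skipping the listed wrappers, is a literal equal to `v`.
bool matches_literal(const node* n, double v, const std::vector<std::string>& skipped)
{
    const node* base = skip_ops(n, skipped);
    return base->op == "literal" && base->value == v;
}
```

Passing only "broadcast" in `skipped` corresponds loosely to the earlier behaviour, while adding "convert" mirrors the later change that also skips converts.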
- Change threshold for has_value tolerance · 4aeacc17
  Ted Themistokleous authored Jul 05, 2022
```
Used to avoid the case where 1e-12 is erroneously matched as zero, resulting in the call being removed with an incorrect value.
```
  4aeacc17
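The effect of the tolerance can be sketched with a small standalone comparison. The helper name `approx_has_value` and the tolerance values used with it are assumptions chosen for illustration; the actual has_value threshold is not stated in the commit message.

```cpp
#include <cmath>

// Hypothetical helper, not the MIGraphX has_value matcher: reports whether
// `x` lies within `tolerance` of `target`.
bool approx_has_value(double x, double target, double tolerance)
{
    return std::fabs(x - target) <= tolerance;
}
```

With a loose tolerance such as 1e-9, `approx_has_value(1e-12, 0.0, 1e-9)` is true, so a tiny non-zero constant would be treated as zero; with a tighter tolerance such as 1e-15 the same call is false and the constant is left alone.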
01 Jul, 2022 1 commit
- fixup! fixup! fixup! Simplify algebra for 0*x, x*0 and 0/x operations · 3fc986f0
  Ted Themistokleous authored Jul 01, 2022
  
  3fc986f0
30 Jun, 2022 16 commits
- fixup! fixup! Simplify algebra for 0*x, x*0 and 0/x operations · 4b7210e1
  Ted Themistokleous authored Jun 30, 2022
  
  4b7210e1
- fixup! Simplify algebra for 0*x, x*0 and 0/x operations · d6764b7d
  Ted Themistokleous authored Jun 30, 2022
  
  d6764b7d
- fixup! Add support for division by zero errors · 41e3b9f6
  Ted Themistokleous authored Jun 30, 2022
  
  41e3b9f6
- Add support for division by zero errors · c2d4ceb0
  Ted Themistokleous authored Jun 30, 2022
```
Throw an exception when this occurs to indicate that our simplification passes resulted in a singularity somewhere. Related to #1236
```
  c2d4ceb0
- Simplify algebra for 0*x, x*0 and 0/x operations · 20d2b5d9
  Ted Themistokleous authored Jun 30, 2022
```
Simplify zero multiplication and divide operations by replacing the instruction and its operand so that it just returns zero. Added appropriate test cases with returns.
```
  20d2b5d9
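The zero rules named in the commit titles above (0*x, x*0, 0/x, and an error on division by zero) can be sketched on a toy operand type. Everything here (`operand`, `lit`, `param`, `simplify_zero`, and the choice of `std::domain_error`) is hypothetical and only illustrates the rewrite rules; it is not how the simplify_algebra pass is implemented.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical operand: either a literal constant or a named parameter.
struct operand
{
    bool is_literal;
    double value;     // meaningful only when is_literal is true
    std::string name; // meaningful only when is_literal is false
};

operand lit(double v) { return operand{true, v, ""}; }
operand param(const std::string& n) { return operand{false, 0.0, n}; }

bool is_zero(const operand& o) { return o.is_literal && o.value == 0.0; }

// Applies the zero rules. Returns true and writes the replacement to `out`
// when a rule fires; returns false when the expression is left unchanged.
bool simplify_zero(char op, const operand& lhs, const operand& rhs, operand& out)
{
    if(op == '*' && (is_zero(lhs) || is_zero(rhs)))
    {
        out = lit(0.0); // 0*x and x*0 both become 0
        return true;
    }
    if(op == '/')
    {
        // Check the divisor first so that 0/0 is reported as an error too.
        if(is_zero(rhs))
            throw std::domain_error("division by zero during simplification");
        if(is_zero(lhs))
        {
            out = lit(0.0); // 0/x becomes 0
            return true;
        }
    }
    return false;
}
```

Checking the divisor before the dividend is a design choice of this sketch: it keeps 0/0 from being silently folded to zero.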
- Simplify x - 0 and 0 - x operations · 93f733c3
  Ted Themistokleous authored Jun 29, 2022
```
Using the unit/neg unit matchers to handle subtraction operations in the same steps. Added unit tests for both cases.
```
  93f733c3
- Combine negative unit args · 62c67b45
  Ted Themistokleous authored Jun 29, 2022
  
  62c67b45
- Simplify algebra of negative division operations · 013e18cf
  Ted Themistokleous authored Jun 28, 2022
```
Part of the changes that go with #1236. Rewrites -1 divide operations as a simple negation of the parameter.
```
  013e18cf
- Remove addition matcher and roll in with div/mul unit matchers · 86c464ca
  Ted Themistokleous authored Jun 29, 2022
```
Add handling for zero addition operations into the find_unit_ops() matcher functor.
```
  86c464ca
- Simplify Algebra of 0+x & x+0 = x · 8ddb505a
  Ted Themistokleous authored Jun 29, 2022
```
Added test case and code to simplify zero additions between parameters and literals during simplifications. In reference to issue #1236.
```
  8ddb505a
- Combine matchers for unit div and multiplications · e080cf48
  Ted Themistokleous authored Jun 29, 2022
```
Simplifies our code for all operations and reuses the original unit tests for the overlapping matcher.
```
  e080cf48
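The unit rules collected across these commits (x*1, x/1, x+0 and x-0 reduce to x; x*(-1), x/(-1) and 0-x reduce to a negation) can be handled by one function, in the spirit of the combined matcher. The sketch below uses a hypothetical `operand` type and returns the rewritten expression as text; the names and the `neg(...)` notation are assumptions for illustration, not MIGraphX code.

```cpp
#include <string>

// Hypothetical operand: printable text plus an optional literal value.
struct operand
{
    std::string text; // how the operand is printed, e.g. "x"
    bool is_literal;
    double value;     // meaningful only when is_literal is true
};

operand param(const std::string& name) { return operand{name, false, 0.0}; }
operand lit(double v) { return operand{std::to_string(v), true, v}; }

bool is_value(const operand& o, double v) { return o.is_literal && o.value == v; }

// Returns the simplified expression as text, or an empty string when no
// unit rule applies.
std::string simplify_unit(char op, const operand& lhs, const operand& rhs)
{
    switch(op)
    {
    case '*':
        if(is_value(rhs, 1.0)) return lhs.text;                  // x*1 -> x
        if(is_value(lhs, 1.0)) return rhs.text;                  // 1*x -> x
        if(is_value(rhs, -1.0)) return "neg(" + lhs.text + ")";  // x*(-1) -> -x
        if(is_value(lhs, -1.0)) return "neg(" + rhs.text + ")";  // (-1)*x -> -x
        break;
    case '/':
        if(is_value(rhs, 1.0)) return lhs.text;                  // x/1 -> x
        if(is_value(rhs, -1.0)) return "neg(" + lhs.text + ")";  // x/(-1) -> -x
        break;
    case '+':
        if(is_value(rhs, 0.0)) return lhs.text;                  // x+0 -> x
        if(is_value(lhs, 0.0)) return rhs.text;                  // 0+x -> x
        break;
    case '-':
        if(is_value(rhs, 0.0)) return lhs.text;                  // x-0 -> x
        if(is_value(lhs, 0.0)) return "neg(" + rhs.text + ")";   // 0-x -> -x
        break;
    default: break;
    }
    return "";
}
```

Note that division is not symmetric: 1/x is deliberately left alone, since only a unit divisor can be dropped.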
- Simplify algebra for x / 1 operations · 37fe2f04
  Ted Themistokleous authored Jun 24, 2022
```
Done to satisfy simplifications specified by #1236. Just replace every parameter divided by 1 with itself. It's assumed that the eliminate_identity() pass will handle generated identity operators in our run_pass().
```
  37fe2f04
- Simplify Algebra for x*(-1) operations to simply negative x · 61ef5263
  Ted Themistokleous authored Jun 27, 2022
```
Saves a multiply operation by replacing it with a negation of the input parameter x. Suggested improvement via #1236
```
  61ef5263
- fixup! Initial algebraic simplification for x*1 · e9f247b5
  Ted Themistokleous authored Jun 27, 2022
  
  e9f247b5
- Initial algebraic simplification for x*1 · 49406cee
  Ted Themistokleous authored Jun 23, 2022
```
Commit for the day, work in progress as I'm failing one of our unit tests outside of the change
```
  49406cee
- Add method to insert multiple instructions (#1178) · 2783c649
  Paul Fultz II authored Jun 29, 2022
```
This is an extension to insert_module_instructions, but instead of just inserting from a module, it can insert a range or a vector of instructions.
```
  2783c649
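The difference between inserting from a whole container and inserting an arbitrary range can be sketched with standard containers. This is a deliberately simplified model: instructions are plain strings in a `std::list`, the function names are hypothetical, and the remapping of instruction arguments that a real IR would need is omitted.

```cpp
#include <list>
#include <string>
#include <vector>

// Toy program representation: an ordered list of instruction names.
using instruction_list = std::list<std::string>;

// Insert the range [first, last) before `pos` and return an iterator to the
// first inserted element (or `pos` when the range is empty).
template <class Iterator>
instruction_list::iterator insert_range(instruction_list& dst,
                                        instruction_list::iterator pos,
                                        Iterator first,
                                        Iterator last)
{
    return dst.insert(pos, first, last);
}

// Convenience overload for a vector of instructions.
instruction_list::iterator insert_vector(instruction_list& dst,
                                         instruction_list::iterator pos,
                                         const std::vector<std::string>& instructions)
{
    return insert_range(dst, pos, instructions.begin(), instructions.end());
}
```

For example, inserting the vector {"b", "c"} before the second element of {"a", "d"} yields {"a", "b", "c", "d"}.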
29 Jun, 2022 2 commits
- NMS refactor, enable nonstandard shape (#1257) · ad73abbc
  Charlie Lin authored Jun 29, 2022
```
Allows PyTorch converted version of SSD-resnet34 to work
```
  ad73abbc
- Update driver models to use json strings (#1244) · ad27d0d6
  Paul Fultz II authored Jun 29, 2022
```
Compiles significantly faster than constructing all the objects, and also reduces recompiles.
```
  ad27d0d6
26 Jun, 2022 1 commit
- Get parent module in the pass manager (#1181) · 3a5c4306
  Paul Fultz II authored Jun 26, 2022
```
* Add function to get a module tree
* Get parent module in the pass manager
```
  3a5c4306
25 Jun, 2022 2 commits
- bug fix: register the miopen_fusion op. (#1267) · 3b0a9116
  Brian Pickrell authored Jun 24, 2022
```
One-line fix to register the op miopen_fusion. This error was causing loading of compiled model files (*.mxr) to fail.
```
  3b0a9116
- Use jit for contiguous operator (#1217) · b75c83d8
  Paul Fultz II authored Jun 24, 2022
```
* Jit contiguous
```
  b75c83d8
24 Jun, 2022 1 commit

Add compute_method for the experimental custom op (#1194) · edc7be5c

Umang Yadav authored Jun 24, 2022

Adds compute_method for the experimental custom ops.
Adds a test for the same using HIP APIs.
Depends on #1183
Solves #1101

edc7be5c

23 Jun, 2022 1 commit
- remove eliminate_workspace pass (#1254) · f5760e21
  kahmed10 authored Jun 23, 2022
```
* remove eliminate workspace
* remove sync device and other tags
```
  f5760e21
22 Jun, 2022 1 commit
- Update license files (#1248) · e44cecbc
  Ted Themistokleous authored Jun 22, 2022
```
Updated each source file in the repo with the existing license.
```
  e44cecbc
20 Jun, 2022 1 commit
- Fixing misspelled macro to enable MIOpen hidden find mode API (#1250) · c0398ded
  Zhuoran Yin authored Jun 20, 2022
```
* Fixing misspelled macro
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
```
  c0398ded
17 Jun, 2022 3 commits

Update lowering of Dot operator (#1247) · c99be32c

Umang Yadav authored Jun 17, 2022

* remove code for allocation of C param in dot lowering

* formatting
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

c99be32c

Update tf_parser to have add_common_op() for parse_relu6 (#1241) · 421a5621

Ted Themistokleous authored Jun 17, 2022

* [#935] Update tf_parser to have add_common_op() for parse_relu6

Similar to onnx_parser.cpp, add an add_common_op template and functionality to support clip-based operations. This is done so clip operations can be guaranteed to have the same dimensions.

* fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6

* fixup! fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6

* fixup! fixup! fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6

* fixup! fixup! fixup! fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6

* Formatting

* fixup! Formatting
Co-authored-by: Umang Yadav <29876643+umangyadav@users.noreply.github.com>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

421a5621

Create allocate op and replace_allocate pass (#1183) · add6fb3b

kahmed10 authored Jun 17, 2022

* add allocate op header

* formatting

* add replace_allocate pass

* formatting

* move output param to remove_allocate pass

* formatting

* fix bugs in replace_allocate pass

* formatting

* fix verify if tests

* formatting

* move if op logic

* formatting

* cleanup lowering

* cleanup lowering

* formatting

* fix tidy

* formatting

* fix tidy

* add cpu allocate check

* formatting

* change cpu allocate in pass

* formatting

* add some tests for replace_allocate pass

* formatting

* pass by ref

* fix run_pass

* formatting

* update variable name for module

* update dce to use contains() and fix tidy

* formatting

* update cppcheck

* add if test

* formatting

* add if test

* rename var to mod_output_names

* formatting

* remove conditional

* update allocate op and tests

* formatting

* update replace_allocate tests

* update create_output_names() and conditional in replace_allocate

* formatting

* remove extra variable in replace_allocate

* update tools script for allocation_model
Co-authored-by: Umang Yadav <29876643+umangyadav@users.noreply.github.com>
Co-authored-by: Chris Austen <causten@users.noreply.github.com>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

add6fb3b

16 Jun, 2022 1 commit

Instruction distance check fix (#1237) · f5980619

Charlie Lin authored Jun 16, 2022

* Use custom distance function

* Pass module, skip order check if other module

* Change other valid()

* Remove unnecessary declaration

* test multiple module dependency

* Refactor to make more clear

* Code cleanup

* Simplify fix

* Test EXPECT
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

f5980619

10 Jun, 2022 1 commit

Add vectorized reduce (#1202) · aa7ff911

Paul Fultz II authored Jun 09, 2022

Consolidate the vectorize and preload
Add vectorization to reduction
Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>

aa7ff911

07 Jun, 2022 1 commit

Prioritizing int8 over int8x4 when it is applicable (#1218) · 37c47504

Zhuoran Yin authored Jun 07, 2022

prioritizing int8 over int8x4 when it is applicable
Amend return to continue in apply loop
Adding error handling in case int8x4 compilation failed
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

37c47504

03 Jun, 2022 1 commit

Group code objects by kernel name in perf report summary (#1234) · 7271ddbc

Paul Fultz II authored Jun 02, 2022

Break up the gpu::code_object print to show the actual kernels...

gpu::code_object::add_kernel: 0.646121ms, 5%
gpu::code_object::mul_kernel: 0.623822ms, 5%
gpu::code_object::add_mul_erf_add_mul_mul_kernel: 0.498902ms, 4%
gpu::code_object::mul_add_kernel: 0.478352ms, 4%

7271ddbc
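Grouping timings by kernel name, as in the summary lines above, amounts to keying the totals on a qualified name instead of the bare operator name. The sketch below shows that aggregation with standard containers; `qualified_name` and `group_by_name` are hypothetical helpers for illustration and are not taken from the MIGraphX perf report code.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Build the grouping key: the operator name, extended with the kernel name
// when one is available (e.g. "gpu::code_object::add_kernel").
std::string qualified_name(const std::string& op, const std::string& kernel)
{
    return kernel.empty() ? op : op + "::" + kernel;
}

// Sum the measured times (in ms) of all samples that share the same key.
std::map<std::string, double>
group_by_name(const std::vector<std::pair<std::string, double>>& samples)
{
    std::map<std::string, double> totals;
    for(const auto& sample : samples)
        totals[sample.first] += sample.second;
    return totals;
}
```

Each total can then be divided by the overall sum to produce the percentage column shown in the report.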

02 Jun, 2022 1 commit
- Fix dangling reference with gemm add fusion (#1233) · 1339ba35
  Paul Fultz II authored Jun 01, 2022
  
  1339ba35