Commits · c6ec66385733d7aab00600d517a71a46cd651fae · gaoqiong / MIGraphX

15 Nov, 2023 1 commit

Support per-axis quantization (#2390) · 0039b11a

shivadbhavsar authored Nov 15, 2023

Reworked the simplify_qdq pass to support:

Per-axis quantization (i.e., allow 1D scales and zero points)
Allow broadcast and transpose ops between dq and quant_op
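
As a rough illustration of what "per-axis" means here, the sketch below quantizes and dequantizes a small 2-D tensor with one scale and one zero point per row, following the usual ONNX-style formula q = clamp(round(x / scale) + zero_point). It does not reproduce the simplify_qdq pass itself; the function names `quantize_per_axis` and `dequantize_per_axis`, the int8 range, and the choice of axis 0 are assumptions made for the example.

```python
# Hypothetical helpers for illustration only; not part of MIGraphX.

def quantize_per_axis(x, scales, zero_points, qmin=-128, qmax=127):
    """Quantize a 2-D list row by row: row i uses scales[i] and zero_points[i]."""
    out = []
    for row, scale, zp in zip(x, scales, zero_points):
        # Python's round() rounds halves to even, matching the ONNX convention.
        out.append([max(qmin, min(qmax, round(v / scale) + zp)) for v in row])
    return out


def dequantize_per_axis(q, scales, zero_points):
    """Invert the mapping: x_hat = (q - zero_point) * scale, per row."""
    return [[(v - zp) * scale for v in row]
            for row, scale, zp in zip(q, scales, zero_points)]


x = [[0.5, -1.0, 0.25],
     [10.0, -20.0, 5.0]]
scales = [0.25, 5.0]      # 1D: one scale per row (axis 0)
zero_points = [0, 10]     # 1D: one zero point per row

q = quantize_per_axis(x, scales, zero_points)
x_hat = dequantize_per_axis(q, scales, zero_points)
print(q)
print(x_hat)
```

With a single per-tensor scale, the two rows above (which differ in magnitude by more than an order of magnitude) would have to share one step size; the 1D parameters let each row use its own.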

0039b11a

11 Oct, 2023 1 commit
- a few c++ fixes to allow compilation on Windows (#2282) · a50cb302
  Artur Wojcik authored Oct 11, 2023
  
  a50cb302
07 Sep, 2023 1 commit
- Fuse pointwise modules across reshapes (#1940) · 1d8840b8
  Paul Fultz II authored Sep 07, 2023
  
  1d8840b8
28 Aug, 2023 1 commit
- Use matcher name for the `TRACE_MATCHES_FOR` (#2127) · 4bce712a
  Umang Yadav authored Aug 28, 2023
  
  4bce712a
30 Jun, 2023 1 commit
- Add a MIGRAPHX_TRACE_MATCHES_FOR env variable to filter the matcher trace (#1858) · ad618aa9
  Paul Fultz II authored Jun 29, 2023
  
  ad618aa9
17 Jun, 2023 1 commit

Add trace for SIMPLIFY_ALGEBRA matches (#1838) · a0fa3742

Ted Themistokleous authored Jun 17, 2023

* Add trace for SIMPLIFY_ALGEBRA matches

* Fix format

* handle review comments from Umang

-int to size_t for trace
-move env arg to top of simplify_algebra.cpp
-handle overload better for find_matches

* Rename trace_mod param to trace_pass

More representative naming for what this trace flag does

a0fa3742

09 Jun, 2023 1 commit
- Fix compile warnings for shadowing variable names (#1825) · dfde6d07
  Umang Yadav authored Jun 09, 2023
  
  dfde6d07
30 May, 2023 1 commit
- Add option to use type erased matchers to reduce symbol names (#1755) · 55f420fb
  Paul Fultz II authored May 30, 2023
  
  55f420fb
10 Apr, 2023 1 commit

Fix 2 input broadcast bug for dynamic batch and output parameter ordering (#1669) · d3eb5609

Charlie Lin authored Apr 10, 2023

Adds a matcher to split_single_dyn_dim to find all broadcast or multibroadcast with two static shape inputs and replaces the instruction with the one input version.
Sorts the get_output_parameters() list to ensure the correct ordering. (Was getting an error for some models.)

d3eb5609

05 Apr, 2023 1 commit
- Add MIGRAPHX_VALIDATE_MATCHES env variable to validate each matcher (#1372) · a123cb2e
  Paul Fultz II authored Apr 05, 2023
  * Add MIGRAPHX_VALIDATE_MATCHES env variable to validate each matcher
  a123cb2e
27 Aug, 2022 1 commit

Improvements to handling and add constant passed to dot operator (#1280) · 8752875a

Paul Fultz II authored Aug 26, 2022

This rewrites dot operators of the form X(Y + b) into XY + Xb when b is constant, so that the add can be folded away.
It also improves the handling of pointwise operators with broadcasts, which helps const propagation.
Improve gemm fusion with a mul_add
Improve support for broadcast shapes in gemm
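
The dot rewrite described above relies on matrix multiplication distributing over addition. The following minimal sketch checks that identity on small integer matrices; the helpers `matmul` and `add` are hypothetical and stand in for the dot and add operators. It assumes X is also known at compile time (for example, weights), in which case the Xb term can be precomputed once.

```python
# Illustrative check of X(Y + b) == XY + Xb; not MIGraphX code.

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def add(a, b):
    """Elementwise sum of two equally shaped matrices."""
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


X = [[1, 2], [3, 4]]
Y = [[5, 6], [7, 8]]   # runtime input in this sketch
b = [[1, 0], [0, 1]]   # constant

original = matmul(X, add(Y, b))        # X(Y + b)
Xb = matmul(X, b)                      # constant term, computable ahead of time
rewritten = add(matmul(X, Y), Xb)      # XY + Xb
print(original, rewritten)
```

After the rewrite, the only work left at run time is the product XY plus one add of a precomputed constant.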

8752875a

03 Jul, 2022 1 commit

Add mlir fusion (#1251) · ca8a54fe

Paul Fultz II authored Jul 03, 2022

* Add mlir c api

* Formatting

* Create a type attribute

* Formatting

* Parse module

* Formatting

* Add mlir dump function

* Add test case

* Formatting

* Fix tidy issues

* Update mlir version

* Update to newer mlir

* Format

* Move mlir to the gpu and update the test

* Formatting

* Fix bug when appending module

* Format

* Remove old cmake flag

* Update message

* Add return

* Format

* Add mlir_compile

* Format

* Register dialect

* Handle unsigned integers

* Dont provide output for return instruction

* Format

* Add code to insert memrefs

* Format

* Add mlir verification

* Formatting

* Enable pointwise_fusion

* Disable eliminate_data_type

* Set kernel name

* Format

* Fix device name

* Formatting

* Fix output arg

* Format

* Updates

* Update hash

* Add fuse_mlir pass

* Format

* Add fuse mlir

* Format

* Update mlir

* Sort parameter names

* Format

* Reenable disabled passes

* Remove old mlir conv

* Remove asym default padding

* Add more verbose tracing

* Format

* Fix compilation errors

* Format

* Whitelist operators

* Format

* Add namespace

* Format

* Update triple

* Format

* Use func dialect

* Format

* Use func.return

* Format

* Upgrade mlir version

* Add comment

* Handle symmetrical padding

* Format

* Cleanup debug output

* Format

* List failed tests

* Move mlir compile to jit pipeline

* Format

* Update version

* Add source locations

* Format

* Correctly add module

* Format

* Update failed tests

* Fix failures when mlir is disabled

* Format

* Update mlir version

* Check type for fp32

* Format

* Remove failed test

* Update mlir in driver

* Tidy fixes

* Format

* Tidy fixes

* Format

* Fix const

* Remove from requirements

* Fix cmake version

* Fix tidy warning

* Use another ifdef

* Fix tidy

* Other tidy fix

* Format

* Update hash

* Add missing license files

* Format

* Format

* Fix function name

ca8a54fe

22 Jun, 2022 1 commit
- Update license files (#1248) · e44cecbc
  Ted Themistokleous authored Jun 22, 2022
  Updated each source file in the repo with the existing license.
  e44cecbc
20 May, 2022 1 commit
- Improve matching with has_value when there are convert operators (#1212) · 27af0170
  Paul Fultz II authored May 19, 2022
  
  27af0170
11 May, 2022 1 commit

Prefuse layernorm for gpu (#1190) · 671f24be

Paul Fultz II authored May 11, 2022

Fuse layernorm and add triadd_layernorm fusion. This is a preparatory performance improvement.

671f24be

02 Mar, 2022 1 commit
- Clang format ver10 (#1106) · 9852aaef
  bpickrel authored Mar 02, 2022
  Update the base version of clang-format from 5.0 to 10.0
  9852aaef
18 Aug, 2021 1 commit

Optimize Q/DQ Format Pass (#889) · 0b5f33b6

turneram authored Aug 18, 2021

* Add operators, refactor parsers, add rewrite passes, add tests

* Add ref implementations

* Move broadcasting of scales and zero points to onnx parser

* Allow for x and zero_point to have different types in quantizelinear; fix zero_point default type

* Switch certain variables to int64_t

* Fix overflow in implicit constant conversion

* Remove operators.hpp from includes in tf_test.cpp

* Add conversion for int32 input to quantizelinear and add test case; remove operators.hpp from onnx_test.cpp includes

* Switch dequantizelinear math from int32 to float

* Remove changes to operators.hpp

* Simplify apply_quantizelinear

* Add verify test for int32 data

* Add rewrite_quantization back to CMakeLists

* Add passes to insert qdq after add_bias is applied, replace quant_ops, and remove remaining qdq pairs

* Renaming, refactoring, cleaning up code, adding formal test, and adding passes to targets

* Renaming, review comments, begin adding more specific tests

* Add more specific unit tests

* Fix failing test on CI

* Correct matcher and update qop rewriting, update tests and add more tests

* Update matcher, clean up simplify_qdq, tweak tests

* Add tests, remove pass from CPU target, update dot parameters, clean up simplify_qdq

* Fix correctness bug in ref q/dq implementations; edit gemm parser to make beta always 0.0

* Remove unused variables in onnx gemm tests

0b5f33b6

13 Jul, 2021 1 commit

Fix compile errors with ubuntu 20.04 (#880) · 59a2954a

Paul Fultz II authored Jul 13, 2021

* Add build for ubuntu 20.04

* Fix ambiguous overload resolution with stream

* Fix warning

* Capture by value

* Format

59a2954a

10 Jun, 2021 1 commit

Dont match or bind to global instructions (#826) · c72a047f

Paul Fultz II authored Jun 10, 2021



* Add optional header

* Formatting

* Use optional in the matcher

* Formatting

* Remove program from tests

* Formatting

* Dont bind or match non-local variables

* Formatting

* Fix gcc 5 error

* Format
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

c72a047f

23 Apr, 2021 1 commit

Optimize resize and where operators (#784) · 17485202

Shucai Xiao authored Apr 23, 2021



* code backup

* clang format

* add a matcher related to the special resize case for optimization

* clang format

* code backup

* clang format

* code backup

* remove unnecessary code

* add optimization for the where op

* clang format

* fix cppcheck error

* add a unit test for optimize resize

* clang format

* remove unnecessary header include

* code backup

* clang format

* add unit tests for optimizing resize

* clang format

* add more unit test for optimizing where op

* clang format

* remove unnecessary code

* add one more optimization to remove contiguous

* clang format

* add a pointwise requirement

* clang format

* fix cppcheck error

* add one more unit test

* fixed a bug

* clang format

* remove unnecessary code

* clang format

* fix a build error

* fix review comments

* clang format

* fix a review comment

* clang format

* code refinement

* clang format

* refine more code

* refine more code

* fix a bug related to reshape_cont optimization

* clang format

* fix a review comment

* removed an unnecessary comment

* refine code according to comments

* clang format
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

17485202

22 Apr, 2021 1 commit

Cpu fusions using post_ops (#781) · f7befe50

Paul Fultz II authored Apr 22, 2021



* Add eliminate_data_type pass

* Formatting

* Auto convert quant ops

* Formatting

* Flip the order of decompose

* Compute max size differently

* Formatting

* Clamp values in convert

* Formatting

* Fix loss of precision in reduce

* Formatting

* Fix bugs in reduction

* Fix accumulator type in reference softmax implementation

* Formatting

* Update convert test

* Remove unused variables

* Remove unnecessary quant_dot check

* Formatting

* Add tests

* Formatting

* Remove unused code

* Remove duplicate ops

* Remove blaze dependency

* Use set since shape::type_t is not hashable on gcc 5

* Formatting

* Add dnnl binary op

* Formatting

* Add binary and eltwise

* Formatting

* Add softmax

* Formatting

* Remove unused operators

* Add missing files

* Formatting

* Add lrn

* Formatting

* Add deconvolution

* Formatting

* Change allocate default

* Add reorder

* Formatting

* Add reductions

* Formatting

* Sort lines

* Change literals in another loop

* Add pow operator

* Formatting

* Add pow operator

* Formatting

* Make sure shapes are packed

* Allow broadcasted inputs

* Remove unused operators

* Simplify functions

* Remove softmax

* Add sub and erf functions

* Formatting

* Fix bug

* Formatting

* Improve parallelism

* Formatting

* Allow multiple batch dimensions

* Formatting

* Move literal transforms out of lowering

* Formatting

* Add gather operator

* Sort lines

* Add early exit for carry

* Formatting

* Add missing concat

* Rename macro

* Fix deep nesting

* Formatting

* Fix cppcheck issues

* Remove else

* Move attribute to typedef

* Formatting

* Disable maybe-uninitialized warning since it's broken on gcc

* Add constexpr default constructor

* Formatting

* Fix compiler warnings

* Fix adjust_allocation test

* Add layernorm matcher

* Add gelu_erf matcher

* Formatting

* Add gelu_tanh matcher

* Formatting

* Remove match namespace

* Formatting

* Use matcher instead of string

* Formatting

* Add fusions

* Formatting

* Add post op field

* Formatting

* Make post_ops serializable

* Formatting

* Add eltwise fusions

* Formatting

* Fix null conversions

* Formatting

* Add fuse_ops source files

* Formatting

* Set binary post op index correctly

* Formatting

* Fix serialization bugs

* Check if used once

* Formatting

* Fix error in get_primitive_attr

* Formatting

* Add compile function

* Formatting

* Limit fusions

* Formatting

* Disable with env variable instead of using compile arg

* Formatting

* Fix implicit conversion to bool

* Declare on separate lines

* Formatting

* Fix cppcheck issues

* Fix ICE in pack_join

* Formatting

* Use const ref

* Make enum hashable

* Formatting

* Add explicit this

* Fix merge issues

* Fix dangling ref

* Formatting

* Add test for compile

* Formatting

* Add more value tests

* Formatting
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

f7befe50

05 Apr, 2021 1 commit

Module build exec (#765) · 41c0487b

Shucai Xiao authored Apr 05, 2021



* code cleanup

* clang format

* backup code

* clang format

* remove unnecessary code

* clang format

* add module print function

* code backup

* refine the module::print function

* refine the module::to_value() function

* code backup

* backup code changes

* code backup

* remove to_value and from_value function from the module class

* rename a function

* rename the if operator

* refine the if operator

* refine the print function of module and program

* code backup

* code backup

* fix a build warning

* fix overload of compute_shape function

* code backup

* fix unit test error

* fix cppcheck error

* fix the issue related to the overload of compute_shape

* fix review comments

* fix cppcheck error

* change the return name of if_op to be if

* clang format

* fix two unit tests

* clang format

* rename variables

* clang format

* remove the unused compute_op function

* clang format

* add lowering of if operator and compute_op function

* clang format

* add parsing if operator in onnx file

* clang format

* fix clang tidy format

* clang format

* add the gpu implementation of the if operator

* enhance the validate function and uncomment a unit test

* clang format

* remove unnecessary code

* add sub_module processing in ref passes

* clang format

* clang format

* fix a hang issue related to the valid function

* fix an issue in replace_refs

* clang format

* fix review comments

* clang format

* fix cppcheck error

* clang format

* add a unit test for more code coverage

* clang format

* fix review comments and add test for more code coverage

* clang format

* fix cppcheck error

* clang format

* fix cppcheck error

* fix a cppcheck error

* clang format

* backup code

* clang format

* fix cppcheck error

* clang format

* some code refinement

* clang format

* code backup to handle submodules in module compilation

* clang format

* code backup

* clang format

* code backup

* clang format

* fix a bug related to literal id

* fix a bug in gpu execution

* change the way of compiling a graph

* clang format

* backup more changes

* clang format

* refine pass log information

* remove unnecessary code

* clang format

* temp changes backup

* clang format

* add module name prefix to scratch memory id in hip_memory_allocation

* clang format

* change to copy the cond input by inserting a copy instruction

* clang format

* change to use the if output argument as the submodule output so can remove a gpu_copy

* clang format

* consider submodule in some compile passes

* clang format

* fix review comments

* clang format

* fix issues related to scratch memory

* clang format

* remove unnecessary code

* fix cppcheck error

* clang format

* resolve the implicit dependencies issue related to submodule

* clang format

* fix cppcheck error

* clang format

* backup temp changes

* clang format

* fixed a bug in the has_instruction function

* clang format

* fix the return value of the gpu implementation of the if operator

* fix a bug in the compute_shape function in the gpu implementation

* add an if onnx unit test

* clang format

* add more unit tests

* clang format

* tmp code backup

* clang format

* fix a sync problem related to copy cond argument from gpu to cpu

* clang format

* change the compile offload copy flag setting

* clang format

* enable copy from cpu to be able to do synchronous copy

* clang format

* add more unit tests

* add more unit tests

* add more ref unit tests

* clang format

* fixed a bug error

* tmp code backup

* clang format

* fixed an onnx verify unit test

* add more unit tests

* clang format

* reverse a change

* fix cppcheck error

* fix cppcheck error

* fix to print all instructions in program execution

* clang format

* fix bugs related to memory coloring and offload copy to be true

* clang format

* remove unnecessary include header file

* sort test cases in ref_cpu_ops alphabetically

* clang format

* add a flag to disable cpu target in verification test

* change the way to disable some tests

* clang format

* disable verify unit test of the if operators

* add a function call to have more code coverage

* fix a build error

* fix review comments

* fix review comments

* clang format

* add a api gpu unit test for more code coverage

* clang format

* change to use instruction.size() as node index

* move the calc_implicit_deps function to module class as a member function

* clang format

* move the offload_copy flag setting to lowering

* clang format

* assign the module_eval lambda function to a variable to simplify code

* clang format

* move the compute function from ref/gpu implementation to the main if operator

* clang format

* fix cppcheck error

* add a unit test for more code coverage

* clang format

* add unit test to calculate implicit deps

* add a python unit test

* clang format

* refine a unit test to have more code coverage

* clang format

* change the way arguments are wrapped up for submodules

* clang format

* fix some build errors

* code cleanup

* refine unit tests to have more code coverage

* clang format

* refine unit test to have more code coverage

* code backup

* clang format

* add memory coloring test

* refine memory coloring unit test

* clang format

* remove an unnecessary line

* remove an unused line

* remove an unnecessary parameter in the lambda function

* clang format

* refine a unit test

* remove an unnecessary line

* refine unit tests to have more code coverage

* clang format

* combine two lines

* add one more unit test for more code coverage

* clang format

* add one more unit test

* clang format

* fix review comments

* refine a print out information

* fix review comments

* clang format

* change the sync copy to using a gpu device sync

* clang format

* remove unnecessary code
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

41c0487b

25 Mar, 2021 1 commit

Add cpu fusion for gelu and layernorm (#761) · 728d083d

Paul Fultz II authored Mar 25, 2021



* Add eliminate_data_type pass

* Formatting

* Auto convert quant ops

* Formatting

* Flip the order of decompose

* Compute max size differently

* Formatting

* Clamp values in convert

* Formatting

* Fix loss of precision in reduce

* Formatting

* Fix bugs in reduction

* Fix accumulator type in reference softmax implementation

* Formatting

* Update convert test

* Remove unused variables

* Remove unnecessary quant_dot check

* Formatting

* Add tests

* Formatting

* Remove unused code

* Remove duplicate ops

* Remove blaze dependency

* Use set since shape::type_t is not hashable on gcc 5

* Formatting

* Add dnnl binary op

* Formatting

* Add binary and eltwise

* Formatting

* Add softmax

* Formatting

* Remove unused operators

* Add missing files

* Formatting

* Add lrn

* Formatting

* Add deconvolution

* Formatting

* Change allocate default

* Add reorder

* Formatting

* Add reductions

* Formatting

* Sort lines

* Change literals in another loop

* Add pow operator

* Formatting

* Add pow operator

* Formatting

* Make sure shapes are packed

* Allow broadcasted inputs

* Remove unused operators

* Simplify functions

* Remove softmax

* Add sub and erf functions

* Formatting

* Fix bug

* Formatting

* Improve parallelism

* Formatting

* Allow multiple batch dimensions

* Formatting

* Move literal transforms out of lowering

* Formatting

* Add gather operator

* Sort lines

* Add early exit for carry

* Formatting

* Add missing concat

* Rename macro

* Fix deep nesting

* Formatting

* Fix cppcheck issues

* Remove else

* Move attribute to typedef

* Formatting

* Disable maybe-uninitialized warning since it's broken on gcc

* Add constexpr default constructor

* Formatting

* Fix compiler warnings

* Fix adjust_allocation test

* Add layernorm matcher

* Add gelu_erf matcher

* Formatting

* Add gelu_tanh matcher

* Formatting

* Remove match namespace

* Formatting

* Use matcher instead of string

* Formatting

* Add fusions

* Formatting

* Make input a const ref

* Make this explicit for gcc 5
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

728d083d

18 Mar, 2021 1 commit

Add tf C++ API (#770) · 51fb672d

kahmed10 authored Mar 18, 2021



* fix relu6

* add more transposes

* add parse_tf calls

* progress on multi_outputs

* formatting

* add multi output test

* add comment and update migraphx.py

* fix compile

* formatting

* update tools/api

* formatting

* fix function call

* fix generate

* simplify tests

* formatting

* rename tests

* enclose braces

* add more tests

* update comments

* rename file and add default param

* formatting

* fix tidy and change type

* formatting older files
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

51fb672d

03 Mar, 2021 1 commit

Bug simplify algebra (#753) · 0fa539da

Shucai Xiao authored Mar 03, 2021



* fix issue#727

* clang format

* refine unit tests

* fix cppcheck error

* fix review comments

* refine a unit test to cover more code changes

* fix cppcheck error

* remove unnecessary include file

* fix review comments

* clang format
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

0fa539da

08 Jan, 2021 1 commit

Revamp CI infrastructure (#706) · ceb4ca09

Paul Fultz II authored Jan 08, 2021



* Add build and test github workflow

* Fix cget command

* Remove def-requirements.txt

* Add tmate session to debug workflow

* Run tmate session after installing dependencies

* Print date periodically

* Add clang tidy action

* Separate build and run container in two different jobs

* Run bash script

* Remove interactive flag

* Try to mount the files

* Try to use the github workspace

* Without double braces

* Use env variable

* Pipe bash script in

* Run using hip-clang

* Use correct path

* Add verbose

* Remove j flag

* Only run for onnx file to debug

* Manually run clang-tidy

* Remove quiet flag

* Print header file

* Printout environment

* Remove extra defines

* Remove fixits and config flag

* Show ldd

* Add tmate session

* Run onnx protobuf first

* Generate proto for tensorflow

* Update cppcheck version

* Fix some cppcheck issues

* Add const

* Cppcheck fixes

* Formatting

* Fix more cppcheck issues

* Run two jobs

* Cache analysis and run format checking

* Fix yaml issues

* Fix yaml issues

* Fix indentation

* Switch to hip-clang for main docker file

* Use hip-clang in the readme

* Fixes for jenkins

* Use ccache to build

* Combine file

* Set restore keys

* Change stage name

* Build with ccache

* Add missing dependency for ccache

* Build debug with codecov

* Fix workflow syntax

* Fix list

* Use quotes

* Go to correct build path

* Install lcov

* Use sudo

* Echo all commands

* Setup tmate

* Add verbose output

* Build with cmake directly

* Add pthread flag

* Remove python config

* Continue on error

* Use on or off for cmake flag

* Use always upload cache

* Verbose output

* Verbose output from build

* Build one target

* Reduce debug symbols

* Increase garbage collection

* Remove dmesg

* Increase it to 20

* Update rocm cmake version

* Remove jobs from jenkins

* Run on all 3 ubuntus

* Remove gcc 5 jobs

* Dont add flag on 16.04

* Only upload coverage on 18.04

* Dont build for ubuntu 20.04

* Use matrix.os

* Use O2 for hip-clang since lower optimizations are broken

* Use rocm 3.0

* Pass ccache as cmake variable instead of env variable

* Build miopen from source

* Show ccache statistics

* Print log information

* Set compression level

* Use hash dir

* Set hashdir

* Install clang ocl from system

* Up compression level

* Add locale

* Increase cache size to 1G

* Lower compression level to 9

* Remove split dwarf

* Remove Og

* Add back Og

* Separate debug and codecov

* Add missing backslash

* Garbage collect more often

* Add missing locales package

* Use Os

* Install onednn in docker and run tests

* Include target headers in tests

* Increase timeout

* Remove if condition

* Make flag public

* Suppress memory leaks in onednn

* Use equal

* Add gh annotations

* Update rocm-cmake version

* Add ldconfig
Co-authored-by: Shucai Xiao <shucai@gmail.com>

ceb4ca09

11 Nov, 2020 1 commit

Refactor program to module (#684) · 2466dd6f

Shucai Xiao authored Nov 11, 2020



* code backup

* clang format

* change corresponding tool files

* clang format
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

2466dd6f

28 Oct, 2020 1 commit

Fix bert fusions (#666) · 2ea40daa

Paul Fultz II authored Oct 28, 2020



* Fix fusions in bert model

* Formatting

* Add unit tests

* Formatting

* Fix one_half matcher

* Workaround ICE on gcc

* Formatting

* Tidy fixes
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

2ea40daa

16 Sep, 2020 1 commit

Add pointwise attribute to operators (#634) · 24933bd8

Paul Fultz II authored Sep 16, 2020



* Add pointwise attribute

* Formatting

* Fix compilation

* Remove unused variable

* Formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

24933bd8

22 Jul, 2020 1 commit

Gelu matcher fix (#583) · 49a31c65

kahmed10 authored Jul 22, 2020



* fix matcher

* formatting

* change has_value function

* formatting

* check for literal in matcher
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

49a31c65

21 May, 2020 1 commit

Skip fusing group convolutions (#531) · bfbf0c27

Paul Fultz II authored May 21, 2020



* Skip fusing group convolutions

* Formatting

* Fix ICE on gcc 5

* Formatting

* Fix gcc check

* Formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

bfbf0c27

15 May, 2020 1 commit

Add gelu optimization (#521) · 0079028a

kahmed10 authored May 15, 2020



* fix pad calc

* bert tf passes correctness

* formatting

* add test

* formatting

* remove comment

* add inline

* formatting

* fix order for literal

* formatting

* add test for gelu

* formatting

* added add_gelu fusion

* add files

* formatting

* remove layernorm opt

* revert reduce file

* add gelu_fn and tests

* formatting

* fix matcher, remove extra tests

* formatting

* fix matcher

* add used_once

* formatting

* start on new gelu

* formatting

* add matchers in fuse_ops

* formatting

* add dce to fix add_gelu

* add simplify_rsqrt and test

* formatting

* debugging value for matcher

* formatting

* add more to matchers

* formatting

* fix errors

* remove onnx gen

* add any_arg, change matchers to use either_arg

* formatting

* formatting

* add used_once

* formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

0079028a

26 Aug, 2019 1 commit
- Fix matcher bugs in either_arg · 03afd1ed
  Paul authored Aug 26, 2019
  
  03afd1ed
20 Aug, 2019 1 commit
- Fix tidy warnings · 43a96492
  Paul authored Aug 20, 2019
  
  43a96492
15 Aug, 2019 1 commit
- Update cppcheck version · e5f07268
  Paul authored Aug 15, 2019
  
  e5f07268
13 Aug, 2019 3 commits
- Fix tidy warnings · 4070855d
  Paul authored Aug 12, 2019
  
  4070855d
- Formatting · a6ea39ea
  Paul authored Aug 12, 2019
  
  a6ea39ea
- Make sure recursions are used only once · c46b5480
  Paul authored Aug 12, 2019
  
  c46b5480
10 Jul, 2019 1 commit
- Fix tidy issues · d55d8d24
  Paul authored Jul 10, 2019
  
  d55d8d24
09 Jul, 2019 1 commit
- Formatting · b6f9911f
  Paul authored Jul 09, 2019
  
  b6f9911f