Commits · 610f04178383d04cc94bc6f9b2bc1f270c4fd691 · gaoqiong / MIGraphX

26 Jun, 2022 2 commits
- Format · 610f0417
  Paul authored Jun 25, 2022
  
  610f0417
- Add layernorm post fusion · f3aa2c67
  Paul authored Jun 25, 2022
  
  f3aa2c67
25 Jun, 2022 1 commit
- Use jit for contiguous operator (#1217) · b75c83d8
  Paul Fultz II authored Jun 24, 2022
```
* Jit contiguous
```
  b75c83d8
22 Jun, 2022 1 commit
- Update license files (#1248) · e44cecbc
  Ted Themistokleous authored Jun 22, 2022
```
Updated each source file in the repo with the existing license.
```
  e44cecbc
02 Jun, 2022 1 commit
- Fix dangling reference with gemm add fusion (#1233) · 1339ba35
  Paul Fultz II authored Jun 01, 2022
  
  1339ba35
26 May, 2022 1 commit
- Upgrade to cppcheck 2.8 and fix new issues found (#1225) · a401e72a
  Paul Fultz II authored May 26, 2022
```
* Upgrade to cppcheck 2.8
```
  a401e72a
24 May, 2022 1 commit
- Fuse gemm add with pointwise fusions (#1213) · a500620e
  Paul Fultz II authored May 24, 2022
```
* Fuse gemm add with pointwise fusions
```
  a500620e
17 May, 2022 1 commit
- renamed variables for module from p to m (#1204) · a27dd28c
  shivadbhavsar authored May 17, 2022
```
Updated variable names according to #1193
```
  a27dd28c
11 May, 2022 1 commit

Prefuse layernorm for gpu (#1190) · 671f24be

Paul Fultz II authored May 11, 2022

Fuse layernorm and add a triadd_layernorm fusion. This is a preparatory change to improve performance.

671f24be

08 Feb, 2022 1 commit

Add missing output_alias to miopen_fusion op (#1076) · b304d97d

Paul Fultz II authored Feb 08, 2022

The missing output_alias caused incorrect memory coloring, which in turn caused the accuracy failures in the vision models when pointwise fusions were enabled. Resnet50, inceptionv3 and inceptionv4 now verify in the driver.

b304d97d

10 Jan, 2022 1 commit
- Handle miopen fusions when using pointwise fusions (#1019) · 534a05c1
  Paul Fultz II authored Jan 10, 2022
```
* Add matcher for conv_bias pointwise
* Add fusion op
```
  534a05c1
30 Nov, 2021 1 commit
- Fix fusable_conv whitespace bug (#1008) · 9270ebaf
  turneram authored Nov 30, 2021
```
Fix whitespace bug in fusable_conv matcher and add unit test
```
  9270ebaf
09 Nov, 2021 1 commit
- Failing fusion plan workaround (#995) · fb39e5e4
  turneram authored Nov 09, 2021
```
* Add workaround for devices that do not support miopen conv fusions
```
  fb39e5e4
08 Oct, 2021 1 commit

Remove alpha and beta from `dot` and `quant_dot` (#961) · 21193e87

Umang Yadav authored Oct 08, 2021

Previously, the dot operator was defined as `C = alpha * A . B + beta * C`, where `*` is scalar multiplication and `.` is the dot product or matrix multiplication, depending on the dimensions of the inputs.

The aim is to define the dot operator as `C = A . B`, without alpha or beta.

To achieve the same effect as alpha and beta: (1) one of the inputs to the dot operator is multiplied by the alpha value; (2) if beta is present, C is multiplied by beta and then added to the output of step 1.

21193e87
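
The two steps described in the commit above can be sketched in plain Python. This is only an illustration of the arithmetic, not MIGraphX code: the function names (`dot_with_attributes`, `dot_decomposed`, and the helpers) are hypothetical, and matrices are modeled as nested lists rather than MIGraphX arguments.

```python
def matmul(a, b):
    """Plain matrix product of two row-major nested lists (C = A . B)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def scale(m, s):
    """Multiply every element of matrix m by the scalar s."""
    return [[s * v for v in row] for row in m]


def add(m, n):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m, n)]


def dot_with_attributes(a, b, c, alpha=1.0, beta=0.0):
    """Old-style definition: alpha * (A . B) + beta * C in a single operator."""
    ab = matmul(a, b)
    return [[alpha * ab[i][j] + beta * c[i][j] for j in range(len(ab[0]))]
            for i in range(len(ab))]


def dot_decomposed(a, b, c=None, alpha=1.0, beta=0.0):
    """New-style: a plain dot, with alpha and beta applied as separate steps."""
    # Step 1: fold alpha into one of the inputs, then take the plain product.
    out = matmul(scale(a, alpha), b)
    # Step 2: only when a C input is present, scale it by beta and add it.
    if c is not None:
        out = add(out, scale(c, beta))
    return out


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    C = [[1, 1], [1, 1]]
    old = dot_with_attributes(A, B, C, alpha=2.0, beta=3.0)
    new = dot_decomposed(A, B, C, alpha=2.0, beta=3.0)
    print(old == new)
```

Scaling an input before the product is equivalent to scaling the product afterwards because matrix multiplication is linear in each argument, which is why the decomposed form gives the same result here.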

17 Sep, 2021 2 commits

Revert "Remove alpha and beta attributes from dot operator (#945)" (#957) · 985f58b0
Paul Fultz II authored Sep 17, 2021
```
This reverts commit 9e43cb8b.
```
985f58b0

Remove alpha and beta attributes from dot operator (#945) · 9e43cb8b

Umang Yadav authored Sep 17, 2021

This PR aims to remove the alpha and beta attributes from the dot operator completely.

Previously, the dot operator was defined as `C = alpha * A . B + beta * C`, where `*` is scalar multiplication and `.` is the dot product or matrix multiplication, depending on the dimensions of the inputs.

The aim is to define the dot operator as `C = A . B`, without alpha or beta.

9e43cb8b

09 Jun, 2021 1 commit

Asym pad refactor (#791) · 9a5e0c06

kahmed10 authored Jun 09, 2021



* alternative impl

* formatting

* add gpu pass to insert pad

* formatting

* update onnx test, still need cleanup

* formatting

* update tf_test

* modify existing tests

* formatting

* remove print

* code cleanup

* formatting

* code cleanup

* formatting

* fix tidy and cppcheck

* remove variable

* add test

* formatting

* add test and address comments

* formatting
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

9a5e0c06

25 Mar, 2021 1 commit

Add cpu fusion for gelu and layernorm (#761) · 728d083d

Paul Fultz II authored Mar 25, 2021



* Add eliminate_data_type pass

* Formatting

* Auto convert quant ops

* Formatting

* Flip the order of decompose

* Compute max size differently

* Formatting

* Clamp values in convert

* Formatting

* Fix loss of precision in reduce

* Formatting

* Fix bugs in reduction

* Fix accumulator type in reference softmax implementation

* Formatting

* Update convert test

* Remove unused variables

* Remove unnecessary quant_dot check

* Formatting

* Add tests

* Formatting

* Remove unused code

* Remove duplicate ops

* Remove blaze dependency

* Use set since shape::type_t is not hashable on gcc 5

* Formatting

* Add dnnl binary op

* Formatting

* Add binary and eltwise

* Formatting

* Add softmax

* Formatting

* Remove unused operators

* Add missing files

* Formatting

* Add lrn

* Formatting

* Add deconvolution

* Formatting

* Change allocate default

* Add reorder

* Formatting

* Add reductions

* Formatting

* Sort lines

* Change literals in another loop

* Add pow operator

* Formatting

* Add pow operator

* Formatting

* Make sure shapes are packed

* Allow broadcasted inputs

* Remove unused operators

* Simplify functions

* Remove softmax

* Add sub and erf functions

* Formatting

* Fix bug

* Formatting

* Improve parallelism

* Formatting

* Allow multiple batch dimensions

* Formatting

* Move literal transforms out of lowering

* Formatting

* Add gather operator

* Sort lines

* Add early exit for carry

* Formatting

* Add missing concat

* Rename macro

* Fix deep nesting

* Formatting

* Fix cppcheck issues

* Remove else

* Move attribute to typedef

* Formatting

* Disable maybe-uninitialized warning since it's broken on gcc

* Add constexpr default constructor

* Formatting

* Fix compiler warnings

* Fix adjust_allocation test

* Add layernorm matcher

* Add gelu_erf matcher

* Formatting

* Add gelu_tanh matcher

* Formatting

* Remove match namespace

* Formatting

* Use matcher instead of string

* Formatting

* Add fusions

* Formatting

* Make input a const ref

* Make this explicit for gcc 5
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

728d083d

08 Jan, 2021 1 commit

Revamp CI infrastructure (#706) · ceb4ca09

Paul Fultz II authored Jan 08, 2021



* Add build and test github workflow

* Fix cget command

* Remove def-requirements.txt

* Add tmate session to debug workflow

* Run tmate session after installing dependencies

* Print date periodically

* Add clang tidy action

* Separate build and run container in two different jobs

* Run bash script

* Remove interactive flag

* Try to mount the files

* Try to use the github workspace

* Without double braces

* Use env variable

* Pipe bash script in

* Run using hip-clang

* Use correct path

* Add verbose

* Remove j flag

* Only run for onnx file to debug

* Manually run clang-tidy

* Remove quiet flag

* Print header file

* Printout environment

* Remove extra defines

* Remove fixits and config flag

* Show ldd

* Add tmate session

* Run onnx protobuf first

* Generate proto for tensorflow

* Update cppcheck version

* Fix some cppcheck issues

* Add const

* Cppcheck fixes

* Formatting

* Fix more cppcheck issues

* Run two jobs

* Cache analysis and run format checking

* Fix yaml issues

* Fix yaml issues

* Fix indentation

* Switch to hip-clang for main docker file

* Use hip-clang in the readme

* Fixes for jenkins

* Use ccache to build

* Combine file

* Set restore keys

* Change stage name

* Build with ccache

* Add missing dependency for ccache

* Build debug with codecov

* Fix workflow syntax

* Fix list

* Use quotes

* Go to correct build path

* Install lcov

* Use sudo

* Echo all commands

* Setup tmate

* Add verbose output

* Build with cmake directly

* Add pthread flag

* Remove python config

* Continue on error

* Use on or off for cmake flag

* Use always upload cache

* Verbose output

* Verbose output from build

* Build one target

* Reduce debug symbols

* Increase garbage collection

* Remove dmesg

* Increase it to 20

* Update rocm cmake version

* Remove jobs from jenkins

* Run on all 3 ubuntus

* Remove gcc 5 jobs

* Dont add flag on 16.04

* Only upload coverage on 18.04

* Dont build for ubuntu 20.04

* Use matrix.os

* Use O2 for hip-clang since lower optimizations are broken

* Use rocm 3.0

* Pass ccache as cmake variable instead of env variable

* Build miopen from source

* Show ccache statistics

* Print log information

* Set compression level

* Use hash dir

* Set hashdir

* Install clang ocl from system

* Up compression level

* Add locale

* Increase cache size to 1G

* Lower compression level to 9

* Remove split dwarf

* Remove Og

* Add back Og

* Separate debug and codecov

* Add missing backslash

* Garbage collect more often

* Add missing locales package

* Use Os

* Install onednn in docker and run tests

* Include target headers in tests

* Increase timeout

* Remove if condition

* Make flag public

* Suppress memory leaks in onednn

* Use equal

* Add gh annotations

* Update rocm-cmake version

* Add ldconfig
Co-authored-by: Shucai Xiao <shucai@gmail.com>

ceb4ca09

06 Jan, 2021 1 commit

Module impl (#678) · c9b86f1c

Shucai Xiao authored Jan 06, 2021



* add an api get_main_module

* clang format

* modify onnx unit test for module

* clang format

* refactor ops unit test with the get_main_module

* clang format

* code backup

* clang format

* refine module c api

* add python api for module

* clang format

* fix a python api issue

* clang format

* fix cppcheck error

* clang format

* refine unit tests changes

* clang format

* code backup

* code backup

* clang format

* defer some changes to later PRs

* change return of get_main_module from ref to pointer

* clang format

* add unit tests for the get_main_module_api

* clang format

* fix cppcheck error

* clang format

* fix cppcheck error

* clang format

* add more unit tests for more code change coverage

* clang format

* fixed a unit test error

* clang format

* fix unit test

* clang format

* code backup

* code change for more code coverage

* change program to module in various passes and matcher

* clang format

* modify the pass API

* code backup

* code backup

* clang format

* code backup

* clang format

* Add option to not generate a destroy method

* Formatting

* fix some review comments

* clang format

* fix review comments

* clang format

* clang format

* code backup

* code backup

* clang format

* fix cppcheck errors

* clang format

* clang format

* fix build errors

* clang format

* modify gpu unit tests to using module

* clang format

* fix cppcheck error

* clang format

* Add flag to enable cpu backend

* Make buffers shared

* Enable optimizations

* Formatting

* fix review comments

* code backup

* clang format

* code backup

* clang format

* fix a bug related to a unit test

* clang format

* clang format

* fix a build error

* remove unnecessary code

* remove unnecessary files

* code backup

* clang format

* remove the compile function from the module class

* clang format

* clang format

* remove the context parameter from the from_value method of the module class

* code refinement

* clang format

* merge changes from develop branch

* clang format

* fix cppcheck error

* clang format

* fix a build error

* fixed a merge error

* fix cppcheck error

* fixed review comments

* clang format

* fix cppcheck error

* fix a cppcheck error

* fix cppcheck error

* fix build error caused by merge

* Add missing has_op function

* Formatting

* merge changes from develop branch

* fix a cppcheck error

* fixed some review comments

* clang format

* remove the begin/end function of the program class

* clang format

* refine code and fix cppcheck error

* clang format

* fix review comments

* clang format

* fix review comments

* clang format

* add unit tests for more code coverage

* clang format

* fix review comments

* clang format

* fix review comments

* clang format

* fix a build error in debug mode

* clang format
Co-authored-by: Paul <pfultz2@yahoo.com>

c9b86f1c

26 Nov, 2020 1 commit

Gelu fp16 (#674) · e09d54fe

kahmed10 authored Nov 25, 2020



* initial testing

* change tolerance

* remove extra changes
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

e09d54fe

20 Nov, 2020 1 commit

Fuse skip layernorm (#683) · 1bfb147d

Paul Fultz II authored Nov 20, 2020



* Unify the vectorized and non-vectorized path

* Formatting

* Make fusion easily extendable

* Add skip layernorm fusion

* Formatting

* Call correct layernorm function

* Fix compile errors

* Add DCE

* Add test for skip layernorm

* Formatting

* Remove unused typedef

* Formatting

* Fix tidy issues

* Formatting
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>

1bfb147d

11 Nov, 2020 1 commit

Refactor program to module (#684) · 2466dd6f

Shucai Xiao authored Nov 11, 2020



* code backup

* clang format

* change corresponding tool files

* clang format
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

2466dd6f

28 Oct, 2020 1 commit

Fix bert fusions (#666) · 2ea40daa

Paul Fultz II authored Oct 28, 2020



* Fix fusions in bert model

* Formatting

* Add unit tests

* Formatting

* Fix one_half matcher

* Workaround ICE on gcc

* Formatting

* Tidy fixes
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

2ea40daa

08 Oct, 2020 1 commit

Add build flag for fast math (#639) · a5065265

kahmed10 authored Oct 08, 2020



* add flag

* formatting

* remove env variable

* fix api expression

* add api test

* add api test

* add op test

* formatting

* fix function name

* fix syntax

* formatting

* modify test

* remove test and update doc

* move test to new file

* formatting

* revert test files

* rewrite check

* New
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

a5065265

14 Sep, 2020 1 commit

Some perf improvements to bert (#627) · 9f283810

Paul Fultz II authored Sep 14, 2020



* Fuse gemm in fuse ops

* Formatting

* Add const ref

* Remove assert

* Skip already fused gemms

* Skip already fused gemm

* Formatting

* Use float_equal

* Avoid non-standard shapes for inputs

* Formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

9f283810

25 Aug, 2020 1 commit

Improve layernorm performance (#613) · 56b3bf58

Paul Fultz II authored Aug 25, 2020

* Use increment instead of division to compute register offset

* Formatting

* Limit layernorm to 1024 elements

* Formatting

* Add verification to driver

* Formatting

* Remove early return

* Use block_size 256

* Vectorize the kernel

* Formatting

* Convert to vector type

* Add layernorm tests

* Formatting

* Formatting

* Refactor layernorm to run both algos

* Formatting

* Fix compile error

* Fix tidy warnings

* Formatting

* Add layernorm function

* Formatting

56b3bf58

21 Aug, 2020 1 commit

rename hip to gpu (#610) · 1ca3c133

kahmed10 authored Aug 21, 2020


Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

1ca3c133

19 Aug, 2020 1 commit

move init miopen fusion operator to finalize method (#606) · 453517ad

Shucai Xiao authored Aug 19, 2020

* move initialization of miopen fusion operators to finalize method

* clang format

* fix cppcheck error

* clang format

* fix review comments

* clang format

* removed an unnecessary assert

453517ad

18 Aug, 2020 1 commit

Paul Fultz II authored Aug 18, 2020

* Register ops for main migraphx

* Formatting

* Register cpu ops

* Formatting

* Show list of operators in the driver

* Formatting

* Simplify register

* Try to register gpu ops

* Fix compiler errors

* Register rest of the gpu operators

* Add some tests

* Formatting

* Fix gcc compiler warnings

* Formatting

* Fix tidy warnings

* Fix compile error

* Use correct op name

* Register layer norm

* Use const ref

* Make run const

e8be8548

14 Aug, 2020 1 commit

Layernorm onnx support (#599) · 2c5d5fee

kahmed10 authored Aug 14, 2020



* fix pad calc

* bert tf passes correctness

* formatting

* add test

* formatting

* remove comment

* add inline

* formatting

* fix order for literal

* formatting

* test no mul_add

* formatting

* debug layernorm

* debug layernorm

* manual merge

* more progress

* formatting

* remove miopen batchnorm

* remove headers

* Fix compile error with no dpp reductions

* fix indices

* formatting

* change matcher

* formatting

* remove binds

* formatting

* disable tf matcher

* formatting

* use fast div

* formatting

* fix matcher

* formatting

* remove comment

* move find_matches

* add assert

* formatting

* fix deepcode issue
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

2c5d5fee

12 Aug, 2020 1 commit

Add dimensionality reduction to device functions (#587) · 63563da2

Paul Fultz II authored Aug 12, 2020



* Add reduce dims

* Formatting

* Reduce dims on the gpu

* Formatting

* Fix tidy issues

* Convert to assert

* Reduce dims for contiguous

* Formatting

* Remove move

* Fix arguments used

* Formatting

* Fix warnings

* Formatting
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>

63563da2

08 Jun, 2020 1 commit

Enable read support for n-dimensional ops (#537) · cb722cf9

kahmed10 authored Jun 08, 2020



* initial progress

* formatting

* add pooling changes

* formatting

* change eliminate_pad

* formatting

* rename var

* formatting

* update op shape test and compute

* formatting

* revert conv constructor

* formatting

* change initializer

* formatting

* fix tidy

* change quant conv and shape check

* add tests and fixes

* formatting

* fix type

* fix conv test

* formatting

* add pooling and bn tests

* formatting

* add inconsistent attr tests

* fix padding issue

* formatting

* fix review comments, remove duplicate test

* formatting

* fix variable

* fix assert bug

* fix attr check

* remove std
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

cb722cf9

03 Jun, 2020 1 commit

Bert fuse slice reshape trans contiguous (#542) · 93be5e2b

Shucai Xiao authored Jun 03, 2020



* fix pad calc

* Add decompose pass

* Add decompose test

* Formatting

* bert tf passes correctness

* formatting

* Add remap

* Formatting

* add test

* formatting

* remove comment

* Add compute method for dot

* Formatting

* add inline

* Add finder for horizontal fusion

* Formatting

* Formatting

* Reuse predicate

* formatting

* fix order for literal

* formatting

* add test for gelu

* formatting

* added add_gelu fusion

* Add gemm fusions

* Formatting

* add files

* formatting

* test no mul_add

* formatting

* progress on div

* formatting

* continue work on pass

* remove layernorm opt

* revert reduce file

* Add some fixes for convolution

* Formatting

* Fix shape tests

* Formatting

* Reuse axis equal

* Add initial split fusion

* Formatting

* Update offset

* Workaround outputs that cant accept nonstandard shapes

* Formatting

* Add check for split concat

* Formatting

* Add missing headers

* Formatting

* Add tests

* Formatting

* add optimization for bert

* code backup for bert optimization

* continue testing

* formatting

* fix matcher

* formatting

* add gelu_fn and tests

* formatting

* fix matcher, remove extra tests

* formatting

* fix matcher

* add missing files

* add find_layernorm

* add add_transpose to cmake file

* code backup for the contiguous fusion

* refine ops fusion

* clang format

* fixed bug in previous optimization

* clang format

* add more optimization

* remove unnecessary code

* refinement of the fusion code

* clang format

* fixed a bug

* add used_once

* formatting

* start on new gelu

* formatting

* add matchers in fuse_ops

* formatting

* add dce to fix add_gelu

* add simplify_rsqrt and test

* formatting

* debugging value for matcher

* formatting

* add more to matchers

* formatting

* fix errors

* remove onnx gen

* add any_arg, change matchers to use either_arg

* formatting

* clang format

* formatting

* add used_once

* formatting

* code cleanup

* clang format

* fixed a bug

* remove unnecessary code

* refine comments

* optimize bert to remove more contiguous

* clang format

* remove unnecessary code

* add unit tests for bert optimization

* clang format

* fix review comments

* clang format

* refine a fusion of reshape and slice

* clang format

* fix cppcheck error

* fix review comments

* add the fusion of slice and transpose

* clang format

* add another optimization to fuse slice and transpose

* clang format

* fix review comments

* clang format

* fix review comments

* clang format

* fix review comments
Co-authored-by: Khalique <15948690+kahmed10@users.noreply.github.com>
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
Co-authored-by: Shucai Xiao <scxiao@prj47-rack-99.local.lan>

93be5e2b

15 May, 2020 1 commit

Add gelu optimization (#521) · 0079028a

kahmed10 authored May 15, 2020



* fix pad calc

* bert tf passes correctness

* formatting

* add test

* formatting

* remove comment

* add inline

* formatting

* fix order for literal

* formatting

* add test for gelu

* formatting

* added add_gelu fusion

* add files

* formatting

* remove layernorm opt

* revert reduce file

* add gelu_fn and tests

* formatting

* fix matcher, remove extra tests

* formatting

* fix matcher

* add used_once

* formatting

* start on new gelu

* formatting

* add matchers in fuse_ops

* formatting

* add dce to fix add_gelu

* add simplify_rsqrt and test

* formatting

* debugging value for matcher

* formatting

* add more to matchers

* formatting

* fix errors

* remove onnx gen

* add any_arg, change matchers to use either_arg

* formatting

* formatting

* add used_once

* formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

0079028a

08 May, 2020 1 commit

Horizontal fusions of gemms and convolutions (#472) · 1a4ff504

Paul Fultz II authored May 08, 2020



* Add decompose pass

* Add decompose test

* Formatting

* Add remap

* Formatting

* Add compute method for dot

* Formatting

* Add finder for horizontal fusion

* Formatting

* Formatting

* Reuse predicate

* Add gemm fusions

* Formatting

* Add some fixes for convolution

* Formatting

* Fix shape tests

* Formatting

* Reuse axis equal

* Add initial split fusion

* Formatting

* Update offset

* Workaround outputs that cant accept nonstandard shapes

* Formatting

* Add check for split concat

* Formatting

* Add missing headers

* Formatting

* Add tests

* Formatting

* Add more testing

* Formatting

* Fix when there is duplicate splits in inputs

* Formatting

* Fix mismatch iterators

* Add tests for dot fusions

* Formatting

* Add test for convolution

* Formatting

* Fix tidy issues

* Add more tests

* Formatting

* Ignore build directory for codecov

* Add test for groups

* Formatting

* Add more tests for groups

* Formatting

* Add test for missing end slice

* Add newline

* Remove unused function

* Add support for when beta is not 1

* Formatting

* Add test for scalar

* Add one more scalar test
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

1a4ff504

29 Mar, 2020 1 commit

Clip update for onnx (#455) · 0325c1a4

kahmed10 authored Mar 29, 2020



* fix pad calc

* modify clip for more args

* formatting

* add test, flip order, revert to unary

* fix error msg

* add min and max args to clip

* formatting

* fixes to quantization

* formatting

* fix logic and add extra test

* formatting

* fix logic, add extra test

* formatting

* fix bug in test
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

0325c1a4

20 Dec, 2019 1 commit

Improve operators for onnxruntime (#405) · 992666e6

Shucai Xiao authored Dec 20, 2019



* improve unsqueeze to support negative axis and parsing scalar

* clang format

* add a test example for the negative axis of unsqueeze

* improve the squeeze operator to support negative axis

* clang format

* fixed a small bug in the lrn implementation

* clang format

* support negative axis in argmax and argmin

* clang format

* improve flatten to support negative axis

* clang format

* change softmax/logsoftmax to support negative axis

* clang format

* improve transpose by adding default perm

* clang format

* add one more dimens for tensor size

* add one more dimens for tensor size

* disable conv ops fusion for non-symmetric cases

* clang format

* fixed review comments

* move computing axis from the device function to the compute function

* clang format

* move computing axis from device function to the operator computing function

* clang format
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

992666e6

09 Oct, 2019 1 commit

Fix bug in bert accuracy (#385) · a797f890

Paul Fultz II authored Oct 09, 2019

* Fix bug in bert accuracy

* Formatting

* add another test

* Fix add and overflow

* Formatting

* Fix bug in shape_for_each

* Use front instead of iterator

* Use result.front()

* Split add_unary files

* Formatting

* Fix incorrect last index

* Remove comment

* Inline function

* Fix carry check

* Fix metadata errors

* Formatting

* Reflow

* Reflow

a797f890

04 Oct, 2019 1 commit

Add_clip fusion (#370) · 1398bcc1

kahmed10 authored Oct 04, 2019

* initial testing of add_clip fusion

* formatting

* clipped relu fusion

* formatting

* remove some executables, add fusion test

* formatting

* remove clipped_relu code

* fix clang-tidy

* revert changes to cmake files

* remove fusion from weight map

* formatting

* fix syntax error

* formatting

* fix syntax error

* fix syntax error

* formatting

1398bcc1