Commits · 9e43cb8b772219de3a47d9ab9e4ed3cbcccc11bc · gaoqiong / MIGraphX

17 Sep, 2021 1 commit

Remove alpha and beta attributes from dot operator (#945) · 9e43cb8b

Umang Yadav authored Sep 17, 2021

This PR removes the alpha and beta attributes from the dot operator completely.

Previously, the dot operator was defined as C = alpha * A . B + beta * C, where * is scalar multiplication and . is the dot product or matrix multiplication, depending on the dimensions of the inputs.

The aim is to define the dot operator as C = A . B, without alpha or beta.

To achieve the same effect as alpha and beta: (1) one of the inputs to the dot operator is multiplied by the alpha value; (2) if beta is present, C is multiplied by beta and then added to the output of step 1.
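The equivalence between the old and new definitions can be checked with a small sketch. This is plain Python on nested lists, not MIGraphX code; the helper names (`matmul`, `scale`, `add`, `old_dot`, `new_dot`) are hypothetical and exist only for illustration.

```python
# Sketch only: plain-Python matrices, not the MIGraphX API.

def matmul(a, b):
    """Plain matrix product A . B, with no scaling attributes."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def scale(m, s):
    """Multiply every element of matrix m by the scalar s."""
    return [[s * v for v in row] for row in m]

def add(m, n):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m, n)]

def old_dot(a, b, c, alpha, beta):
    """Former definition: alpha * (A . B) + beta * C in a single operator."""
    prod = matmul(a, b)
    return [[alpha * p + beta * q for p, q in zip(pr, cr)] for pr, cr in zip(prod, c)]

def new_dot(a, b, c, alpha, beta):
    """Same result from a plain dot plus explicit multiply and add steps."""
    out = matmul(scale(a, alpha), b)   # step 1: fold alpha into one input
    return add(out, scale(c, beta))    # step 2: add beta * C to the result

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[1.0, 1.0], [1.0, 1.0]]
```

With `alpha = 2.0` and `beta = 0.5`, both functions give `[[38.5, 44.5], [86.5, 100.5]]` for these inputs.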

9e43cb8b

16 Sep, 2021 1 commit

Loop operator (#853) · a275f590

Shucai Xiao authored Sep 16, 2021

Add Loop operator for opset version 13.
Notes: 1) The default maximum iteration count is 10 if none is provided.
2) To change the maximum iteration count, a user can set max_loop_iterations in the onnx_option struct when parsing a model.
3) The returned shape of the scan output is derived from max_loop_iterations, even when the actual number of loop iterations is smaller. This issue also applies to other operators such as NonZero and NonMaxSuppression. Issue #948 was created to track this, to be resolved later.
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
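Note 3 can be pictured with a small simulation. This is a sketch in plain Python, not the MIGraphX implementation: the function `run_loop` is hypothetical, and padding unused scan slots with zeros and capping the trip count at the maximum are assumptions made only for illustration.

```python
# Sketch only: simulates a loop whose scan buffer is sized ahead of time.

def run_loop(trip_count, body, state, max_loop_iterations=10):
    """Run body up to trip_count times; the scan output is sized by the maximum."""
    scan = [0] * max_loop_iterations          # shape fixed before execution (zero fill assumed)
    iterations = min(trip_count, max_loop_iterations)
    for i in range(iterations):
        state = body(i, state)                # update the loop-carried value
        scan[i] = state                       # record one scan element per iteration
    return state, scan, iterations

# Three iterations of a running sum of 1, 2, 3.
final, scan, n = run_loop(3, lambda i, s: s + i + 1, 0)
```

Here only three iterations run, yet `scan` still has ten entries, which mirrors the shape issue described in the notes.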

a275f590

02 Sep, 2021 2 commits

Refactor where op (#918) · ebbaf8fc

turneram authored Sep 02, 2021

Implement the Where operator for the CPU and GPU for better performance.

ebbaf8fc

Topk op (#877) · 521b57a2

Shucai Xiao authored Sep 01, 2021



* add topk operator for ref, cpu and gpu
* Hash modules for quicker lookup of modules
* add onnx unit test
* add unit tests for the topk operator
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

521b57a2

24 Aug, 2021 1 commit

Change attributes names to be more consistent and reflect better meaning (#916) · 0d2606bb

Umang Yadav authored Aug 24, 2021

* rename broadcast and multibroadcast output_lens attribute to out_lens attribute, and change tests and source code to reflect the same

* change the reshape attribute from dims to out_lens

* change transpose attribute's name from dims to perm to reflect better meaning

* use permutation instead of perm for transpose

clang formatting

* use dims instead of out_lens for reshape

clang formatting
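Reading the bullets above in order, the net effect appears to be `out_lens` for broadcast and multibroadcast, `dims` kept for reshape, and `permutation` for transpose. The sketch below is plain Python, not MIGraphX code; `transpose_lens` is a hypothetical helper, and it assumes the common convention that output axis i takes its size from input axis `permutation[i]`.

```python
# Sketch only: illustrates what a transpose "permutation" attribute means for shapes.

# Net attribute names as read from the bullet list (old name -> final name).
ATTRIBUTE_RENAMES = {
    ("broadcast", "output_lens"): "out_lens",
    ("multibroadcast", "output_lens"): "out_lens",
    ("transpose", "dims"): "permutation",
}

def transpose_lens(lens, permutation):
    """Output dimension i takes its size from input dimension permutation[i]."""
    if sorted(permutation) != list(range(len(lens))):
        raise ValueError("permutation must reorder every axis exactly once")
    return [lens[p] for p in permutation]

# Swap the last two axes of a 2 x 3 x 4 tensor.
result = transpose_lens([2, 3, 4], [0, 2, 1])
```

The name `permutation` describes a reordering of axes, whereas `dims` could be mistaken for the output dimensions themselves.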

0d2606bb

18 Aug, 2021 1 commit

Fix error: namespace "std" has no member "cout" (#911) · 4e3b2e3c

turneram authored Aug 18, 2021

Co-authored-by: Chris Austen <causten@users.noreply.github.com>

4e3b2e3c

09 Aug, 2021 1 commit

check for divisor encodable or not, fallback if needed (#906) · a8d86615

Cagri Eryilmaz authored Aug 09, 2021

* check for divisor encodable or not, fallback if needed

* verify test for retinaface case

a8d86615

15 Jul, 2021 1 commit

Quantize linear ops (#843) · 3282e01a

turneram authored Jul 15, 2021

* Add operators, refactor parsers, add rewrite passes, add tests

* Formatting

* Fix cppcheck

* Review comments

* Formatting

* Combine rewrite passes

* Formatting

* Add ref implementations

* Formatting

* Review comments

* Formatting

* Tidy warnings

* Apply review comments

* Formatting

* Fix CI error

* Formatting

* Increase code coverage

* Formatting

* Move broadcasting of scales and zero points to onnx parser

* Formatting

* Allow for x and zero_point to have different types in quantizelinear; fix zero_point default type

* Formatting

* Increase code coverage

* Formatting

* Switch certain variables to int64_t

* Formatting

* Fix overflow in implicit constant conversion

* Formatting

* Increase code coverage

* Formatting

* Remove operators.hpp from includes in tf_test.cpp

* Formatting

* Add conversion for int32 input to quantizelinear and add test case; remove operators.hpp from onnx_test.cpp includes

* Formatting

* Switch dequantizelinear math from int32 to float

* Formatting

* Remove changes to operators.hpp

* Simplify apply_quantizelinear

* Formatting

* Add verify test for int32 data

* Add rewrite_quantization back to CMakeLists
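For orientation, the arithmetic behind quantizelinear and dequantizelinear can be sketched as follows. This follows the ONNX operator definitions rather than the MIGraphX source; the function signatures and the default uint8 range (0 to 255) are assumptions made for illustration.

```python
# Sketch only: ONNX-style quantize/dequantize arithmetic on Python scalars.

def quantizelinear(x, scale, zero_point=0, qmin=0, qmax=255):
    """y = saturate(round(x / scale) + zero_point), rounding halves to even."""
    q = round(x / scale) + zero_point     # Python's round() rounds halves to even
    return max(qmin, min(qmax, q))        # saturate to the target integer range

def dequantizelinear(q, scale, zero_point=0):
    """x = (q - zero_point) * scale, computed in floating point."""
    return float(q - zero_point) * scale

q = quantizelinear(1.0, 0.5, zero_point=10)
x = dequantizelinear(q, 0.5, zero_point=10)
```

Doing the dequantize math in floating point, as one of the bullets above describes, avoids truncating the product with a fractional scale.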

3282e01a

13 Jul, 2021 1 commit

Fix compile errors with ubuntu 20.04 (#880) · 59a2954a

Paul Fultz II authored Jul 13, 2021

* Add build for ubuntu 20.04

* Fix ambiguous overload resolution with stream

* Fix warning

* Capture by value

* Format

59a2954a

08 Jul, 2021 2 commits

Add inclusive scan on the GPU (#872) · 6ba279cc

Paul Fultz II authored Jul 08, 2021



* Add initial scan operator

* Formatting

* Fix with a working test

* Fix bugs

* Formatting

* Formatting

* Simplify

* Formatting

* Use non-power of 2 for test

* Make pointer
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
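As a reference for what an inclusive scan computes, here is a sequential sketch in plain Python. It is not the GPU kernel from this commit; `inclusive_scan` is a hypothetical helper. The input length is deliberately not a power of two, echoing the test bullet above.

```python
# Sketch only: sequential reference for an inclusive scan (prefix reduction).

def inclusive_scan(values, op=lambda a, b: a + b):
    """Element i of the output combines values[0..i], including values[i] itself."""
    out = []
    for v in values:
        out.append(v if not out else op(out[-1], v))
    return out

# Five elements: a length that is not a power of two.
prefix_sums = inclusive_scan([1, 2, 3, 4, 5])
```

Parallel scans often split the input into power-of-two sized blocks, so a non-power-of-two length is a useful case for exercising the leftover elements.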

6ba279cc

Preallocate parameters on the CPU and unify preallocations (#840) · 427fc25c

Paul Fultz II authored Jul 08, 2021



* Add preallocate method

* Add preallocate_param pass

* Preallocate buffers on the cpu

* Formatting

* Preallocate on the gpu

* Add missing cpp file

* Formatting

* Add lifetime function

* Formatting

* Always allocate

* Fix tidy warning

* Add const

* Add missing lifetime annotations
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

427fc25c

06 Jul, 2021 1 commit

Update test driver to continue executing after exceptions and other failures (#868) · f60c3815

Paul Fultz II authored Jul 06, 2021



* Improve handling of exceptions in test driver

* Formatting

* Auto print exception

* Formatting

* Fork each test case

* Formatting

* Exclude gcc 5 debug build

* Fix tidy issues

* Add color

* Formatting

* Create driver class

* Formatting

* Customize test_case names

* Formatting

* Report status from forked processes

* Formatting

* Update the verify driver

* Formatting

* Print out failed tests

* Formatting

* Fix tidy issues

* Formatting

* Expect passing

* Improve failure reporting on non-linux systems

* Fix ifdef

* Flush code cov

* Formatting

* Fix tidy

* Check if weak symbols is linked

* Formatting

* Add continue flag

* Formatting

* Set exe name

* Use stringstream and use quotes
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

f60c3815

01 Jul, 2021 1 commit

Step op for parse_slice.cpp (#858) · 85d93d23

Cagri Eryilmaz authored Jul 01, 2021



* slice+step+reverse: parsing + onnxfile + test

* cleanup

* updates for step operator, abs axis

* test updates

* updates to tests

* slice+reverse verify test

* verify test for slice+step+reverse

* clang format

* reverse with lens

* simplify normalize_compute_shape

* step op: normalization for axes

* clang format

* change of order: fixing wrong functionality in some cases

* test for slice,reverse,step

* clang format
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
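The "change of order" bullet above suggests that the order in which reverse and step are applied matters. A small sketch in plain Python (list slicing, not MIGraphX operators; the helper names are hypothetical) shows why the two orders differ and which one matches a negative stride:

```python
# Sketch only: Python list slicing stands in for the reverse and step operators.

def reverse(xs):
    """Reverse the order of the elements."""
    return xs[::-1]

def step(xs, stride):
    """Keep every stride-th element, starting from the first."""
    return xs[::stride]

data = [0, 1, 2, 3, 4, 5]
reverse_then_step = step(reverse(data), 2)   # reverse first, then take every 2nd element
step_then_reverse = reverse(step(data, 2))   # take every 2nd element first, then reverse
negative_stride = data[::-2]                 # what a slice with step -2 selects
```

For this input, only the reverse-then-step order reproduces the elements selected by a stride of -2.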

85d93d23

27 Jun, 2021 1 commit

Inline subgraph (#802) · bc52a8a8

Shucai Xiao authored Jun 27, 2021



* Add definitions for all pointwise operators

* Formatting

* Add cpp generator class

* Formatting

* Move compilation to core

* Formatting

* Add clock to tmp name

* Add dynamic loader

* Formatting

* Add tests for code gen

* Formatting

* Add test for literals

* Formatting

* Use with_char

* Add missing header

* Fix mismerge

* Ignore tidy warning

* Fix gcc 5 errors

* Apply fixits

* Skip signed bitwise of status

* Remove unused parameters

* Explicitly add c++14 flag

* Fix tidy warning

* unify the compute function signature

* clang format

* make another change

* unify the compute function

* clang format

* remove unnecessary code

* more refinement about the operator compute function

* clang format

* add an overload function

* clang format

* add support for axes inputs for squeeze/unsqueeze/reduce_sum

* clang format

* fix build problems

* backup code changes

* clang format

* Add tuple type to shape class

* Formatting

* fix a bug in parsing quantizelinear operator

* clang format

* fix a cppcheck error

* disable different versions of unit tests for different onnx version

* clang format

* upgrade onnx to 1.8

* update onnx to 1.8.1

* disable two more real models

* clang format

* Make data member private

* Formatting

* Add sub arguments

* Formatting

* Turn clang format off

* Disable clang-format

* fix review comments

* fix the function of assign axes in parsing the squeeze operator

* add unit tests and fix a bug

* clang format

* fix review comments

* clang format

* fix a build error

* backup code changes

* clang format

* add more unit tests and add parsing opset version

* clang format

* Improve visiting tuples

* Formatting

* fix cppcheck error

* adding installing the onnx package

* resolve no protobuf compiler

* add an inline subgraph pass

* clang format

* Add more argument tests

* Formatting

* Handle tuple in load

* Formatting

* code backup

* clang format

* Remove .o files

* Add tuple type to api

* Formatting

* fix build errors

* clang format

* code backup

* code backup

* add unit tests for the inline subgraph

* clang format

* refine the inline subgraph and parse if operator

* clang format

* fix cppcheck issue

* clang format

* add unit test for inline subgraph pass

* clang format

* fix format issue

* remove the context from the if operator

* clang format

* simplify the compute functions

* Fix tidy warnings

* fix cppcheck error

* clang format

* fix cppcheck error

* Fix tidy warnings

* fix a cppcheck error

* clang format

* Add a test for share method

* Formatting

* Add a test cpp_type

* add unit tests for more code coverage

* clang format

* add unit tests to have more code coverage

* clang format

* try a comment in jenkins build

* include the install onnx line

* code backup

* reorder the dependencies installed

* refine dockerfile

* fix review comments

* clang format

* remove unnecessary overload function

* fix cppcheck error

* change back the argument test

* Suppress tidy warning

* add the operator get_tuple_elem

* clang format

* add get_tuple_elem to operator include file

* change if to support multiple operation outputs

* clang format

* optimize inline subgraph

* clang format

* code backup

* clang format

* fix bug

* refine unit tests for tuple output of the if operator

* clang format

* refine an instruction replacement code

* add a unit test and sort all the unit tests alphabetically

* fix cppcheck error

* add more unit tests for multiple op outputs

* clang format

* fix cppcheck error

* Update pass manager to get modules after every pass

* more unit test to cover more scenarios

* clang format

* fixed a bug in a unit test

* add more tests

* clang format

* add more unit tests to have more code coverage

* fix a bug in a unit test

* Add program overload for module

* Formatting

* Hash modules for quicker lookup of modules

* Bump file version

* Add methods to remove modules

* Formatting

* add the tuple type to the support list

* Eliminate unused modules

* Formatting

* Fix test errors

* Formatting

* Fix tidy issues

* fix problem related to inline subgraph

* clang format

* fix review comments

* fix review comments

* fix review comments

* fix review comments

* clang format

* fix a unit test

* one more code change

* remove an optimization related to the if operator

* clang format

* fix review comments
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

bc52a8a8

25 Jun, 2021 3 commits

refine unit tests · dd651742

Shucai Xiao authored Jun 25, 2021

dd651742

clang format · 814793ec

Shucai Xiao authored Jun 25, 2021

814793ec

add unit tests · c31c616c

Shucai Xiao authored Jun 25, 2021

c31c616c

15 Jun, 2021 1 commit

Int8 gemm support (#811) · 39bc6161

Shucai Xiao authored Jun 15, 2021



* add a flag to indicate int8x4 input format

* clang format

* code backup

* clang format

* code backup

* clang format

* code backup

* clang format

* code backup

* clang format

* code backup

* clang format

* remove log info

* remove unnecessary changes

* fix cppcheck error

* add unit tests to have more code coverage

* clang format

* add debug info

* remove log info

* fix cppcheck error

* clang format

* clang format

* add one more unit tests for more scenarios

* fix cppcheck error

* clang format

* fix review comments

* clang format

* rename p to m

* fix review comments

* refine unit tests

* clang format

* refine unit tests and fixed a bug

* clang format

* fix build error related to rocm4.2

* fix a bug related to alpha and beta

* refine two unit tests related to int8_gemm

* fix cppcheck error

* refine unit test to pass on mi100

* add unit test for packing int8 args

* clang format

* change unit tests back

* disable some unit tests for gpu

* clang format

* refine unit tests to run on mi100

* clang format

* refine unit tests

* refine unit tests

* clang format

* change back a unit test
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

39bc6161

08 Jun, 2021 1 commit

Reverse Op (#846) · 9c54fc4f

Cagri Eryilmaz authored Jun 08, 2021



* init reverseOp branch: ref op + ref test. WIP

* first passing basic test

* cleanup

* additional axis implementation

* additional test

* ref op implementation vec to int for axis

* ref op test change for axis

* initial gpu files and test

* updates to implementation and test

* fixed some issues

* clang format

* cleanup

* formatting

* removing comments

* remove local size, back to default

* update tests: replace with std functions

* multiple axis for reverse op

* fix a build error

* clang format

* more tests

* fix a bug for the reverse device function

* clang format

* fix a bug

* clang format

* ref test updates, multiaxis

* formatting
Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

9c54fc4f

26 May, 2021 1 commit

Step op (#839) · 04065c64

Shucai Xiao authored May 26, 2021



* add the operator step

* clang format

* add unit tests

* clang format

* add more unit test for step op

* clang format

* add more unit tests

* clang format

* fix review comments

* clang format

* rename two unit tests
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

04065c64

03 May, 2021 1 commit

Reduce types generated for hip kernels (#814) · 3becd974

Paul Fultz II authored May 03, 2021



* Remove unused data types

* Formatting

* Reduce types generated for hip kernels

* Formatting

* Fix onnx tests

* Formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

3becd974

22 Apr, 2021 1 commit

Cpu fusions using post_ops (#781) · f7befe50

Paul Fultz II authored Apr 22, 2021



* Add eliminate_data_type pass

* Formatting

* Auto convert quant ops

* Formatting

* Flip the order of decompose

* Compute max size differently

* Formatting

* Clamp values in convert

* Formatting

* Fix loss of precision in reduce

* Formatting

* Fix bugs in reduction

* Fix accumulator type in reference softmax implementation

* Formatting

* Update convert test

* Remove unused variables

* Remove unnecessary quant_dot check

* Formatting

* Add tests

* Formatting

* Remove unused code

* Remove duplicate ops

* Remove blaze dependency

* Use set since shape::type_t is not hashable on gcc 5

* Formatting

* Add dnnl binary op

* Formatting

* Add binary and eltwise

* Formatting

* Add softmax

* Formatting

* Remove unused operators

* Add missing files

* Formatting

* Add lrn

* Formatting

* Add deconvolution

* Formatting

* Change allocate default

* Add reorder

* Formatting

* Add reductions

* Formatting

* Sort lines

* Change literals in another loop

* Add pow operator

* Formatting

* Add pow operator

* Formatting

* Make sure shapes are packed

* Allow broadcasted inputs

* Remove unused operators

* Simplify functions

* Remove softmax

* Add sub and erf functions

* Formatting

* Fix bug

* Formatting

* Improve parallelism

* Formatting

* Allow multiple batch dimensions

* Formatting

* Move literal transforms out of lowering

* Formatting

* Add gather operator

* Sort lines

* Add early exit for carry

* Formatting

* Add missing concat

* Rename macro

* Fix deep nesting

* Formatting

* Fix cppcheck issues

* Remove else

* Move attribute to typedef

* Formatting

* Disable maybe-uninitialized warning since it's broken on gcc

* Add constexpr default constructor

* Formatting

* Fix compiler warnings

* Fix adjust_allocation test

* Add layernorm matcher

* Add gelu_erf matcher

* Formatting

* Add gelu_tanh matcher

* Formatting

* Remove match namespace

* Formatting

* Use matcher instead of string

* Formatting

* Add fusions

* Formatting

* Add post op field

* Formatting

* Make post_ops serializable

* Formatting

* Add eltwise fusions

* Formatting

* Fix null conversions

* Formatting

* Add fuse_ops source files

* Formatting

* Set binary post op index correctly

* Formatting

* Fix serialization bugs

* Check if used once

* Formatting

* Fix error in get_primitive_attr

* Formatting

* Add compile function

* Formatting

* Limit fusions

* Formatting

* Disable with env variable instead of using compile arg

* Formatting

* Fix implicit conversion to bool

* Declare on separate lines

* Formatting

* Fix cppcheck issues

* Fix ICE in pack_join

* Formatting

* Use const ref

* Make enum hashable

* Formatting

* Add explicit this

* Fix merge issues

* Fix dangling ref

* Formatting

* Add test for compile

* Formatting

* Add more value tests

* Formatting
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

f7befe50

05 Apr, 2021 1 commit

Module build exec (#765) · 41c0487b

Shucai Xiao authored Apr 05, 2021



* code cleanup

* clang format

* backup code

* clang format

* remove unnecessary code

* clang format

* add module print function

* code backup

* refine the module::print function

* refine the module::to_value() function

* code backup

* backup code changes

* code backup

* remove to_value and from_value function from the module class

* rename a function

* rename the if operator

* refine the if operator

* refine the print function of module and program

* code backup

* code backup

* fix a build warning

* fix overload of compute_shape function

* code backup

* fix unit test error

* fix cppcheck error

* fix the issue related to the overload of compute_shape

* fix review comments

* fix cppcheck error

* change the return name of if_op to be if

* clang format

* fix two unit tests

* clang format

* rename variables

* clang format

* remove the unused compute_op function

* clang format

* add lowering of if operator and compute_op function

* clang format

* add parsing if operator in onnx file

* clang format

* fix clang tidy format

* clang format

* add the gpu implementation of the if operator

* enhance the validate function and uncomment a unit test

* clang format

* remove unnecessary code

* add sub_module processing in ref passes

* clang format

* clang format

* fix a hang issue related to the valid function

* fix an issue in replace_refs

* clang format

* fix review comments

* clang format

* fix cppcheck error

* clang format

* add a unit test for more code coverage

* clang format

* fix review comments and add test for more code coverage

* clang format

* fix cppcheck error

* clang format

* fix cppcheck error

* fix a cppcheck error

* clang format

* backup code

* clang format

* fix cppcheck error

* clang format

* some code refinement

* clang format

* code backup to handle submodules in module compilation

* clang format

* code backup

* clang format

* code backup

* clang format

* fix a bug related to literal id

* fix a bug in gpu execution

* change the way of compiling a graph

* clang format

* backup more changes

* clang format

* refine pass log information

* remove unnecessary code

* clang format

* temp changes backup

* clang format

* add module name prefix to scratch memory id in hip_memory_allocation

* clang format

* change to copy the cond input by inserting a copy instruction

* clang format

* change to use the if output argument as the submodule output so can remove a gpu_copy

* clang format

* consider submodule in some compile passes

* clang format

* fix review comments

* clang format

* fix issues related to scratch memory

* clang format

* remove unnecessary code

* fix cppcheck error

* clang format

* resolve the implicit dependencies issue related to submodule

* clang format

* fix cppcheck error

* clang format

* backup temp changes

* clang format

* fixed a bug in the has_instruction function

* clang format

* fix the return value of the gpu implementation of the if operator

* fix a bug in the compute_shape function in the gpu implementation

* add an if onnx unit test

* clang format

* add more unit tests

* clang format

* tmp code backup

* clang format

* fix a sync problem related to copy cond argument from gpu to cpu

* clang format

* change the compile offload copy flag setting

* clang format

* enable copy from cpu to be able to do synchronous copy

* clang format

* add more unit tests

* add more unit tests

* add more ref unit tests

* clang format

* fixed a bug error

* tmp code backup

* clang format

* fixed an onnx verify unit test

* add more unit tests

* clang format

* reverse a change

* fix cppcheck error

* fix cppcheck error

* fix to print all instructions in program execution

* clang format

* fix bugs related to memory coloring and offload copy to be true

* clang format

* remove unnecessary include header file

* sort test cases in ref_cpu_ops alphabetically

* clang format

* add a flag to disable cpu target in verification test

* change the way to disable some tests

* clang format

* disable verify unit test of the if operators

* add a function call to have more code coverage

* fix a build error

* fix review comments

* fix review comments

* clang format

* add an api gpu unit test for more code coverage

* clang format

* change to use instruction.size() as node index

* move the calc_implicit_deps function to module class as a member function

* clang format

* move the offload_copy flag setting to lowering

* clang format

* assign the module_eval lambda function to a variable to simplify code

* clang format

* move the compute function from ref/gpu implementation to the main if operator

* clang format

* fix cpp check error

* add a unit test for more code coverage

* clang format

* add unit test to calculate implicit deps

* add a python unit test

* clang format

* refine a unit test to have more code coverage

* clang format

* change the way of wrapping up arguments for sub modules

* clang format

* fix some build errors

* code cleanup

* refine unit tests to have more code coverage

* clang format

* refine unit test to have more code coverage

* code backup

* clang format

* add memory coloring test

* refine memory coloring unit test

* clang format

* remove an unnecessary line

* remove an unused line

* remove an unnecessary parameter in the lambda function

* clang format

* refine a unit test

* remove an unnecessary line

* refine unit tests to have more code coverage

* clang format

* combine two lines

* add one more unit test for more code coverage

* clang format

* add one more unit test

* clang format

* fix review comments

* refine a print out information

* fix review comments

* clang format

* change the sync copy to using a gpu device sync

* clang format

* remove unnecessary code
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

41c0487b

26 Mar, 2021 1 commit

add a flag to disable cpu target in some verification unit tests (#778) · 5d601ad1

Shucai Xiao authored Mar 26, 2021



* add a flag to disable cpu target in verification test

* change the way to disable some tests

* clang format

* add a function call to have more code coverage

* fix a build error

* fix review comments

* fix review comments

* clang format
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

5d601ad1

26 Feb, 2021 2 commits

changes for not operator (#735) · ebf8bd20

Cagri Eryilmaz authored Feb 26, 2021



* changes for not operator

* changed name of the op from unary_not to not

* Added tests for op and onnx parsing

* reordering not_test in onnx_test.cpp

* not operator -- gpu implementation

* added bool test for not operator

* Added test and missing links for not operator on GPU

* typo fix

* adding .onnx test files for not operator

* formatting
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

ebf8bd20

Add more supported operators and optimizations for the cpu backend (#746) · a0b570b2

Paul Fultz II authored Feb 26, 2021



* Add eliminate_data_type pass

* Formatting

* Auto convert quant ops

* Formatting

* Flip the order of decompose

* Compute max size differently

* Formatting

* Clamp values in convert

* Formatting

* Fix loss of precision in reduce

* Formatting

* Fix bugs in reduction

* Fix accumulator type in reference softmax implementation

* Formatting

* Update convert test

* Remove unused variables

* Remove unnecessary quant_dot check

* Formatting

* Add tests

* Formatting

* Remove unused code

* Remove duplicate ops

* Remove blaze dependency

* Use set since shape::type_t is not hashable on gcc 5

* Formatting

* Add dnnl binary op

* Formatting

* Add binary and eltwise

* Formatting

* Add softmax

* Formatting

* Remove unused operators

* Add missing files

* Formatting

* Add lrn

* Formatting

* Add deconvolution

* Formatting

* Change allocate default

* Add reorder

* Formatting

* Add reductions

* Formatting

* Sort lines

* Change literals in another loop

* Add pow operator

* Formatting

* Add pow operator

* Formatting

* Make sure shapes are packed

* Allow broadcasted inputs

* Remove unused operators

* Simplify functions

* Remove softmax

* Add sub and erf functions

* Formatting

* Fix bug

* Formatting

* Improve parallelism

* Formatting

* Allow multiple batch dimensions

* Formatting

* Move literal transforms out of lowering

* Formatting

* Add gather operator

* Sort lines

* Add early exit for carry

* Formatting

* Add missing concat

* Rename macro

* Fix deep nesting

* Formatting

* Fix cppcheck issues

* Remove else

* Move attribute to typedef

* Formatting

* Disable maybe-uninitialized warning since it's broken on gcc

* Add constexpr default constructor

* Formatting

* Fix compiler warnings

* Fix adjust_allocation test
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

a0b570b2

08 Feb, 2021 1 commit

Add a pass to remove unsupported data types (#738) · 3d24a21c

Paul Fultz II authored Feb 07, 2021



* Add eliminate_data_type pass

* Formatting

* Auto convert quant ops

* Formatting

* Flip the order of decompose

* Compute max size differently

* Formatting

* Clamp values in convert

* Formatting

* Fix loss of precision in reduce

* Formatting

* Fix bugs in reduction

* Fix accumulator type in reference softmax implementation

* Formatting

* Update convert test

* Remove unused variables

* Remove unnecessary quant_dot check

* Formatting

* Add tests

* Formatting

* Remove unused code

* Remove duplicate ops

* Remove blaze dependency

* Use set since shape::type_t is not hashable on gcc 5

* Formatting
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

3d24a21c

06 Feb, 2021 1 commit

Log softmax nonstd input shape (#740) · 5d0ca2a6

Shucai Xiao authored Feb 06, 2021



* fix a bug that softmax/logsoftmax cannot handle nonstd input shape

* clang format

* fix review comments

* clang format

* refine test to have more code coverage

* clang format
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

5d0ca2a6

19 Jan, 2021 1 commit

Logical ops (#718) · 4d46cbdb

Shucai Xiao authored Jan 19, 2021

* add the and operator

* clang format

* add unit tests for the and operator

* clang format

* change the and name to logical_and and add the logical_or, logical_xor

* clang format

* add onnx unit tests for or and xor

* add more unit tests

4d46cbdb

08 Jan, 2021 1 commit

Revamp CI infrastructure (#706) · ceb4ca09

Paul Fultz II authored Jan 08, 2021



* Add build and test github workflow

* Fix cget command

* Remove def-requirements.txt

* Add tmate session to debug workflow

* Run tmate session after installing dependencies

* Print date periodically

* Add clang tidy action

* Separate build and run container in two different jobs

* Run bash script

* Remove interactive flag

* Try to mount the files

* Try to use the github workspace

* Without double braces

* Use env variable

* Pipe bash script in

* Run using hip-clang

* Use correct path

* Add verbose

* Remove j flag

* Only run for onnx file to debug

* Manually run clang-tidy

* Remove quiet flag

* Print header file

* Printout environment

* Remove extra defines

* Remove fixits and config flag

* Show ldd

* Add tmate session

* Run onnx protobuf first

* Generate proto for tensorflow

* Update cppcheck version

* Fix some cppcheck issues

* Add const

* Cppcheck fixes

* Formatting

* Fix more cppcheck issues

* Run two jobs

* Cache analysis and run format checking

* Fix yaml issues

* Fix yaml issues

* Fix indentation

* Switch to hip-clang for main docker file

* Use hip-clang in the readme

* Fixes for jenkins

* Use ccache to build

* Combine file

* Set restore keys

* Change stage name

* Build with ccache

* Add missing dependency for ccache

* Build debug with codecov

* Fix workflow syntax

* Fix list

* Use quotes

* Go to correct build path

* Install lcov

* Use sudo

* Echo all commands

* Setup tmate

* Add verbose output

* Build with cmake directly

* Add pthread flag

* Remove python config

* Continue on error

* Use on or off for cmake flag

* Use always upload cache

* Verbose output

* Verbose output from build

* Build one target

* Reduce debug symbols

* Increase garbage collection

* Remove dmesg

* Increase it to 20

* Update rocm cmake version

* Remove jobs from jenkins

* Run on all 3 ubuntus

* Remove gcc 5 jobs

* Dont add flag on 16.04

* Only upload coverage on 18.04

* Dont build for ubuntu 20.04

* Use matrix.os

* Use O2 for hip-clang since lower optimizations are broken

* Use rocm 3.0

* Pass ccache as cmake variable instead of env variable

* Build miopen from source

* Show ccache statistics

* Print log information

* Set compression level

* Use hash dir

* Set hashdir

* Install clang ocl from system

* Up compression level

* Add locale

* Increase cache size to 1G

* Lower compression level to 9

* Remove split dwarf

* Remove Og

* Add back Og

* Separate debug and codecov

* Add missing backslash

* Garbage collect more often

* Add missing locales package

* Use Os

* Install onednn in docker and run tests

* Include target headers in tests

* Increase timeout

* Remove if condition

* Make flag public

* Suppress memory leaks in onednn

* Use equal

* Add gh annotations

* Update rocm-cmake version

* Add ldconfig
Co-authored-by: Shucai Xiao <shucai@gmail.com>

ceb4ca09

06 Jan, 2021 1 commit

Module impl (#678) · c9b86f1c

Shucai Xiao authored Jan 06, 2021



* add an api get_main_module

* clang format

* modify onnx unit test for module

* clang format

* refactor ops unit test with the get_main_module

* clang format

* code backup

* clang format

* refine module c api

* add python api for module

* clang format

* fix a python api issue

* clang format

* fix cppcheck error

* clang format

* refine unit tests changes

* clang format

* code backup

* code backup

* clang format

* defer some changes to later PRs

* change return of get_main_module from ref to pointer

* clang format

* add unit tests for the get_main_module_api

* clang format

* fix cppcheck error

* clang format

* fix cppcheck error

* clang format

* add more unit tests for more code change coverage

* clang format

* fixed a unit test error

* clang format

* fix unit test

* clang format

* code backup

* code change for more code coverage

* change program to module in various passes and matcher

* clang format

* modify the pass API

* code backup

* code backup

* clang format

* code backup

* clang format

* Add option to not generate a destroy method

* Formatting

* fix some review comments

* clang format

* fix review comments

* clang format

* clang format

* code backup

* code backup

* clang format

* fix cppcheck errors

* clang format

* clang format

* fix build errors

* clang format

* modify gpu unit tests to using module

* clang format

* fix cppcheck error

* clang format

* Add flag to enable cpu backend

* Make buffers shared

* Enable optimizations

* Formatting

* fix review comments

* code backup

* clang format

* code backup

* clang format

* fix a bug related to a unit test

* clang format

* clang format

* fix a build error

* remove unnecessary code

* remove unnecessary files

* code backup

* clang format

* remove the compile function from the module class

* clang format

* clang format

* remove the context parameter from the from_value method of the module class

* code refinement

* clang format

* merge changes from develop branch

* clang format

* fix cppcheck error

* clang format

* fix a build error

* fixed a merge error

* fix cppcheck error

* fixed review comments

* clang format

* fix cppcheck error

* fix a cppcheck error

* fix cppcheck error

* fix build error caused by merge

* Add missing has_op function

* Formatting

* merge changes from develop branch

* fix a cppcheck error

* fixed some review comments

* clang format

* remove the begin/end function of the program class

* clang format

* refine code and fix cppcheck error

* clang format

* fix review comments

* clang format

* fix review comments

* clang format

* add unit tests for more code coverage

* clang format

* fix review comments

* clang format

* fix review comments

* clang format

* fix a build error in debug mode

* clang format
Co-authored-by: Paul <pfultz2@yahoo.com>

c9b86f1c

14 Dec, 2020 1 commit

Use dnnl for cpu backend (#688) · 406afeb8

Paul Fultz II authored Dec 14, 2020



* Add flag to enable cpu backend

* Make buffers shared

* Enable optimizations

* Add onednn

* Formatting

* Formatting

* Add dnnl header

* Formatting

* Rewrite rnn first

* Formatting

* Call reference implementation

* Formatting

* Make literal data shared

* Formatting

* Add convolution

* Formatting

* Compensate for dilation

* Formatting

* Use name/make_op instead

* Formatting

* Rename gemm header

* Formatting

* Add dnnl convolution/gemm operators

* Formatting

* Add eliminate_contiguous

* Add faster pointwise operators

* Formatting

* Formatting

* Formatting

* Add dnnl op class

* Formatting

* Add add op

* Formatting

* Add concat operator

* Formatting

* Add more ops

* Create descriptor during finalization

* Formatting

* Don't rewrite pooling

* Enable memory coloring

* Formatting

* Add output aliases

* Formatting

* Fix errors

* Formatting

* Convert literals

* Add missing file

* Remove batch_norm

* Formatting

* Use strides

* Formatting

* Add some debug checks

* Formatting

* Fix bug in adjusting shape for gemm

* Formatting

* Fix fallback dot operator

* Zero initialize buffers

* Add support for group convolutions

* Formatting

* Make adjust allocation target independent

* Formatting

* Enable adjust_allocation for gpu/cpu

* Formatting

* Add copy to allocation model

* Formatting

* Add copy operator

* Formatting

* Better handling of output parameters in adjust_allocation

* Formatting

* Build with dnnl

* Make dnnl required

* Fix compile error

* Tidy fixes

* Formatting

* Tidy fixes

* Formatting

* Fix more tidy issues

* Formatting

* Add mul op

* Add mul op

* Set c compiler to clang as well

* Compensate for normalized compute shape

* Formatting

* Fix cppcheck errors

* Formatting

* Add onednn library to hcc

* Guard clang pragmas

* Disable cpu mode for gcc for now

* Leave it enabled for gcc 7

* Fix cppcheck suppression

* Fix compile error on gcc 5

* Remove unused code
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

406afeb8

09 Dec, 2020 1 commit

Reduce operators.hpp includes · b10c76e6

Paul authored Dec 09, 2020

b10c76e6

08 Dec, 2020 1 commit

Refactor to use make_op almost everywhere (#696) · 8d21fdc9

Paul Fultz II authored Dec 08, 2020

* Load op when serializing

* Formatting

* Add missing clip field

* Use make_op almost everywhere

* Formatting

* More make ops for rnns

* Get rid of spaces

* Formatting

* Remove operators headers

* Formatting

* Remove unused op headers

* Increase line threshold

8d21fdc9

20 Nov, 2020 1 commit

Fuse skip layernorm (#683) · 1bfb147d

Paul Fultz II authored Nov 20, 2020



* Unify the vectorized and non-vectorized path

* Formatting

* Make fusion easily extendable

* Add skip layernorm fusion

* Formatting

* Call correct layernorm function

* Fix compile errors

* Add DCE

* Add test for skip layernorm

* Formatting

* Remove unused typedef

* Formatting

* Fix tidy issues

* Formatting
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>

1bfb147d

11 Nov, 2020 1 commit

Refactor program to module (#684) · 2466dd6f

Shucai Xiao authored Nov 11, 2020



* code backup

* clang format

* change corresponding tool files

* clang format
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

2466dd6f

10 Nov, 2020 1 commit

Add flag to enable cpu backend (#680) · d39e51ed

Paul Fultz II authored Nov 10, 2020

* Add flag to enable cpu backend

* Make buffers shared

* Enable optimizations

* Formatting

* Enable cpu backend for gcc builds

d39e51ed

04 Nov, 2020 1 commit

Split cpu and reference implementation (#671) · 500d9441

Paul Fultz II authored Nov 04, 2020



* Add all_targets cmake target

* Rename target

* Add ref target

* Rename tests

* Refactor compiler target

* Formatting

* Verify for every target

* Formatting

* Add verify test suite

* Formatting

* Add initial test programs

* Formatting

* Add rnn tests

* Formatting

* Validate gpu

* Formatting

* Remove old gpu tests

* Fix gpu tests

* Fix ref error

* Fix tidy issues

* Formatting

* Tidy fixes

* Fix header in python api

* Rename to ref

* Use ref in verify_onnx

* Fix tidy issue

* Build with verbose on

* Fix typo

* Remove verbose

* rename some cpu prefix to ref
Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>

500d9441