Commits · df032e06ed050ef1bd9ce54aba004ef3d573ff7e · gaoqiong / MIGraphX

11 Nov, 2021 1 commit

Conditionally enable pointwise fusion (#992) · 157935ff

Paul Fultz II authored Nov 10, 2021

This enables the pointwise fusions using the MIGRAPHX_ENABLE_POINTWISE_FUSION env variable. It is disabled by default since MIOpen fusions need to be refactored.

This also adds a compile_ops pass to compile the pointwise modules. All tests except test_gpu_fast_math pass with MIGRAPHX_ENABLE_POINTWISE_FUSION=1 set.
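As a rough illustration of an opt-in flag of this kind, the sketch below reads the variable from the environment and treats the feature as off unless it is set. The helper name `pointwise_fusion_enabled` is hypothetical, and accepting only the value `1` is an assumption; the values MIGraphX itself accepts are not stated in this commit.

```python
import os


def pointwise_fusion_enabled(environ=os.environ):
    """Return True only when the opt-in variable is set to "1".

    Hypothetical helper: the exact values MIGraphX accepts are an assumption.
    """
    value = environ.get("MIGRAPHX_ENABLE_POINTWISE_FUSION", "")
    return value.strip() == "1"


if __name__ == "__main__":
    # Disabled by default, enabled only when the variable is exported.
    print(pointwise_fusion_enabled({}))
    print(pointwise_fusion_enabled({"MIGRAPHX_ENABLE_POINTWISE_FUSION": "1"}))
```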

157935ff

09 Nov, 2021 1 commit

Move mlir to the gpu and update the test · 0ad547aa

Paul authored Nov 09, 2021

0ad547aa

28 Oct, 2021 1 commit

Roialign gpu impl (#972) · 912c8d22

Shucai Xiao authored Oct 28, 2021

GPU implementation of the roialign operator, using the JIT approach to reduce the library size.

912c8d22

08 Oct, 2021 1 commit

Nonzero op extension (#870) · 0879b5f1

Shucai Xiao authored Oct 08, 2021

This PR is for the nonzero operator with static output shape.
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
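One way to read "static output shape" is that the output is sized for the worst case (every element nonzero) and the unused tail is padded. The sketch below assumes that reading for a 2-D input; the function name `nonzero_static`, the zero padding value, and the returned count are all illustrative assumptions, not the operator's actual interface.

```python
def nonzero_static(matrix, pad_value=0):
    """Return (indices, count) for a 2-D list of numbers.

    indices has the fixed shape [2][rows * cols]: row coordinates in
    indices[0], column coordinates in indices[1]. Slots past `count`
    hold pad_value (an assumption made for this sketch).
    """
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    capacity = rows * cols  # worst case: every element is nonzero

    # Coordinates of nonzero elements in row-major order.
    coords = [(r, c) for r in range(rows) for c in range(cols) if matrix[r][c] != 0]

    indices = [[pad_value] * capacity for _ in range(2)]
    for k, (r, c) in enumerate(coords):
        indices[0][k] = r
        indices[1][k] = c
    return indices, len(coords)


if __name__ == "__main__":
    print(nonzero_static([[1, 0], [0, 3]]))
```

Without the count, a padded slot cannot be told apart from a genuine index at position (0, 0), which is why this sketch returns it alongside the indices.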

0879b5f1

01 Oct, 2021 1 commit

Add multinomial op (#954) · 0b7672d7

turneram authored Oct 01, 2021

Add multinomial op to onnx parser with ref and GPU implementations.

The onnx parser inserts a literal of shape {batch_size, sample_size} with random values in the range [0, 1) and inserts existing ops to compute the cumulative distribution function (CDF). The multinomial operator multiplies the random values by the sum of the CDF and returns the index of the first element of the CDF that is greater than the result, representing samples randomly drawn from [0, class_size) that follow the log-probability distribution.
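The sampling step described above can be sketched for a single batch row as follows. This is an illustration of the idea in plain Python, not the ref or GPU implementation; the function name `multinomial_sample` is hypothetical, and the "sum of the CDF" is taken here to mean its last entry, that is, the total probability mass.

```python
import bisect
import itertools
import math


def multinomial_sample(log_probs, random_values):
    """Map uniform random values in [0, 1) to class indices.

    log_probs: unnormalized log-probabilities, one per class.
    random_values: pre-generated uniform samples for one batch row.
    """
    probs = [math.exp(lp) for lp in log_probs]
    cdf = list(itertools.accumulate(probs))  # running sum of probabilities
    total = cdf[-1]

    samples = []
    for r in random_values:
        # bisect_right gives the index of the first CDF entry greater than r * total.
        idx = bisect.bisect_right(cdf, r * total)
        # Clamp in case floating-point rounding makes r * total equal to total.
        samples.append(min(idx, len(cdf) - 1))
    return samples


if __name__ == "__main__":
    weights = [math.log(2.0), math.log(3.0), math.log(5.0)]
    print(multinomial_sample(weights, [0.1, 0.25, 0.6, 0.99]))
```

Scaling the random value by the total, rather than normalizing the CDF, means the input probabilities do not need to sum to one.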

Resolves #821
Co-authored-by: Shucai Xiao <shucai@gmail.com>

0b7672d7

30 Sep, 2021 1 commit

Add mlir c api · 45b4f134

Paul authored Sep 29, 2021

45b4f134

16 Sep, 2021 1 commit

Loop operator (#853) · a275f590

Shucai Xiao authored Sep 16, 2021

Add the Loop operator for opset version 13.
Notes:
1) The default maximum iteration count is 10 if none is provided.
2) To change the maximum iteration count, a user can set max_loop_iterations in the onnx_option struct when parsing a model.
3) The returned shape of the scan output is derived from max_loop_iterations even if the actual number of loop iterations is smaller. This issue also applies to other operators such as NonZero and NonMaxSuppression. Issue #948 was created to track this and will be resolved later.
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
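The third note can be pictured with a small sketch: the scan output is allocated for the maximum iteration count up front, so its length does not change when the loop exits early. The function `run_loop`, the body signature, and the zero padding value are assumptions made for illustration; only the name max_loop_iterations and the default of 10 come from the notes.

```python
def run_loop(body, state, max_loop_iterations=10, pad_value=0):
    """Run body(iteration, state) until it asks to stop or the cap is hit.

    body returns (keep_going, new_state, scan_value).
    Returns (final_state, scan_output, iterations_run), where scan_output
    always has max_loop_iterations entries.
    """
    scan_output = [pad_value] * max_loop_iterations  # fixed-size buffer
    iterations = 0
    keep_going = True
    while keep_going and iterations < max_loop_iterations:
        keep_going, state, scan_value = body(iterations, state)
        scan_output[iterations] = scan_value
        iterations += 1
    return state, scan_output, iterations


if __name__ == "__main__":
    # The body accumulates the iteration index and stops after the third pass.
    def body(i, s):
        return (i < 2, s + i, s + i)

    final_state, scan, n = run_loop(body, 0)
    print(final_state, n, len(scan), scan[:n])
```

Returning the number of iterations actually run is one way a caller could slice off the padded tail; whether the operator exposes such a count is not stated here.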

a275f590

02 Sep, 2021 2 commits

Refactor where op (#918) · ebbaf8fc

turneram authored Sep 02, 2021

Implement the Where operator for the CPU and GPU.  This is for better performance.

ebbaf8fc

Topk op (#877) · 521b57a2

Shucai Xiao authored Sep 01, 2021



* add topk operator for ref, cpu and gpu
* Hash modules for quicker lookup of modules
* add onnx unit test
* add unit tests for the topk operator
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

521b57a2

01 Sep, 2021 1 commit

Adjust HIP_COMPILER_FLAGS to support <$:$<>:> and SHELL: tags (#933) · 33a17257

Chris Austen authored Sep 01, 2021



In ROCm 4.5.0 hip compile flags are coming in differently.  This has
caused some parsing issues for the HIP_COMPILER_FLAGS variable.  As an example

    ROCm 4.3.0: --offload-arch=gfx900
    ROCm 4.5.0: <$<COMPILE_LANGUAGE:CXX>:SHELL:--offload-arch=gfx900>

Using existing code...
    $<$<COMPILE_LANGUAGE:CXX>:SHELL:--offload-arch=gfx900>
Becomes...
    $<$<COMPILE_LANGUAGE:CXX>:SHELL:

There are two problems with that.
  1) The "<" is not balanced with a ">" because the regex consumes the ">"
  2) There is still a `SHELL:`  label.

This commit repairs both.  I took the regex parsing code from ROCmSoftwarePlatform/MIOpen/blame/develop/CMakeLists.txt
but improved it to support handling of target features like
<$<COMPILE_LANGUAGE:CXX>:SHELL:--offload-arch=gfx900:xxx+>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
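The actual fix lives in the project's CMake code. Purely as an illustration of the parsing problem, the Python sketch below unwraps a generator expression of the form shown above and recovers the bare flag, including a flag that carries target features. The function name and the regular expression are assumptions written for this example and are not taken from the MIGraphX or MIOpen sources.

```python
import re

# Matches an optional leading "$", the "<$<...>:" wrapper, an optional
# "SHELL:" label, then captures everything up to the final closing ">".
_GENEX = re.compile(r"\$?<\$<[^>]*>:(?:SHELL:)?(?P<flag>.*)>")


def strip_generator_expression(flag):
    """Return the bare compiler flag, or the input unchanged if not wrapped."""
    match = _GENEX.fullmatch(flag)
    return match.group("flag") if match else flag


if __name__ == "__main__":
    for raw in (
        "--offload-arch=gfx900",
        "$<$<COMPILE_LANGUAGE:CXX>:SHELL:--offload-arch=gfx900>",
        "$<$<COMPILE_LANGUAGE:CXX>:SHELL:--offload-arch=gfx900:xxx+>",
    ):
        print(strip_generator_expression(raw))
```

Anchoring the pattern to the whole string keeps the closing ">" out of the captured flag while leaving the colon-separated target features intact.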

33a17257

10 Aug, 2021 1 commit

Add option to compile with hiprtc (#892) · 91c9ebbc

Paul Fultz II authored Aug 10, 2021

* Add hiprtc compile option
* Add cross compile test
* Update error reporting
* Add tests for errors and warnings
* Fix tidy warning
* Add comment to ifdefs
* Skip null character at end of log
* Assert there is null at the end

91c9ebbc

05 Aug, 2021 1 commit

Add gpu driver and improvements to pointwise codegen (#851) · 29fa2666

Paul Fultz II authored Aug 05, 2021



* Add method to compile pointwise

* Formatting

* Add lambda

* Add semicolon

* Rename variable

* Add driver to run jit kernels

* Formatting

* Add context

* Formatting

* Make separate driver folder

* Add more general gpu driver

* Formatting

* Print out wall time

* Formatting

* Run multiple times and skip first run

* Formatting

* Separate time_op

* Run an op for comparison

* Formatting

* Add debug asserts

* Formatting

* Change parameter name

* Formatting

* Fix argument order

* Formatting

* Add preloading

* Formatting

* Allow a different data type

* Formatting

* Pipeline transformations

* Formatting

* Add vectorization

* Formatting

* Reduce dims

* Formatting

* Compile with launch params as constant

* Formatting

* Make sure buffer can be vectorized

* Formatting

* Enable vectorization and preloading

* Formatting

* Add print header

* Formatting

* Avoid allocating too much LDS

* Formatting

* Add some vec functions to a separate header

* Formatting

* Add stride loops

* Formatting

* Improve the transform pipeline

* Formatting

* Add const

* Fix shape check

* Formatting

* Just check stride axis is zero

* Remove extra find_vector_axis overload

* Simplify some more functions

* Formatting

* Remove some more extra functions

* Formatting

* Simplify more decltypes

* Add another const

* Fix test

* Get buffer pointer different for older compilers
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: Chris Austen <causten@users.noreply.github.com>
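The bullets "Run multiple times and skip first run" and "Print out wall time" describe a common benchmarking pattern: discard a warm-up run, then average the wall-clock time of the remaining runs. The sketch below shows that pattern in Python. The name `time_op` appears in the bullets above, but its real signature is not given, so the one used here is an assumption.

```python
import time


def time_op(op, runs=5):
    """Return the average wall time in seconds of `runs` calls to op().

    One extra call is made first and excluded from the measurement, since
    the first run often includes one-time costs such as compilation.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    op()  # warm-up run, not timed
    start = time.perf_counter()
    for _ in range(runs):
        op()
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    average = time_op(lambda: sum(range(10000)), runs=10)
    print(f"average wall time: {average * 1e3:.3f} ms")
```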

29fa2666

14 Jul, 2021 1 commit

Use the same device name function in the unit tests (#881) · 0b04fc80

Paul Fultz II authored Jul 14, 2021



* Unify device_name function

* Formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

0b04fc80

08 Jul, 2021 2 commits

Add inclusive scan on the GPU (#872) · 6ba279cc

Paul Fultz II authored Jul 08, 2021



* Add initial scan operator

* Formatting

* Fix with a working test

* Fix bugs

* Formatting

* Formatting

* Simplify

* Formatting

* Use non-power of 2 for test

* Make pointer
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

6ba279cc

Preallocate parameters on the CPU and unify preallocations (#840) · 427fc25c

Paul Fultz II authored Jul 08, 2021



* Add preallocate method

* Add preallocate_param pass

* Preallocate buffers on the cpu

* Formatting

* Preallocate on the gpu

* Add missing cpp file

* Formatting

* Add lifetime function

* Formatting

* Always allocate

* Fix tidy warning

* Add const

* Add missing lifetime annotations
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

427fc25c

25 Jun, 2021 1 commit

gpu implementation of the scatter operator · 973cafd4

Shucai Xiao authored Jun 25, 2021

973cafd4

08 Jun, 2021 1 commit

Reverse Op (#846) · 9c54fc4f

Cagri Eryilmaz authored Jun 08, 2021



* init reverseOp branch: ref op + ref test. WIP

* first passing basic test

* cleanup

* additional axis implementation

* additional test

* ref op implementation vec to int for axis

* ref op test change for axis

* initial gpu files and test

* updates to implementation and test

* fixed some issues

* clang format

* cleanup

* formatting

* removing comments

* remove local size, back to default

* update tests: replace with std functions

* multiple axis for reverse op

* fix a build error

* clang format

* more tests

* fix a bug for the reverse device function

* clang format

* fix a bug

* clang format

* ref test updates, multiaxis

* formatting
Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

9c54fc4f

29 Apr, 2021 1 commit

MLIR MIOpen Dialect integration (phase 1) (#768) (#769) · 56584fa2

SJW authored Apr 29, 2021



* MLIR MIOpen Dialect integration (phase 1) (#768)

* Added Findmlir.cmake (using environment variables to import)

* Added mlir_conv pass to GPU target

  * Apply to any gpu::convolution if supported by MLIR

  * Call MLIR C-API to generate iGEMM kernel with configuration from gpu::convolution

  * Capture binary in dictionary for matching convolutions

  * Build a code_object_op with the binary and execution dimensions

  * Substitute for the gpu::convolution

* Changed the parameters for the code_object to reflect the generated MLIR kernel

* Expanded out MemRefDescriptor fields in param list

* Also updated for MLIR C-API changes

* * fixed global_size calculation

* * Added command line option: --enable_mlir

* * fixed command line switch

* updated for new MLIR API changes

* * Added cget llvm-project-mlir to import MIIR API libraries into Dockerfile
  * removed cmake Findmlir

* updated for changes in MIIR C-API

* * updated CMakeLists.txt to allow disable of MLIR import

* fixed memory leaks and removed copies

* updated for 5D memrefs

* * formatting

* * fixed review comments

* * fixed merge issues

* hip gcnDeviceName now includes specifiers at the end
  * use major/minor values instead

* * disable MLIR by default

* * removed command-line switch --enable-mlir

* * fix unused when MLIR disabled

* * enable jenkins enable/test MLIR

* * format

* * fixed clang-tidy

* * added new type
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

56584fa2

27 Apr, 2021 1 commit

Use C++17 to build everything when using clang (#812) · 8fcb7409

Paul Fultz II authored Apr 27, 2021

8fcb7409

09 Apr, 2021 1 commit

Upgrade docker to rocm 4.1 and drop hcc (#795) · 6d937d80

Paul Fultz II authored Apr 09, 2021

* Fix tidy warnings for 4.1

* Formatting

* Upgrade to 4.1 in docker

* Remove hcc build and enable ubsan on clang debug

* Add missing openmp package

* Construct directly

* Construct directly

* Upgrade rocm-cmake version

6d937d80

05 Apr, 2021 1 commit

Module build exec (#765) · 41c0487b

Shucai Xiao authored Apr 05, 2021



* code cleanup

* clang format

* backup code

* clang format

* remove unnecessary code

* clang format

* add module print function

* code backup

* refine the module::print function

* refine the module:to_value() function

* code backup

* backup code changes

* code backup

* remove to_value and from_value function from the module class

* rename a function

* rename the if operator

* refine the if operator

* refine the print function of module and program

* code backup

* code backup

* fix a build warning

* fix overload of compute_shape function

* code backup

* fix unit test error

* fix cppcheck error

* fix the issue related to the overload of compute_shape

* fix review comments

* fix cppcheck error

* change the return name of if_op to be if

* clang format

* fix two unit tests

* clang format

* rename variables

* clang format

* remove the unused compute_op function

* clang format

* add lowering of if operator and compute_op function

* clang format

* add parsing if operator in onnx file

* clang format

* fix clang tidy format

* clang format

* add the gpu implementation of the if operator

* enhance the validate function and uncomment a unit test

* clang format

* remove unnecessary code

* add sub_module processing in ref passes

* clang format

* clang format

* fix a hang issue related to the valid function

* fix an issue in replace_refs

* clang format

* fix review comments

* clang format

* fix cppcheck error

* clang format

* add a unit test for more code coverage

* clang format

* fix review comments and add test for more code coverage

* clang format

* fix cppcheck error

* clang format

* fix cppcheck error

* fix a cppcheck error

* clang format

* backup code

* clang format

* fix cppcheck error

* clang format

* some code refinement

* clang format

* code backup to handle submodules in module compilation

* clang format

* code backup

* clang format

* code backup

* clang format

* fix a bug related to literal id

* fix a bug in gpu execution

* change the way of compiling a graph

* clang format

* backup more changes

* clang format

* refine pass log information

* remove unnecessary code

* clang format

* temp changes backup

* clang format

* add module name prefix to scratch memory id in hip_memory_allocation

* clang format

* change to copy the cond input by inserting a copy instruction

* clang format

* change to use the if output argument as the submodule output so can remove a gpu_copy

* clang format

* consider submodule in some compile passes

* clang format

* fix review comments

* clang format

* fix issues related to scratch memory

* clang format

* remove unnecessary code

* fix cppcheck error

* clang format

* resolve the implicit dependencies issue related to submodule

* clang format

* fix cppcheck error

* clang format

* backup temp changes

* clang format

* fixed a bug in the has_instruction function

* clang format

* fix the return value of the gpu implementation of the if operator

* fix a bug in the compute_shape function in the gpu implementation

* add an if onnx unit test

* clang format

* add more unit tests

* clang format

* tmp code backup

* clang format

* fix a sync problem related to copy cond argument from gpu to cpu

* clang format

* change the compile offload copy flag setting

* clang format

* enable copy from cpu to be able to do synchronous copy

* clang format

* add more unit tests

* add more unit tests

* add more ref unit tests

* clang format

* fixed a bug error

* tmp code backup

* clang format

* fixed an onnx verify unit test

* add more unit tests

* clang format

* reverse a change

* fix cppcheck error

* fix cppcheck error

* fix to print all instructions in program execution

* clang format

* fix bugs related to memory coloring and offload copy to be true

* clang format

* remove unnecessary include header file

* sort test cases in ref_cpu_ops alphabetically

* clang format

* add a flag to disable cpu target in verification test

* change the way to disable some tests

* clang format

* disable verify unit test of the if operators

* add a function call to have more code coverage

* fix a build error

* fix review comments

* fix review comments

* clang format

* add a api gpu unit test for more code coverage

* clang format

* change to use instruction.size() as node index

* move the calc_implicit_deps function to module class as a member function

* clang format

* move the offload_copy flag setting to lowering

* clang format

* assign the module_eval lambda function to a variable to simplify code

* clang format

* move the compute function from ref/gpu implementation to the main if operator

* clang format

* fix cpp check error

* add a unit test for more code coverage

* clang format

* add unit test to calculate implicit deps

* add a python unit test

* clang format

* refine a unit test to have more code coverage

* clang format

* change the way of wrapping up arguments for sub modules

* clang format

* fix some build errors

* code cleanup

* refine unit tests to have more code coverage

* clang format

* refine unit test to have more code coverage

* code backup

* clang format

* add memory coloring test

* refine memory coloring unit test

* clang format

* remove an unnecessary line

* remove an unused line

* remove an unnecessary parameter in the lambda function

* clang format

* refine a unit test

* remove an unnecessary line

* refine unit tests to have more code coverage

* clang format

* combine two lines

* add one more unit test for more code coverage

* clang format

* add one more unit test

* clang format

* fix review comments

* refine a print out information

* fix review comments

* clang format

* change the sync copy to using a gpu device sync

* clang format

* remove unnecessary code
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

41c0487b

26 Mar, 2021 1 commit

Add initial code generation (#762) · 581d31b0

Paul Fultz II authored Mar 26, 2021



* Add code object op

* Formatting

* Add more value tests

* Formatting

* Fix from_value conversion from binary

* Formatting

* Dont use offload copy

* Remove iostream header

* Fix compilation errors

* Formatting

* Rename var

* Add missing files

* Formatting

* Remove duplicate variable

* Remove comment

* Template the function so sfinae will work

* Formatting

* Use template specialization since ADL is broken on hcc

* Formatting

* Annotate the constructor with HD for hcc

* Make variable const
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

581d31b0

26 Feb, 2021 1 commit

changes for not operator (#735) · ebf8bd20

Cagri Eryilmaz authored Feb 26, 2021



* changes for not operator

* changed name of the op from unary_not to not

* Added tests for op and onnx parsing

* reordering not_test in onnx_test.cpp

* not operator -- gpu implementation

* added bool test for not operator

* Added test and missing links for not operator on GPU

* typo fix

* adding .onnx test files for not operator

* formatting
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

ebf8bd20

25 Feb, 2021 1 commit

Add code object custom op (#744) · 7220dd18

Paul Fultz II authored Feb 24, 2021



* Add code object op

* Formatting

* Add more value tests

* Formatting

* Fix from_value conversion from binary

* Formatting

* Dont use offload copy

* Remove iostream header
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

7220dd18

19 Jan, 2021 1 commit

Logical ops (#718) · 4d46cbdb

Shucai Xiao authored Jan 19, 2021

* add the and operator

* clang format

* add unit tests for the and operator

* clang format

* change the and name to logical_and and add the logical_or, logical_xor

* clang format

* add onnx unit tests for or and xor

* add more unit tests

4d46cbdb

07 Jan, 2021 1 commit

Set find mode in miopen to normal (#711) · 1c8fcfc9

Paul Fultz II authored Jan 07, 2021

1c8fcfc9

14 Dec, 2020 1 commit

Use dnnl for cpu backend (#688) · 406afeb8

Paul Fultz II authored Dec 14, 2020



* Add flag to enable cpu backend

* Make buffers shared

* Enable optimizations

* Add onednn

* Formatting

* Formatting

* Add dnnl header

* Formatting

* Rewrite rnn first

* Formatting

* Call reference implementation

* Formatting

* Make literal data shared

* Formatting

* Add convolution

* Formatting

* Compensate for dilation

* Formatting

* Use name/make_op instead

* Formatting

* Rename gemm header

* Formatting

* Add dnnl convolution/gemm operators

* Formatting

* Add eliminate_contiguous

* Add faster pointwise operators

* Formatting

* Formatting

* Formatting

* Add dnnl op class

* Formatting

* Add add op

* Formatting

* Add concat operator

* Formatting

* Add more ops

* Create descriptor during finalization

* Formatting

* Dont rewrite pooling

* Enable memory coloring

* Formatting

* Add output aliases

* Formatting

* Fix errors

* Formatting

* Convert literals

* Add missing file

* Remove batch_norm

* Formatting

* Use strides

* Formatting

* Add some debug checks

* Formatting

* Fix bug in adjusting shape for gemm

* Formatting

* Fix fallback dot operator

* Zero initialize buffers

* Add support for group convolutions

* Formatting

* Make adjust allocation target independent

* Formatting

* Enable adjust_allocation for gpu/cpu

* Formatting

* Add copy to allocation model

* Formatting

* Add copy operator

* Formatting

* Better handling of output parameters in adjust_allocation

* Formatting

* Build with dnnl

* Make dnnl required

* Fix compile error

* Tidy fixes

* Formatting

* Tidy fixes

* Formatting

* Fix more tidy issues

* Formatting

* Add mul op

* Add mul op

* Set c compiler to clang as well

* Compensate for normalized compute shape

* Formatting

* Fix cppcheck errors

* Formatting

* Add onednn library to hcc

* Guard clang pragmas

* Disable cpu mode for gcc for now

* Leave it enabled for gcc 7

* Fix cppcheck suppression

* Fix compile error on gcc 5

* Remove unused code
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

406afeb8

09 Nov, 2020 1 commit

Add hip compilation (#664) · f71af72a

Paul Fultz II authored Nov 09, 2020



* Add compiler flags

* Add missing include

* Add filesystem header

* Formatting

* Add tmp_dir to run

* Formatting

* Kernel compilation and launching

* Formatting

* Separate pack_args

* Formatting

* Add alignment tests

* Formatting

* Add compile test

* Formatting

* Complete compile test

* Formatting

* Use is_regular_file free function

* Fix is_regular_file call

* Fix tidy issues

* Fix tidy

* Fix tidy issue

* Print size in read_buffer to debug issue on jenkins

* Add hip flags before src file

* Fix reading output files

* Fix unused variable warning

* Formatting

* Formatting

* Disable tidy check
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

f71af72a

15 Oct, 2020 1 commit

Added greater and less operators (#660) · 48ffbfa5

turneram authored Oct 15, 2020



* Added greater and less operators

* Fixed ops_test.cpp

* Set commutative to false for less, greater

* Refactored parse_equal/less/greater into parse_compare_op

* Removed unnecessary function attributes() from greater.hpp/less.hpp

* Added op_name arguments

* Removed local settings

* Formatting

* Missing comma

* Formatting

* Formatting

* Formatting

* Formatting

* Formatting

* Missing space
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

48ffbfa5

09 Oct, 2020 1 commit

Add parallel stream analysis (#629) · 1d98fbb4

Paul Fultz II authored Oct 08, 2020

* Add initial multi stream analysis

* Formatting

* Add more tests

* Formatting

* Remove comment

* Analyze streams on the gpu

* Formatting

* Fix nstream

* Formatting

* Add test for return

* Formatting

* Make sure return has a stream assignment

* Formatting

* Fix asserts and checks

* Improve error message for out-of-order sequence

* Formatting

1d98fbb4

30 Sep, 2020 1 commit

Add hip clang builds to jenkins (#651) · f28a62ea

Paul Fultz II authored Sep 30, 2020

* Make global variables const

* Tidy fixes

* Disable some lints

* Formatting

* Fix tidy const

* Formatting

* Add missing const keywords

* Formatting

* More fixes

* Fix remaining tidy issues

* Formatting

* Fix rocblas function call

* Formatting

* Fix nodiscard warnings

* Formatting

* Use named parameters

* Remove overload

* Add overload

* Remove noncps

* Use named param for node

* Add auto register header

* Use named parameters

* Refactor jenkinsfile

* Fix shadow

* Add missing body variable

* Add more const methods

* Add hip-clang docker builds

* Remove comments

* Add clang-format

* Add more const

* Formatting

* Rename stage

* Disable check

* Add another const

* Add python 2 dev packages

* Add sphinx to dockerfile

f28a62ea

27 Aug, 2020 1 commit

Bool type and equal operator (#603) · 59b80d4e

Shucai Xiao authored Aug 27, 2020



* add bool type

* code backup

* code backup

* clang format

* fix build warnings

* clang format

* add the equal operator

* add the equal operator

* clang format

* remove unnecessary code

* refine unit tests

* clang format

* fix review comments and a bug

* clang format

* additional changes

* clang format

* fix cppcheck error

* add bool type in c api

* fix cppcheck error

* fix review comments

* fix cppcheck error

* fix a build error related to gcc

* fix cppcheck error

* fix cppcheck error

* added the equal operator to register list

* add parsing boolean type

* clang format

* fix bool type issue for python output

* clang format

* add support for automatic multibroadcast of the equal operator

* additional unit tests for more code coverage

* clang format

* missing an onnx file
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

59b80d4e

18 Aug, 2020 1 commit

Paul Fultz II authored Aug 18, 2020

* Register ops for main migraphx

* Formatting

* Register cpu ops

* Formatting

* Show list of operators in the driver

* Formatting

* Simplify register

* Try to register gpu ops

* Fix compiler errors

* Register rest of the gpu operators

* Add some tests

* Formatting

* Fix gcc compiler warnings

* Formatting

* Fix tidy warnings

* Fix compile error

* Use correct op name

* Register layer norm

* Use const ref

* Make run const

e8be8548

14 Aug, 2020 1 commit

Layernorm onnx support (#599) · 2c5d5fee

kahmed10 authored Aug 14, 2020



* fix pad calc

* bert tf passes correctness

* formatting

* add test

* formatting

* remove comment

* add inline

* formatting

* fix order for literal

* formatting

* test no mul_add

* formatting

* debug layernorm

* debug layernorm

* manual merge

* more progress

* formatting

* remove miopen batchnorm

* remove headers

* Fix compile error with no dpp reductions

* fix indices

* formatting

* change matcher

* formatting

* remove binds

* formatting

* disable tf matcher

* formatting

* use fast div

* formatting

* fix matcher

* formatting

* remove comment

* move find_matches

* add assert

* formatting

* fix deepcode issue
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

2c5d5fee

13 Aug, 2020 1 commit

integrate onnx backend test suite to migraphx (#574) · d612e976

Shucai Xiao authored Aug 13, 2020



* initial progress

* formatting

* add pooling changes

* formatting

* change eliminate_pad

* formatting

* rename var

* formatting

* update op shape test and compute

* formatting

* revert conv constructor

* formatting

* change initializer

* formatting

* fix tidy

* change quant conv and shape check

* add tests and fixes

* formatting

* fix type

* fix conv test

* formatting

* add pooling and bn tests

* formatting

* add inconsistent attr tests

* fix padding issue

* formatting

* progress on 1d to 2d

* formatting

* change compute and compile functions

* formatting

* fix duplicate

* fix conflict

* fix issue with 1d conv

* formatting

* add check for 3d limit

* rename function

* formatting

* update to MIOpen 2.3

* add support for nd pooling

* formatting

* test miopen 2.4

* change function name

* rename functions

* formatting

* add op_shape test

* add gpu ops tests

* formatting

* initial progress

* formatting

* add pkg-config

* add to support asymmetric padding of averagepool

* clang format

* fix bug for average pooling

* clang format

* fix a bug

* add unit tests for the asymmetric padding of averagepool

* clang format

* change functions

* formatting

* additional code refinement

* clang format

* check existing tests

* formatting

* change to copy_backward

* formatting

* change for loop to transform

* formatting

* add tests

* formatting

* remove comment

* add more tests

* remove an optimization for pooling

* clang format

* add and fix unit tests

* clang format

* update gpu miopen calls

* formatting

* initial progress

* add cpu impl and tests

* formatting

* add NOLINT

* add 3d test

* formatting

* add more op_shape tests

* test diff miopen version

* add submodule onnx

* add pooling shape tests

* fix error msg

* add onnx_test_backend

* reorganize python code

* temp disable test

* fix cppcheck error

* fix cppcheck error

* code backup

* add support device choice

* refine onnx backend test

* revert to miopen 2.4

* fix review comments

* fix review comments

* clang format

* fixed review comments

* clang format

* fix cppcheck error

* copy onnx_backend_test to dest when building

* add testdata folder

* fix bounds

* formatting

* code backup

* code backup

* remove unnecessary file

* fix various bugs

* remove unnecessary changes

* remove unnecessary submodule

* remove unnecessary lines

* fix algorithm

* formatting

* refine onnx backend unit tests

* pin numpy version

* fix build issue

* fixed a filename to be copied

* add the onnx dependency in docker image

* ensure results are copied back correctly

* specify onnx version

* update excluded tests

* remove unnecessary log info

* turn on more unit tests

* restrict onnx backend test to python 3.x

* clang format

* refine retrieving the input parameters

* clang format

* fix program input parameter names

* clang format

* avoid running onnx test in python 2.x

* fix cppcheck error

* fix python2.7 backend unit tests error

* clang format

* resolve the issue of ensuring the data copy is completed

* clang format

* fix review comments

* fix onnx backend unit test error

* another change to make onnx backend test pass

* clang format

* fix onnx backend test error

* clang format

* disable onnx backend test to try

* build try

* update Dockerfile to try onnx backend test

* remove unnecessary code

* fix a bug in copying program

* clang format

* update dockerfile to include onnx

* fix review comments

* add the pytest module to the container

* exclude real models to avoid downloading them

* resolve the sync device for data copy from gpu to cpu

* clang format

* fix review comments

* clang format

* move sync_device after memory_coloring
Co-authored-by: Khalique <15948690+kahmed10@users.noreply.github.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

d612e976

12 Aug, 2020 1 commit

Add dimensionality reduction to device functions (#587) · 63563da2

Paul Fultz II authored Aug 12, 2020



* Add reduce dims

* Formatting

* Reduce dims on the gpu

* Formatting

* Fix tidy issues

* Convert to assert

* Reduce dims for contiguous

* Formatting

* Remove move

* Fix arguments used

* Formatting

* Fix warnings

* Formatting
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>

63563da2

02 Jun, 2020 1 commit

Disable cuda compat warning · 7f553e51

Paul authored Jun 01, 2020

7f553e51

20 May, 2020 1 commit

Rnn variable seq lengths (#517) · 90200619

Shucai Xiao authored May 19, 2020



* code backup

* clang format

* fix compiling errors

* clang format

* rename a few files

* rename a few files

* fix variable bugs

* clang format

* add an operator to shift input sequences

* clang format

* fixed a bug

* clang format

* fixed a bug

* clang format

* code backup

* clang format

* code backup

* clang format

* code backup

* clang format

* refine code related lstm operator optimization

* clang format

* fix various bugs

* clang format

* fixed a bug in rewrite_lstm

* clang format

* fixed another bug

* refine two operator names

* clang format

* refine file names

* fix cppcheck error

* clang format

* fix cppcheck error

* clang format

* fix cppcheck error

* fixed review comments

* clang format

* add unit tests

* clang format

* add unit tests

* clang format

* refine unit tests for better coverage

* clang format

* fixed a bug

* fix cppcheck error

* fix review comments

* clang format

* rename two operators according to review comments

* clang format

* fix review comments

* clang format

* fix review comments

* clang format

* fix review comments

* fix a cppcheck error

* clang format

* fix review comments

* clang format
Co-authored-by: Shucai Xiao <scxiao@prj47-rack-99.local.lan>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

90200619

15 May, 2020 1 commit

Add gelu optimization (#521) · 0079028a

kahmed10 authored May 15, 2020



* fix pad calc

* bert tf passes correctness

* formatting

* add test

* formatting

* remove comment

* add inline

* formatting

* fix order for literal

* formatting

* add test for gelu

* formatting

* added add_gelu fusion

* add files

* formatting

* remove layernorm opt

* revert reduce file

* add gelu_fn and tests

* formatting

* fix matcher, remove extra tests

* formatting

* fix matcher

* add used_once

* formatting

* start on new gelu

* formatting

* add matchers in fuse_ops

* formatting

* add dce to fix add_gelu

* add simplify_rsqrt and test

* formatting

* debugging value for matcher

* formatting

* add more to matchers

* formatting

* fix errors

* remove onnx gen

* add any_arg, change matchers to use either_arg

* formatting

* formatting

* add used_once

* formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

0079028a

04 May, 2020 1 commit

Add option -fhip-lambda-host-device for HIP-Clang (#518) · caee500e

Yaxun (Sam) Liu authored May 04, 2020

caee500e