Commits · e98aecb1e8dd9fcb67467a2e0ca3c3f6150501b8 · gaoqiong / MIGraphX

09 Nov, 2021 2 commits
- clang format · e98aecb1
  Shucai Xiao authored Nov 08, 2021
  
  e98aecb1
- fix the issue of converting standard shape to non-standard shape after inserting contiguous · 34e90169
  Shucai Xiao authored Nov 08, 2021
  
  34e90169
05 Nov, 2021 1 commit
- Update Docker to ROCm 4.5 and support Navi on Jenkins (#994) · 04e17804
  kahmed10 authored Nov 05, 2021
```
Move our Docker file from ROCm 4.3 to 4.5
Add Navi-based GPUs to the CI infrastructure
```
  04e17804
03 Nov, 2021 1 commit

Add tests for the DepthToSpace+Binary pointwise operations fusion (#987) · eb6abd27

Umang Yadav authored Nov 03, 2021

In migraphx, DepthToSpace (d2s) is implemented as reshape --> transpose --> contiguous --> reshape.

If there is a trailing binary pointwise operator after DepthToSpace, migraphx can move the binary operator before the contiguous and reshape of the DepthToSpace.

So, it becomes reshape --> transpose --> binary_op --> contiguous --> reshape.

An explicit contiguous is then not required, since binary_op outputs a standard shape. So, it becomes reshape --> transpose --> binary_op --> reshape.

simplify_reshapes already has a matcher that can do this transformation. This PR adds tests for cases like DepthToSpace + binary op.

solves #905

eb6abd27

28 Oct, 2021 3 commits

NonMaxSuppression op ref implementation (#968) · c98b22d8

Shucai Xiao authored Oct 28, 2021

This PR is the ref implementation of the nonmaxsuppression operator. It always returns the max possible output shape, which is the problem tracked in issue #948.

c98b22d8

DepthToSpace and pointwise unary operations fusion (#986) · cf0b6d6d

Umang Yadav authored Oct 28, 2021

In migraphx, DepthToSpace (d2s) is implemented as reshape --> transpose --> contiguous --> reshape.

This PR adds a matcher to find d2s + unary pointwise ops.

Applying the matcher moves the pointwise unary operation before the contiguous and reshape of the d2s, so it becomes
reshape --> transpose --> unary --> contiguous --> reshape.

The motivation is that a pointwise module would later be created out of unary --> contiguous --> reshape. Codegen for this pointwise module can write out the buffer such that an explicit contiguous and reshape would not be required.

This transformation is not guaranteed to improve performance, since the unary op will operate on a non-standard shape, so some tuning mechanism would be needed to make the decision.
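
As a rough illustration of why the unary op can move ahead of contiguous, here is a minimal pure-Python sketch. It assumes the ONNX DCR layout for DepthToSpace and models a tensor as a flat buffer plus shape and strides. The helpers `standard_strides`, `transpose`, `contiguous`, and `depth_to_space` are hypothetical names for this example, not MIGraphX APIs.

```python
from itertools import product


def standard_strides(shape):
    """Packed row-major strides, i.e. a "standard" shape."""
    strides, step = [], 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def transpose(shape, strides, perm):
    """Transpose only permutes shape and strides; the buffer is untouched."""
    return tuple(shape[p] for p in perm), tuple(strides[p] for p in perm)


def contiguous(buffer, shape, strides):
    """Copy a strided view into a new packed buffer."""
    return [
        buffer[sum(i * s for i, s in zip(index, strides))]
        for index in product(*(range(d) for d in shape))
    ]


def depth_to_space(buffer, n, c, h, w, block, unary_before=None):
    """reshape --> transpose --> [unary] --> contiguous --> reshape (DCR mode, assumed)."""
    shape = (n, block, block, c // (block * block), h, w)           # reshape
    strides = standard_strides(shape)
    shape, strides = transpose(shape, strides, (0, 3, 4, 1, 5, 2))  # transpose
    if unary_before is not None:
        # The op runs on the non-standard (transposed) view. The view covers
        # every element exactly once, so mapping over the buffer is equivalent.
        buffer = [unary_before(x) for x in buffer]
    packed = contiguous(buffer, shape, strides)                     # contiguous
    out_shape = (n, c // (block * block), h * block, w * block)     # reshape
    return packed, out_shape


def negate(x):
    return -x


data = list(range(16))  # a 1x4x2x2 tensor
after, shape_after = depth_to_space(data, 1, 4, 2, 2, 2)
after = [negate(x) for x in after]  # d2s first, then the unary op
before, shape_before = depth_to_space(data, 1, 4, 2, 2, 2, unary_before=negate)
```

Both orderings produce the same values and the same output shape. The sketch says nothing about which ordering is faster, which is the tuning question raised above.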

#905 pending PR for binary operations.

cf0b6d6d

Roialign gpu impl (#972) · 912c8d22

Shucai Xiao authored Oct 28, 2021

GPU implementation of the roialign operator, using the jit approach to reduce the lib size.

912c8d22

20 Oct, 2021 1 commit

Roialign (#952) · d7653732

Shucai Xiao authored Oct 20, 2021

Implementation of the roialign operator. For now, only the ref implementation exists. When a model runs on the GPU, execution of this operator falls back to the ref implementation.

d7653732

19 Oct, 2021 2 commits
- Link with pthreads in core migraphx library since we use threads there (#975) · 4d82d761
  Paul Fultz II authored Oct 19, 2021
```
pthread linking errors on SLES. 
```
  4d82d761
- Fusion of pointwise operators (#969) · 351007d4
  Paul Fultz II authored Oct 19, 2021
```
Adds a pass to fuse pointwise operators into one "pointwise" op that has a submodule which does the calculation.
```
  351007d4
18 Oct, 2021 2 commits

Allow constructing an operation with a format string (#976) · 77164f3c

Paul Fultz II authored Oct 18, 2021

Designed to allow a user to format the values needed for the json_string, e.g. `migraphx::operation("reduce_mean", "{axes : [%i, %i, %i, %i]}", axes[0], axes[1], axes[2], axes[3])`, instead of needing to use string concatenation or a stringstream.
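
The readability gain can be sketched outside C++ as well. The snippet below uses Python's printf-style `%` operator purely as a stand-in for the format-string constructor; it does not call MIGraphX, and the `axes` values are made up for illustration.

```python
axes = [2, 3, 4, 5]  # hypothetical values

# String concatenation: brackets and separators are assembled by hand.
concatenated = "{axes : [" + ", ".join(str(a) for a in axes) + "]}"

# printf-style formatting: the template reads like the final string.
formatted = "{axes : [%i, %i, %i, %i]}" % tuple(axes)
```

Both variables hold the same text; the second form keeps the shape of the json_string visible in one place.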

77164f3c

Remove redundant cast (#982) · a05113aa
Paul Fultz II authored Oct 18, 2021
```
Enable a cppcheck rule to catch these redundant casts in the future
```
a05113aa

15 Oct, 2021 1 commit

Enabling rocTX markers for migraphx-driver via roctx knob (#946) · 4a71ec8c

Cagri authored Oct 14, 2021



Added features:
This enables wrapping each migraphx operator with rocTX markers.
It adds a new knob, trace, to the migraphx-driver binary.

Limitation:

rocTX standalone does not output a file; it needs to be used with rocprof. Example command line:

/opt/rocm/bin/rocprof -i ./in.txt --hip-trace --roctx-trace --flush-rate 10ms --timestamp on -d cagri_out --obj-tracking on /opt/rocm/bin/migraphx-driver trace ./resnet50-v2-7.onnx --onnx --gpu
Co-authored-by: Shucai Xiao <shucai@gmail.com>

4a71ec8c

14 Oct, 2021 1 commit

SpaceToDepth operator (#979) · 6c02cd21

Umang Yadav authored Oct 14, 2021



Inverse of DepthToSpace op
Co-authored-by: Shucai Xiao <shucai@gmail.com>

6c02cd21

13 Oct, 2021 2 commits

Trace eval segfault (#974) · 337c5ba1

Shucai Xiao authored Oct 13, 2021

When running a model on the GPU, migraphx tries to print out content from GPU memory, which causes a segfault. The solution is to copy the GPU memory content back to the CPU before printing.

337c5ba1

Bump version for ABI change (#970) · a14a4e64
Paul Fultz II authored Oct 13, 2021

a14a4e64

08 Oct, 2021 2 commits

Nonzero op extension (#870) · 0879b5f1

Shucai Xiao authored Oct 08, 2021

This PR is for the nonzero operator with static output shape.
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

0879b5f1

Remove alpha and beta from `dot` and `quant_dot` (#961) · 21193e87

Umang Yadav authored Oct 08, 2021

Previously, the dot operator was defined as C = alpha * A . B + beta * C, where * is scalar multiplication and . is the dot product or matrix multiplication, depending on the dimensions of the inputs.

The aim is to define the dot operator as C = A . B, without alpha or beta.

To achieve the same effect as alpha and beta: (1) one of the inputs to the dot operator is multiplied by the alpha value; (2) if beta is present, C is multiplied by beta and then added to the output of step 1.
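
The equivalence of the two formulations can be checked with a small pure-Python sketch. The function names (`old_dot`, `rewritten_dot`, and the list-of-rows helpers) are hypothetical and only model the arithmetic; they are not the MIGraphX operators.

```python
def matmul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def scale(m, s):
    """Multiply every element of m by the scalar s."""
    return [[s * v for v in row] for row in m]


def add(m, n):
    """Elementwise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m, n)]


def old_dot(a, b, c, alpha, beta):
    """Previous definition: alpha * (A . B) + beta * C inside one operator."""
    return add(scale(matmul(a, b), alpha), scale(c, beta))


def rewritten_dot(a, b, c=None, alpha=1, beta=1):
    """New form: a plain A . B, with alpha and beta applied by separate ops."""
    out = matmul(scale(a, alpha), b)    # step 1: scale one input by alpha
    if c is not None:
        out = add(out, scale(c, beta))  # step 2: add beta * C
    return out


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]
old = old_dot(a, b, c, alpha=2, beta=3)
new = rewritten_dot(a, b, c, alpha=2, beta=3)
```

Integer inputs are used so the comparison is exact; with floating-point data the two forms may differ by rounding.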

21193e87

01 Oct, 2021 2 commits

Add multinomial op (#954) · 0b7672d7

turneram authored Oct 01, 2021

Add multinomial op to onnx parser with ref and GPU implementations.

The onnx parser inserts a literal of shape {batch_size, sample_size} with random values in the range [0, 1) and inserts existing ops to compute the cumulative density function. The multinomial operator multiplies the random values by the sum of the CDF and returns the index of the first element of the CDF that is greater than the result, representing samples randomly drawn from [0, class_size) that follow the log-probability distribution.
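
The sampling step described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the MIGraphX kernel: it takes "the sum of the CDF" to mean the final (total) entry of the cumulative sum, and the name `multinomial` and its arguments are chosen for this example.

```python
import math
from bisect import bisect_right
from itertools import accumulate


def multinomial(log_probs, uniforms):
    """Map uniform values in [0, 1) to class indices via the CDF."""
    # Cumulative sum of the (unnormalized) probabilities.
    cdf = list(accumulate(math.exp(lp) for lp in log_probs))
    total = cdf[-1]
    last = len(cdf) - 1
    # bisect_right returns the index of the first CDF entry greater than the
    # scaled value; the min() guards against rounding at the upper edge.
    return [min(bisect_right(cdf, u * total), last) for u in uniforms]


# Classes with weights 1, 1, 2, given as log-probabilities.
samples = multinomial([0.0, 0.0, math.log(2.0)], [0.1, 0.4, 0.9])
```

Scaling the random value by the total, rather than normalizing the CDF, avoids a separate division over the whole distribution.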

Resolves #821
Co-authored-by: Shucai Xiao <shucai@gmail.com>

0b7672d7

Add remaining random ops for Barracuda models (#963) · ccd08b4c

turneram authored Oct 01, 2021

Add RandomNormal, RandomNormalLike, RandomUniform, and RandomUniformLike to onnx parser and onnx tests

Each pair of Random*/Random*Like is implemented using a single op_parser because the ops share the same essential attributes and algorithm with the difference that Random*Like get the output type and/or shape from an input argument and Random* take both from attributes.

Resolves #907
Resolves #959

ccd08b4c

29 Sep, 2021 1 commit
- DepthToSpace Operator Implementation (#950) · 87b2fe35
  Cagri Eryilmaz authored Sep 29, 2021
```
Supports ONNX operator sets 1, 11, and 13
```
  87b2fe35
27 Sep, 2021 1 commit

Dpp opts for wavefront 32 (#951) · 6e2df9de

kahmed10 authored Sep 27, 2021

Checks wavefront size, then changes implementation and number of threads for DPP reduce

6e2df9de

23 Sep, 2021 1 commit
- Make `compile_options` an opaque object for ABI compatibility (#955) · 95431eb7
  Umang Yadav authored Sep 23, 2021
```
Add forward compatibility support for compile options 
```
  95431eb7
21 Sep, 2021 1 commit
- Add flag to bypass passes on modules (#949) · da26db34
  Paul Fultz II authored Sep 21, 2021
```
Needed to bypass passes when fusing pointwise operators into a module.
```
  da26db34
17 Sep, 2021 3 commits

Revert "Remove alpha and beta attributes from dot operator (#945)" (#957) · 985f58b0
Paul Fultz II authored Sep 17, 2021
```
This reverts commit 9e43cb8b.
```
985f58b0

Remove alpha and beta attributes from dot operator (#945) · 9e43cb8b

Umang Yadav authored Sep 17, 2021

This PR aims to remove the alpha and beta attributes from the dot operator completely.

Previously, the dot operator was defined as C = alpha * A . B + beta * C, where * is scalar multiplication and . is the dot product or matrix multiplication, depending on the dimensions of the inputs.

The aim is to define the dot operator as C = A . B, without alpha or beta.

9e43cb8b

Make `file_options` an opaque object for ABI compatibility (#953) · 31dc067e

Umang Yadav authored Sep 17, 2021



Make the file_options struct an opaque object for ABI compatibility. Uses make generate for auto-generation and includes modified tests.
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

31dc067e

16 Sep, 2021 1 commit

Loop operator (#853) · a275f590

Shucai Xiao authored Sep 16, 2021

Add Loop operator for opset version 13.
Notes: 1) The default max iteration number is 10 if no max iteration number is provided.
2) To change the max iteration number, a user can set max_loop_iterations in the onnx_option struct when parsing a model.
3) The returned shape of the scan output is derived from max_loop_iterations even if the actual number of loop iterations is less than that. This issue also applies to other operators like NonZero and NonMaxSuppression. Issue #948 was created to track this, to be resolved later.
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
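
The shape behavior in note 3 can be sketched as follows. This is a toy model, not the MIGraphX Loop implementation: `run_loop`, the body signature, and the zero padding of unused scan rows are all assumptions made for illustration.

```python
def run_loop(body, state, max_loop_iterations=10, pad=0):
    """Run body until it asks to stop or the iteration cap is reached."""
    # The scan output is sized up front from the cap, before any iteration
    # runs, so its length does not depend on how many iterations execute.
    scan = [pad] * max_loop_iterations
    iterations = 0
    keep_going = True
    while keep_going and iterations < max_loop_iterations:
        keep_going, state, value = body(iterations, state)
        scan[iterations] = value
        iterations += 1
    return state, scan, iterations


def accumulate_until_three(i, total):
    """Hypothetical loop body: add the counter, stop once the sum reaches 3."""
    total += i
    return total < 3, total, total


final, scan, ran = run_loop(accumulate_until_three, 0)
```

Here the loop stops after three iterations, yet the scan output still has ten entries, which is the mismatch tracked in issue #948.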

a275f590

10 Sep, 2021 2 commits
- Assert the shape for compute and compute_shape are the same (#936) · 8b4c69c5
  Paul Fultz II authored Sep 10, 2021
```
Assert shapes don't change
Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
```
  8b4c69c5
- Add ThresholdedRelu to onnx parser (#937) · 6b6e9362
  turneram authored Sep 10, 2021
```
Add ability to parse ThresholdedRelu ONNX operator.

Resolves #888
Co-authored-by: Shucai Xiao <shucai@gmail.com>
```
  6b6e9362
07 Sep, 2021 3 commits

qdq for quantization and include subgraph (#891) · b45f7239

Shucai Xiao authored Sep 07, 2021



Add operators, refactor parsers, add rewrite passes, add tests
Add ref implementations
Move broadcasting of scales and zero points to onnx parser
Allow for x and zero_point to have different types in quantizelinear; fix zero_point default type
fp16 and fp8 quantization to include subgraph and parameters
fix unit test to use qdq operators for int8 quantization
Co-authored-by: turneram <alturner@amd.com>

b45f7239

Improve check_context to handle submodules better (#927) · fdaa21ee
Paul Fultz II authored Sep 07, 2021
```
Improve check_context to handle submodules better
Co-authored-by: Shucai Xiao <shucai@gmail.com>
```
fdaa21ee
Allow creating modules in a module pass (#931) · ac0f79aa
Paul Fultz II authored Sep 07, 2021
```
* Add module pass manage
```
ac0f79aa

02 Sep, 2021 2 commits

Refactor where op (#918) · ebbaf8fc

turneram authored Sep 02, 2021

Implement the Where operator for the CPU and GPU.  This is for better performance.

ebbaf8fc

Topk op (#877) · 521b57a2

Shucai Xiao authored Sep 01, 2021



* add topk operator for ref, cpu and gpu
* Hash modules for quicker lookup of modules
* add onnx unit test
* add unit tests for the topk operator
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

521b57a2

01 Sep, 2021 2 commits

Add a command to the driver to list supported onnx operators (#938) · 1f741f73
Paul Fultz II authored Sep 01, 2021
```
* Add a command to list supported onnx operators
```
1f741f73

Adjust HIP_COMPILER_FLAGS to support <$:$<>:> and SHELL: tags (#933) · 33a17257

Chris Austen authored Sep 01, 2021



In ROCm 4.5.0 hip compile flags are coming in differently.  This has
caused some parsing issues for the HIP_COMPILER_FLAGS variable.  As an example

    ROCm 4.3.0: --offload-arch=gfx900
    ROCm 4.5.0: $<$<COMPILE_LANGUAGE:CXX>:SHELL:--offload-arch=gfx900>

Using existing code...
    $<$<COMPILE_LANGUAGE:CXX>:SHELL:--offload-arch=gfx900>
Becomes...
    $<$<COMPILE_LANGUAGE:CXX>:SHELL:

There are two problems with that.
  1) The "<" is not balanced with a ">" due to the regex consuming the ">"
  2) There is still a `SHELL:`  label.

This commit repairs both.  I took the regex parsing code from ROCmSoftwarePlatform/MIOpen/blame/develop/CMakeLists.txt
but improved it to support handling of target features like
$<$<COMPILE_LANGUAGE:CXX>:SHELL:--offload-arch=gfx900:xxx+>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
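
The intended result of the parsing can be sketched with a Python regular expression. This is not the CMake code from the commit; the pattern `GENEX` and the helper `unwrap_flag` are hypothetical and only show how the generator-expression wrapper and the `SHELL:` label can be removed together while keeping the brackets balanced.

```python
import re

# Matches $<$<COMPILE_LANGUAGE:CXX>:...> with an optional SHELL: label and
# captures only the flag itself, so no unbalanced "<" or ">" is left behind.
GENEX = re.compile(r"^\$<\$<COMPILE_LANGUAGE:CXX>:(?:SHELL:)?(.*)>$")


def unwrap_flag(flag):
    """Return the bare compiler flag, or the input unchanged if not wrapped."""
    match = GENEX.match(flag)
    return match.group(1) if match else flag


plain = unwrap_flag("--offload-arch=gfx900")
wrapped = unwrap_flag("$<$<COMPILE_LANGUAGE:CXX>:SHELL:--offload-arch=gfx900>")
featured = unwrap_flag("$<$<COMPILE_LANGUAGE:CXX>:SHELL:--offload-arch=gfx900:xxx+>")
```

Anchoring the pattern at both ends means the closing ">" is consumed as part of the wrapper rather than being left in, or stripped from, the flag.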

33a17257

31 Aug, 2021 3 commits

Enable constructing argument with tuple and buffer (#919) · b90d69ae

Paul Fultz II authored Aug 31, 2021



* Improve handling of constructing a tuple from a buffer
* Add unit test
* Remove unused function
Co-authored-by: Shucai Xiao <shucai@gmail.com>

b90d69ae

Changes to support both OneDNN and ZenDNN builds (#929) · 0859fe90

kahmed10 authored Aug 31, 2021



* Add preallocate method

* Add preallocate_param pass

* Preallocate buffers on the cpu

* Formatting

* Preallocate on the gpu

* Add missing cpp file

* Formatting

* Add lifetime function

* Formatting

* Improve handling of exceptions in test driver

* Formatting

* Auto print exception

* Formatting

* Fork each test case

* Formatting

* Exclude gcc 5 debug build

* Fix tidy issues

* Add color

* Formatting

* Create driver class

* Formatting

* Customize test_case names

* Formatting

* Report status from forked processes

* Formatting

* Update the verify driver

* Formatting

* Print out failed tests

* Formatting

* Fix tidy issues

* Formatting

* Expect passing

* Improve failure reporting on non-linux systems

* Fix ifdef

* Always allocate

* Fix tidy warning

* Flush code cov

* Formatting

* Fix tidy

* Add const

* Check if weak symbols is linked

* Formatting

* initial progress

* formatting

* Add continue flag

* Formatting

* Set exe name

* Use stringstream and use quotes

* rename vars

* formatting

* more testing

* formatting

* Fix bug when using --continue in the tests

* Formatting

* revert gemm

* revert dot file

* rename var

* update cmakelists and deconv compute
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

0859fe90

Fix debug assert (#930) · bd85a76c

Shucai Xiao authored Aug 31, 2021

* fix two asserts for debug build

* add unit test for copy parameters

* clang format

* add a unit test for reorder_dims

* change transpose to always require perm not be empty

* clang format

* remove an unnecessary line

* fix tidy error

* fix review comments

bd85a76c