Commits · e5242676aa7f1e246a3c71f10e21f9bd85feab3d · gaoqiong / MIGraphX

25 Feb, 2022 1 commit
- Add get_queue to context to get the current stream (#1097) · e5242676
  Paul Fultz II authored Feb 24, 2022
```
wrapped in an any_ptr class so the type can be checked at runtime for a mismatch.
```
  e5242676
24 Feb, 2022 1 commit

Some cmake fixes and updates (#1088) · cd0a4aa5

Paul Fultz II authored Feb 23, 2022

Make doc/CMakeLists.txt standalone
Switch to use rocm-cmake modules for document generation
Add CONFIGURE_DEPENDS to file(GLOB) so it will update without an explicit cmake run
Add STRINGS property for build type to make it easier to switch build types with ccmake
Various fixes and improvements

cd0a4aa5

23 Feb, 2022 1 commit

Keep std shape (#1059) · 98dfdf15

Shucai Xiao authored Feb 23, 2022

This PR resolves two problems in issue #999, i.e., non-standard-shape inputs to reshape and reduce_mean.
Three fixes:

Any operator that requires standard-shape input gets a contiguous inserted on each of its inputs.
In eliminate_contiguous, when computing whether a contiguous can be removed, we should use all the updated args, not just the one that is being checked.
In two optimizations in simplify_reshape, we remove contiguous from the reshaper name list, since eliminate_contiguous will remove the contiguous if it can be removed.
The solution is to add an attribute to each operator that requires standard input shape; the auto_contiguous pass then adds a contiguous to every input of such operators.

98dfdf15

16 Feb, 2022 2 commits
- Support nonstandard shapes for the UnSqueeze Op (#1071) · 4480eb79
  Umang Yadav authored Feb 16, 2022
```
Support nonstandard shapes like slice, broadcast and transpose for the unsqueeze op
```
  4480eb79
- Add assign_to method for C++ API (#1075) · ecb1545c
  kahmed10 authored Feb 16, 2022
  
  ecb1545c
11 Feb, 2022 1 commit
- Fix hang with CSE pass when using submodules (#1050) · 48585bad
  kahmed10 authored Feb 11, 2022
```
* add submodule test
* remove for loop
* simplify reshape test
```
  48585bad
09 Feb, 2022 2 commits
- Enable pointwise fusion by default (#1082) · c7419a9c
  Paul Fultz II authored Feb 09, 2022
```
There is now a MIGRAPHX_DISABLE_POINTWISE_FUSION env variable to disable it
```
  c7419a9c
- Support nonstandard shapes for the Squeeze Op (#1068) · e64b773f
  Umang Yadav authored Feb 09, 2022
```
Support slice, broadcast and transpose shapes for the squeeze op.
```
  e64b773f
08 Feb, 2022 1 commit
- File ext rename (#1078) · a30ec101
  Charlie Lin authored Feb 08, 2022
```
Changed MessagePack file extensions to mxr.
```
  a30ec101
31 Jan, 2022 1 commit
- Parse upsample (#1060) · 7e7ef0b8
  Shucai Xiao authored Jan 31, 2022
```
* use the parse_resize to parse the upsample operator
```
  7e7ef0b8
28 Jan, 2022 1 commit

Add Mean op ONNX parser (#1065) · b7218806

turneram authored Jan 28, 2022

* Add mean op onnx parser and unit tests
* Refactor parse_mean to use add_broadcastable_binary_op

b7218806

27 Jan, 2022 1 commit
- Remove Standard Shape requirement for ArgOps (#1042) · 332cb710
  Umang Yadav authored Jan 27, 2022
```
Allow non-standard shapes for the arg ops. Non-standard shapes include broadcast, slice and transpose.
```
  332cb710
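To make the standard versus non-standard distinction concrete, here is a minimal sketch. It assumes that "standard" means packed row-major strides; the helper names (`packed_strides`, `is_standard`, `transpose`) are hypothetical and are not MIGraphX APIs.

```python
def packed_strides(lens):
    """Row-major strides of a densely packed tensor with these dimensions."""
    strides = [1] * len(lens)
    for i in range(len(lens) - 2, -1, -1):
        strides[i] = strides[i + 1] * lens[i + 1]
    return strides


def is_standard(lens, strides):
    """Assumed definition: standard means the strides are the packed row-major ones."""
    return list(strides) == packed_strides(lens)


def transpose(lens, strides, perm):
    """Permute dimensions without moving data: only lens and strides are reordered."""
    return [lens[p] for p in perm], [strides[p] for p in perm]


lens, strides = [2, 3, 4], packed_strides([2, 3, 4])     # strides [12, 4, 1]
t_lens, t_strides = transpose(lens, strides, [0, 2, 1])  # lens [2, 4, 3], strides [12, 1, 4]
b_lens, b_strides = [2, 3, 4], [0, 0, 1]                 # length-4 vector broadcast to 2x3x4
s_lens, s_strides = [2, 3, 2], strides                   # slice that keeps the parent strides
```

Under this assumed definition, only the first shape is standard; the transposed, broadcast and sliced views all describe the same buffer with different index arithmetic.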
26 Jan, 2022 1 commit

Add HardSwish op ONNX parser (#1066) · 7477aeb8

turneram authored Jan 26, 2022

Add HardSwish to HardSigmoid parser

HardSwish formula is y = x * HardSigmoid<alpha=1/6, beta=0.5>(x)
HardSigmoid parser sets alpha to 1/6 and adds the mul instruction if op name is HardSwish

Resolves #1062
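
The formula above can be sketched in plain Python. This is a scalar illustration only, not the parser code; the HardSigmoid definition `max(0, min(1, alpha * x + beta))` and its default `alpha = 0.2` are taken from the ONNX operator definition rather than from this commit.

```python
def hard_sigmoid(x, alpha=0.2, beta=0.5):
    """ONNX HardSigmoid: max(0, min(1, alpha * x + beta))."""
    return max(0.0, min(1.0, alpha * x + beta))


def hard_swish(x):
    """HardSwish reuses HardSigmoid with alpha = 1/6, then multiplies by x."""
    return x * hard_sigmoid(x, alpha=1.0 / 6.0, beta=0.5)
```

For inputs at or above 3 the HardSigmoid term saturates at 1, so HardSwish returns x unchanged; at or below -3 it saturates at 0, so the result is 0.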

7477aeb8

21 Jan, 2022 3 commits
- GreaterOrEqual ONNX parser (#1044) · 60aa1c85
  turneram authored Jan 21, 2022
```
Add onnx parser for operator GreaterOrEqual
```
  60aa1c85
- SoftSign ONNX parser (#1046) · ebb15dd3
  turneram authored Jan 21, 2022
```
Add onnx parser and unit tests for Softsign
```
  ebb15dd3
- SoftPlus ONNX parser (#1045) · 4c90e9a3
  turneram authored Jan 20, 2022
```
* Add onnx parser and unit test
```
  4c90e9a3
20 Jan, 2022 1 commit
- Add env variable to dump tests to a file (#1041) · 51b4439f
  Paul Fultz II authored Jan 20, 2022
  
  51b4439f
17 Jan, 2022 1 commit
- Make clip a pointwise op (#1043) · b0ece214
  Paul Fultz II authored Jan 17, 2022
```
Make clip a pointwise op
```
  b0ece214
11 Jan, 2022 1 commit

HardSigmoid ONNX parser (#1040) · fc42d852

turneram authored Jan 11, 2022

Add HardSigmoid onnx parser and unit tests
Produces a mathematical equivalent of the ONNX operator through a combination of existing pointwise ops.
Resolves #1028

fc42d852

05 Jan, 2022 1 commit
- Fix time seed bug in random sequence ops (#1027) · 594f2802
  turneram authored Jan 05, 2022
```
Fix bug caused by casting time seed to float
```
  594f2802
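The commit message does not spell out the failure mode. One likely mechanism, shown here as an assumption, is that a single-precision float cannot represent a seconds-since-epoch value exactly, so nearby seeds collapse to the same number. The timestamp below is an arbitrary example value, and `to_float32` is a hypothetical helper.

```python
import struct


def to_float32(value):
    """Round a number to the nearest IEEE-754 single-precision value."""
    return struct.unpack("f", struct.pack("f", float(value)))[0]


seed_a = 1641340800   # example epoch timestamp in seconds (arbitrary)
seed_b = seed_a + 1   # one second later

# Near 1.6e9, adjacent float32 values are 128 apart, so both seeds round to the same value.
same = to_float32(seed_a) == to_float32(seed_b)
```

In this example `same` is true: two runs one second apart would get an identical seed after the cast.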
09 Dec, 2021 1 commit
- Fuse last instruction in fuse_pointwise (#1015) · e758d457
  Paul Fultz II authored Dec 09, 2021
```
Fuse last instruction in fuse_pointwise
This also fixes a bug with using an invalid iterator.
```
  e758d457
02 Dec, 2021 1 commit
- Fix pointwise compile error with half sqrt (#1010) · 7b3e58a0
  Paul Fultz II authored Dec 02, 2021
```
Fix pointwise compile error with half sqrt 
```
  7b3e58a0
25 Nov, 2021 1 commit

Non std shape auto contiguous (#1001) · 2d4dcc47

Shucai Xiao authored Nov 25, 2021

Resolves a problem in parsing the ssd-10 model.

The problem is that, after inserting contiguous in the auto_contiguous pass, the standard output shape of some operators becomes non-standard. Then, if the next operator requires a standard input shape, an exception is thrown.

For example, if we pass the following model:
Input (standard shape) -> transpose (transposed) -> softmax (transposed) -> transpose (standard) -> gather.
It works fine, and no contiguous is required.

In the auto_contiguous pass, a contiguous is inserted after the first transpose. Then we need to replace the first transpose with the contiguous and recompute all shapes. When it comes to the gather operator, its input is a transposed shape, and an exception is thrown.

The solution is in the recompute_shape() function. If it is called by the auto_contiguous pass and the shape of an instruction changes to a non-standard shape, we do not recompute the shapes of its outputs. The reason is that, since its output shape is non-standard, a contiguous op will be added after the instruction, which will recompute the shapes of later operators.

2d4dcc47

17 Nov, 2021 1 commit

Handle removing contiguous on operators that use modules (#1005) · 785307c3

Paul Fultz II authored Nov 17, 2021

Currently, eliminate_contiguous will never remove contiguous for operators that use module inputs due to the fact that it doesn't pass the module inputs to compute_shape.

- Update to pass the module inputs correctly to compute_shape
- Fix the overloads of compute_shape so that when passed an empty vector of module inputs it will call the overload without module inputs
- Add tests with contiguous and pointwise module function.
- Move add_pointwise function to a separate header to reuse across different tests

785307c3

15 Nov, 2021 1 commit

Update driver's perf report to account for batch size (#1000) · 19f65e7e

kahmed10 authored Nov 15, 2021

Currently we have the option of passing in --batch to the driver to change the batch size when the model has a dynamic dim value. We can use this flag to adjust the perf report's rate.

19f65e7e

11 Nov, 2021 1 commit

Conditionally enable pointwise fusion (#992) · 157935ff

Paul Fultz II authored Nov 10, 2021

This enables the pointwise fusions using the MIGRAPHX_ENABLE_POINTWISE_FUSION env variable. It's disabled by default since MIOpen fusions need to be refactored.

This also adds a compile_ops pass to compile the pointwise modules. All tests except test_gpu_fast_math pass with MIGRAPHX_ENABLE_POINTWISE_FUSION=1 set.

157935ff

10 Nov, 2021 1 commit

Turn on gemm unit tests (#997) · 38287064

Shucai Xiao authored Nov 10, 2021

This PR turns on a few gemm unit tests with int8 input data type. Before rocm4.4, the rocblas implementation required the matrix size to be no less than 4 for int8 input data. Because of this limitation, we had turned off a few gemm unit tests with int8 input data type.

This limitation was removed in rocm4.4, so after upgrading to rocm4.5 we can turn these unit tests back on. We also change the conv_bn_add unit test to add instructions to a module instead of a program.
Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>

38287064

05 Nov, 2021 1 commit
- Update Docker to ROCm 4.5 and support Navi on Jenkins (#994) · 04e17804
  kahmed10 authored Nov 05, 2021
```
Move our Dockerfile from ROCm 4.3 to 4.5
Add Navi-based GPUs to the CI infrastructure
```
  04e17804
03 Nov, 2021 1 commit

Add tests for the DepthToSpace+Binary pointwise operations fusion (#987) · eb6abd27

Umang Yadav authored Nov 03, 2021

In migraphx, DepthToSpace (d2s) is implemented as reshape --> transpose --> contiguous --> reshape.

If there is a trailing binary pointwise operator after DepthToSpace, migraphx can move the binary operator before the contiguous and reshape of the DepthToSpace.

So, it becomes reshape-->transpose-->binary_op-->contiguous-->reshape.

Explicit contiguous wouldn't be required since binary_op outputs standard shape. So, it becomes reshape-->transpose-->binary-->reshape.

simplify_reshapes already has a matcher that can do this transformation. This PR adds tests for cases like DepthToSpace + binary op.

solves #905

eb6abd27

28 Oct, 2021 3 commits

NonMaxSuppression op ref implementation (#968) · c98b22d8

Shucai Xiao authored Oct 28, 2021

This PR is the ref implementation of the nonmaxsuppression operator. It always returns the max possible output shape, which is the problem tracked in issue #948.

c98b22d8

DepthToSpace and pointwise unary operations fusion (#986) · cf0b6d6d

Umang Yadav authored Oct 28, 2021

In migraphx, DepthToSpace (d2s) is implemented as reshape --> transpose --> contiguous --> reshape.

This PR adds a matcher to find d2s + unary pointwise ops.

Application of the matcher moves the pointwise unary operation before the contiguous and reshape of the d2s.
So it becomes
reshape --> transpose --> unary --> contiguous --> reshape.

The motivation is that a pointwise module would later be created out of unary --> contiguous --> reshape. Codegen for this pointwise module can write out the buffer such that an explicit contiguous and reshape would not be required.

This transformation is not guaranteed to improve performance, since the unary op will operate on a non-standard shape, so some tuning mechanism would be needed to make the decision.
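
The reshape --> transpose --> contiguous --> reshape decomposition, and the fact that an elementwise unary op can be moved ahead of the data movement without changing the result, can be sketched on flat row-major lists. This is an illustration only: the permutation `[0, 3, 4, 1, 5, 2]` is the ONNX DCR-mode definition of DepthToSpace (an assumption about which mode is meant), and all helper names are hypothetical.

```python
from itertools import product


def row_major_strides(lens):
    """Packed row-major strides for the given dimensions."""
    strides = [1] * len(lens)
    for i in range(len(lens) - 2, -1, -1):
        strides[i] = strides[i + 1] * lens[i + 1]
    return strides


def transpose_contiguous(data, lens, perm):
    """Transpose followed by contiguous: gather elements into the new row-major order."""
    strides = row_major_strides(lens)
    out_lens = [lens[p] for p in perm]
    out = []
    for idx in product(*(range(n) for n in out_lens)):
        out.append(data[sum(i * strides[p] for i, p in zip(idx, perm))])
    return out, out_lens


def depth_to_space(data, lens, blocksize):
    """reshape -> transpose -> contiguous -> reshape on a flat NCHW buffer."""
    b, c, h, w = lens
    c_out = c // (blocksize * blocksize)
    # Reshape of packed data is free: only the dimensions change.
    lens6 = [b, blocksize, blocksize, c_out, h, w]
    out, _ = transpose_contiguous(data, lens6, [0, 3, 4, 1, 5, 2])
    return out, [b, c_out, h * blocksize, w * blocksize]


def relu(v):
    return max(v, 0)


x = [v - 3 for v in range(8)]  # flat data for shape [1, 4, 1, 2]
unary_first, out_lens = depth_to_space([relu(v) for v in x], [1, 4, 1, 2], 2)
unary_last = [relu(v) for v in depth_to_space(x, [1, 4, 1, 2], 2)[0]]
```

Both orders give the same values, which is what allows the matcher to reorder the ops. Whether the reordered form is faster is a separate question, as noted above.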

#905 pending PR for binary operations.

cf0b6d6d

Roialign gpu impl (#972) · 912c8d22

Shucai Xiao authored Oct 28, 2021

GPU implementation of the roialign operator, using the jit approach to reduce the lib size.

912c8d22

20 Oct, 2021 1 commit

Roialign (#952) · d7653732

Shucai Xiao authored Oct 20, 2021

Implementation of the roialign operator. For now, we have only the ref implementation. When we run a model on the GPU, execution falls back to the ref implementation.

d7653732

19 Oct, 2021 1 commit

Fusion of pointwise operators (#969) · 351007d4

Paul Fultz II authored Oct 19, 2021

Adds a pass to fuse pointwise operators into one "pointwise" op that has a submodule which does the calculation.

351007d4

18 Oct, 2021 2 commits

Allow constructing an operation with a format string (#976) · 77164f3c

Paul Fultz II authored Oct 18, 2021

Designed to allow a user to format the values needed for the json_string: migraphx::operation("reduce_mean", "{axes : [%i, %i, %i, %i]}", axes[0], axes[1], axes[2], axes[3]) instead of needing to use string concat or stringstream
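
As a rough analogy only (Python rather than the C++ API, with a hypothetical helper name), printf-style formatting fills the placeholders in one step, where concatenation needs one piece per value:

```python
def format_op_attributes(fmt, *args):
    """Fill printf-style placeholders in an attribute string (hypothetical helper)."""
    return fmt % args


axes = [0, 1, 2, 3]
formatted = format_op_attributes("{axes : [%i, %i, %i, %i]}", *axes)

# The equivalent string built by concatenation, for comparison.
concatenated = "{axes : [" + ", ".join(str(a) for a in axes) + "]}"
```

Both strings are identical; the format-string form keeps the attribute layout readable in one place.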

77164f3c

Remove redundant cast (#982) · a05113aa
Paul Fultz II authored Oct 18, 2021
```
Enable a cppcheck rule to catch these redundant casts in the future
```
a05113aa

15 Oct, 2021 1 commit

Enabling rocTX markers for migraphx-driver via roctx knob (#946) · 4a71ec8c

Cagri authored Oct 14, 2021



Added features:
This enables wrapping each migraphx operator with rocTX markers.
It adds a new knob, trace, to the migraphx-driver binary.

Limitation:

rocTX on its own does not output a file; it needs to be used with rocprof. Example command line:

/opt/rocm/bin/rocprof -i ./in.txt --hip-trace --roctx-trace --flush-rate 10ms --timestamp on -d cagri_out --obj-tracking on /opt/rocm/bin/migraphx-driver trace ./resnet50-v2-7.onnx --onnx --gpu
Co-authored-by: Shucai Xiao <shucai@gmail.com>

4a71ec8c

14 Oct, 2021 1 commit

SpaceToDepth operator (#979) · 6c02cd21

Umang Yadav authored Oct 14, 2021



Inverse of DepthToSpace op
Co-authored-by: Shucai Xiao <shucai@gmail.com>

6c02cd21

08 Oct, 2021 2 commits

Nonzero op extension (#870) · 0879b5f1

Shucai Xiao authored Oct 08, 2021

This PR is for the nonzero operator with static output shape.
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

0879b5f1

Remove alpha and beta from `dot` and `quant_dot` (#961) · 21193e87

Umang Yadav authored Oct 08, 2021

Previously, the dot operator was defined as C = alpha * A . B + beta * C, where * is scalar multiplication and . is the dot product or matrix multiplication, depending on the dimensions of the inputs.

The aim is to define the dot operator as C = A . B, without alpha or beta.

To achieve the same effect as alpha and beta: (1) multiply one of the inputs to the dot operator by alpha; (2) if beta is present, multiply C by beta and add it to the output of step 1.
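
A minimal sketch of that equivalence on small nested-list matrices. All helper names are hypothetical, and this illustrates the arithmetic rather than the MIGraphX implementation.

```python
def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def scale(m, s):
    """Multiply every element of m by the scalar s."""
    return [[s * v for v in row] for row in m]


def add(x, y):
    """Elementwise sum of two equally sized matrices."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def old_dot(a, b, c, alpha, beta):
    """Previous definition: alpha * (A . B) + beta * C in a single op."""
    return add(scale(matmul(a, b), alpha), scale(c, beta))


def decomposed_dot(a, b, c, alpha, beta):
    """New form: scale one input by alpha, take a plain dot, then add beta * C."""
    out = matmul(scale(a, alpha), b)   # step 1
    return add(out, scale(c, beta))    # step 2


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]
result = decomposed_dot(a, b, c, alpha=2, beta=3)
```

With integer inputs both forms agree exactly; with floating-point data the two orderings can differ by rounding.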

21193e87