Commits · d9578ba62ccbae472f39c19abe38272c33595029 · gaoqiong / MIGraphX

21 Sep, 2022 2 commits

Parameterize epsilon for layernorm kernel (#1367) · d9578ba6

kahmed10 authored Sep 21, 2022

This PR allows for other values of epsilon to be matched when finding layernorm. Similarly, the calculation now uses the variable for epsilon.
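
As a rough illustration of what a parameterized epsilon means for the calculation, here is a minimal Python sketch of layer normalization. It is not the MIGraphX kernel; the function name `layernorm` and the default epsilon value are assumptions chosen for the example.

```python
import math


def layernorm(xs, epsilon=1e-12):
    """Normalize xs to zero mean and (roughly) unit variance.

    epsilon is added to the variance before the square root so the
    division stays well defined when all inputs are equal. The default
    here is arbitrary and is not taken from MIGraphX.
    """
    n = len(xs)
    mean = sum(xs) / n
    variance = sum((x - mean) ** 2 for x in xs) / n
    scale = 1.0 / math.sqrt(variance + epsilon)
    return [(x - mean) * scale for x in xs]
```

Because epsilon is an argument rather than a hard-coded constant, models exported with different epsilon values can share the same code path.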

d9578ba6

Multibroadcast find_mul_conv (#1384) · 9a70050b

Charlie Lin authored Sep 21, 2022

Change find_mul_conv to work with multibroadcast also. Checks the strides instead of the broadcast axis.
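
The idea of checking strides rather than a broadcast axis can be sketched as follows. This is a hypothetical Python helper, not code from MIGraphX: it assumes a broadcast dimension is represented by a stride of 0 and that the channel axis is axis 1.

```python
def is_per_channel_broadcast(strides, channel_axis=1):
    """Return True when only the channel axis has a non-zero stride.

    A stride of 0 means the dimension is broadcast, so this holds for a
    per-channel vector broadcast across every other dimension, whichever
    operator produced the broadcast.
    """
    return all((s != 0) == (i == channel_axis) for i, s in enumerate(strides))
```

A check like this does not need an explicit axis attribute, which is why it can apply to both kinds of broadcast.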

9a70050b

19 Sep, 2022 2 commits

Improve layernorm and reductions performance (#1348) · 97a1ed2d

Paul Fultz II authored Sep 19, 2022

Compute mean and variance in same reduction
Set block size to numbers divisible by 32 instead of powers of 2
Global is also set exactly instead of being divisible by block size
More exact matching of global/local can help get rid of branching/loops
Reduce vectors first before doing dpp_reduce
Explicitly vectorize array operators since the compiler doesn't always vectorize them
Still uses the old for loop when computing at compile time, since neither reinterpret_cast nor all of the vector types are supported there
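
One common way to compute mean and variance in the same reduction is to accumulate the sum and the sum of squares together. The Python sketch below shows that idea only; it is an assumption about the general technique, not the actual GPU kernel, and the name `mean_and_variance` is made up for the example.

```python
def mean_and_variance(xs):
    """Compute mean and population variance in a single pass over xs."""
    n = len(xs)
    total = 0.0
    total_sq = 0.0
    for x in xs:
        # Both accumulators are updated in the same loop iteration,
        # so the data is only read once.
        total += x
        total_sq += x * x
    mean = total / n
    # Var[x] = E[x^2] - (E[x])^2
    variance = total_sq / n - mean * mean
    return mean, variance
```

The trade-off is that the `E[x^2] - (E[x])^2` form can lose precision when the mean is large relative to the spread of the data.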

97a1ed2d

Disabled concurrency, queue added to perf-test.yml (#1386) · 34c08db7
Chris Austen authored Sep 19, 2022

34c08db7

16 Sep, 2022 2 commits
- Fix typo for add_sigmoid (#1385) · 10f37f49
  Umang Yadav authored Sep 16, 2022
```
* fix typo for add_sigmoid
```
  10f37f49
- Update deprecated Pybind constructor (#1382) · 255fb11a
  Umang Yadav authored Sep 16, 2022
```
* remove deprecated constructor
```
  255fb11a
15 Sep, 2022 1 commit

[mlir] Replaced `find_library` with `find_package` to locate MLIR static library (#1373) · e1e36cdc

Lixun Zhang authored Sep 15, 2022

* Replaced `find_library` with `find_package` to locate MLIR static library
* Unified the include dir for headers and remove backward compatibility
* Embedded the external/include dir into the exported library

e1e36cdc

14 Sep, 2022 4 commits
- Reduce problem size of unbatched_gemm tests (#1383) · 333860ce
  turneram authored Sep 14, 2022
```
The verify tests from pr #1354 were still causing some codecov timeouts after merge. This PR further reduces the problem sizes to avoid these failures.
```
  333860ce
- Fix split_reshape for slice len of 1 (#1379) · 4b76dd0d
  Umang Yadav authored Sep 14, 2022
```
* fix slice_dim1 for case
```
  4b76dd0d
- Implement concat using jit compilation (#1356) · 7662d9c0
  Paul Fultz II authored Sep 14, 2022
```
* Implement concat using jit compilation
```
  7662d9c0
- expose underlying migraphx::argument data pointer in pybind (#1376) · 827baeec
  shivadbhavsar authored Sep 13, 2022
```
Expose the underlying data pointer for migraphx.argument
Update python api documentation
```
  827baeec
13 Sep, 2022 1 commit

Use rocblas_gemm_ex for batched gemms with broadcasted B (#1354) · a10a8ef1

turneram authored Sep 13, 2022

Improves performance for four of the six GEMMs used by Hugging Face BERT models with batch_size>1 by using a non-batched rocBLAS call for GEMMs where the B input has a broadcasted batch dimension.
The four verify tests added reflect the actual configurations used by bert-base-cased, with varied batch sizes.

Also adds a matcher to simplify_reshapes to move multibroadcasts after concats.
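
Why a non-batched call is enough when B is broadcast across the batch can be shown with a small pure-Python sketch: stacking the A batches along the row dimension and multiplying once gives the same result as one multiplication per batch. The helper names are hypothetical and nothing here calls rocBLAS.

```python
def matmul(a, b):
    """Plain 2-D matrix product on nested lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def batched_gemm_broadcast_b(a_batches, b):
    """One GEMM per batch entry, reusing the same B each time."""
    return [matmul(a, b) for a in a_batches]


def single_gemm_broadcast_b(a_batches, b):
    """Stack the batches along the rows, run one GEMM, then split again."""
    m = len(a_batches[0])
    stacked = [row for a in a_batches for row in a]
    flat = matmul(stacked, b)
    return [flat[i:i + m] for i in range(0, len(flat), m)]
```

Both functions return the same nested list for any batch of equally shaped A matrices, since each output row depends only on one row of A and on B.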

a10a8ef1

09 Sep, 2022 1 commit
- Bump version to 2.4 (#1375) · d78bcdfb
  Chris Austen authored Sep 09, 2022
```
migraphx version is now 2.4
```
  d78bcdfb
08 Sep, 2022 2 commits
- Remove unused headers (#1363) · ed2c73ac
  Paul Fultz II authored Sep 08, 2022
```
* Remove unused headers
```
  ed2c73ac
- Fix TF literal parsing for relu6 (#1370) · f2667056
  Charlie Lin authored Sep 08, 2022
```
Fixes TF literal parsing for relu6. Previously the parser always created a float literal, which breaks for other input types such as float16.
```
  f2667056
07 Sep, 2022 1 commit
- Fix accuracy bug when vectorizing slices (#1364) · 60aa0e48
  Paul Fultz II authored Sep 06, 2022
```
* Fix accuracy bug when vectorizing slices
```
  60aa0e48
06 Sep, 2022 1 commit
- Enable cppcheck rule for 'not', 'or' keywords (#1361) · d37a4df9
  Paul Fultz II authored Sep 06, 2022
```
Using not and or improves readability. The cppcheck rule will help ensure we are doing it consistently.
```
  d37a4df9
31 Aug, 2022 1 commit

Add pass to rewrite gelu as fast gelu (#1299) · 794a4335

turneram authored Aug 31, 2022

Rewrite_gelu pass replaces the gelu formula of x * (1/2) * (1 + erf(x/sqrt(2))) with the sigmoid approximation of x * Sigmoid(x * 1.702)
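
The two formulas in this message can be compared directly. The following Python sketch only evaluates them with the standard library; the function names are chosen for the example and are not the names used inside MIGraphX.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gelu(x):
    """GELU via the error function: x * (1/2) * (1 + erf(x / sqrt(2)))."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def fast_gelu(x):
    """Sigmoid approximation of GELU: x * sigmoid(1.702 * x)."""
    return x * sigmoid(1.702 * x)
```

The approximation trades a small absolute error for a cheaper evaluation, so whether the rewrite is acceptable depends on the accuracy a model needs.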

794a4335

29 Aug, 2022 1 commit

Insert contiguous for reshape as necessary (#1351) · ed7973d1

Umang Yadav authored Aug 29, 2022

The reshape op requires a standard shape, but simplify_algebra inserts reshapes without checking for this requirement.

ed7973d1

27 Aug, 2022 2 commits

Show kernel time when using gpu-driver (#1289) · 349635ce
Paul Fultz II authored Aug 27, 2022
```
* Track kernel time
```
349635ce

Improvements to handling and add constant passed to dot operator (#1280) · 8752875a

Paul Fultz II authored Aug 26, 2022

This will rewrite dot operators like X(Y + b) to XY + Xb when b is constant as we can fold the add away.
This improves the handling of pointwise operators with broadcasts, which helps improve constant propagation.
Improve gemm fusion with a mul_add
Improve support for broadcast shapes in gemm
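
The rewrite of X(Y + b) to XY + Xb relies on matrix multiplication distributing over addition. A small Python sketch of that identity follows; the helper names are hypothetical, and it assumes Xb can be computed ahead of time (for instance when X is also known at compile time).

```python
def matmul(a, b):
    """Plain 2-D matrix product on nested lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def add(a, b):
    """Elementwise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def dot_of_sum(x, y, b):
    """Original form: X(Y + b)."""
    return matmul(x, add(y, b))


def sum_of_dots(x, y, xb):
    """Rewritten form: XY + Xb, where xb = X b was computed in advance."""
    return add(matmul(x, y), xb)
```

With `xb = matmul(x, b)` precomputed, the rewritten form leaves a single add after the dot, which is the kind of pattern a gemm fusion can absorb.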

8752875a

26 Aug, 2022 1 commit
- Fix test suite compile in Ubuntu 22.04 (#1353) · af7f22d8
  Charlie Lin authored Aug 26, 2022
  
  af7f22d8
24 Aug, 2022 1 commit
- fix bug size_t -> std::size_t (#1350) · 1704bb04
  Charlie Lin authored Aug 24, 2022
```
Add a missing std:: qualifier to size_t
```
  1704bb04
23 Aug, 2022 1 commit

Dynamic ref NMS (#1288) · fa3c21fa

Charlie Lin authored Aug 23, 2022

Has NMS op output a dynamic shape (ONNX spec behavior)
Allows for dynamic input shape to NMS op

fa3c21fa

21 Aug, 2022 1 commit

Update is_supported (#1334) · 79e15ca9

varunsh authored Aug 21, 2022

* Update is_supported
* Return object from is_supported
* Return by reference in iterator

79e15ca9

19 Aug, 2022 3 commits
- Enable tidy for fpga backend (#1347) · b691abdd
  Paul Fultz II authored Aug 19, 2022
  
  b691abdd
- Remove print (#1345) · 3c133f81
  Charlie Lin authored Aug 19, 2022
```
remove print from source
```
  3c133f81
- Fix json strings in driver models (#1341) · ac507c64
  kahmed10 authored Aug 19, 2022
```
* fix json strings in driver models
```
  ac507c64
18 Aug, 2022 1 commit

pybind updates for torch_migraphx library (#1323) · 8045f7c8

shivadbhavsar authored Aug 18, 2022

Add function argument_from_pointer to allow directly passing a migraphx.shape object and a memory address.
Expose the is_compiled() method from migraphx::program.
Expose the enum types under migraphx::op.

8045f7c8

17 Aug, 2022 3 commits
- run performance benchmarks on types (#1343) · 7c8f2690
  Chris Austen authored Aug 17, 2022
  
  7c8f2690
- Add jit layernorm fusion (#1301) · 1784584e
  Paul Fultz II authored Aug 16, 2022
  
  1784584e
- Improve horizontal fusion of contiguous (#1292) · 18e4a2c6
  Paul Fultz II authored Aug 16, 2022
```
* Horizontally fuse contiguous
```
  18e4a2c6
16 Aug, 2022 2 commits
- Fix softmax accuracy issues (#1342) · 0e17a724
  Paul Fultz II authored Aug 16, 2022
  
  0e17a724
- formatting (#1339) · cb53687e
  Umang Yadav authored Aug 16, 2022
```
Removes unnecessary semi-colon after call to MACRO
```
  cb53687e
12 Aug, 2022 2 commits

Remove prints (#1338) · bab9502a
Charlie Lin authored Aug 12, 2022

bab9502a

Enable switching to bare pointer ABI for MLIR (#1333) · 55cb7d3a

Krzysztof Drewniak authored Aug 11, 2022

Once
https://github.com/ROCmSoftwarePlatform/llvm-project-mlir/pull/690
lands, the ABI for MLIR-generated kernels will change. This commit
prepares MIGraphX for the change by conditionally selecting the new
ABI if MLIR reports a sufficiently high API version in its headers.

55cb7d3a

11 Aug, 2022 1 commit
- onnxruntime renamed master to main (#1336) · 7ecb2de4
  Chris Austen authored Aug 11, 2022
```
Change Dockerfile to use main instead of master for ORT operations
```
  7ecb2de4
09 Aug, 2022 2 commits

Allow license_stamper.py to be run from any directory (#1332) · 5bf4dee6

Paul Fultz II authored Aug 09, 2022



* Allow license_stamper.py to be run from any directory

* Format
Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>

5bf4dee6

Explicitly set rocblas_pointer_mode in examples (#1331) · b37322ae

Umang Yadav authored Aug 09, 2022



* fix rocblas pointer mode

* fix formatting

* formatting

* revert header change
Co-authored-by: umangyadav <umang.yadav@amd.com>

b37322ae

08 Aug, 2022 1 commit

Imply type of literal returned based on input protobuff for zero elem… (#1326) · bb0e04ce

Ted Themistokleous authored Aug 08, 2022

* Imply type of literal returned based on input protobuff for zero element constant values.

This saves us from the default behavior, in which the ONNX parsing assumes that every zero element value is float. This way we still take the relevant type information from the protobuf and won't fail our data type checks for if then/else blocks from ONNX.

* Revert "Imply type of literal returned based on input protobuff for zero element constant values."

This reverts commit 390bb853.

* Add  test case to parse in empty constant int64 proto buffer

I think the previous test case was aliasing an issue where we default to float but need to actually read in int64 instead of int32

* fixup! Add  test case to parse in empty constant int64 proto buffer

* Add test for non empty int64 scalar

Add one item in the np array to use for the constant we're parsing in.

* Draft partial fix

* Fix test failures from previous change to read in protobuf data types correctly for empty constants.

Instead of assuming things are empty and thus we default to float, reading in the correct types broke some assumptions code was using for an empty literal.

* Fix formatting and naming

* Fix naming with var in constant_one_val_int64_test
Co-authored-by: charlie <charlie.lin@amd.com>
Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>

bb0e04ce