Commits · rocm-5.3.1 · gaoqiong / MIGraphX

06 Oct, 2022 1 commit

Add compute_fp32 flag for quant_gemm tests (#1360) (#1409) · dea0a80f

Chris Austen authored Oct 05, 2022

test_gpu_pack_int8_args fails on gfx908 machines because it doesn't set the compute_fp32 flag correctly. This PR fixes the test so that it checks the device name and the rocBLAS version and sets the flag accordingly.

dea0a80f

16 Sep, 2022 1 commit

Accuracy update (#1374) · 00923759

Chris Austen authored Sep 16, 2022

* Fix softmax accuracy issues (#1342)
* Fix accuracy bug when vectorizing slices (#1364)

00923759

31 Aug, 2022 1 commit

Final performance improvements for release (#1369) · a85b183b

Chris Austen authored Aug 31, 2022

* Improvements to handling and add constant passed to dot operator (#1280)
* Improve horizontal fusion of contiguous (#1292)
* Add pass to rewrite gelu as fast gelu (#1299)
* Add jit layernorm fusion (#1301)

a85b183b

26 Aug, 2022 1 commit

53merge v2 (#1357) · 9a1ada1a

Chris Austen authored Aug 26, 2022



* Fix json strings in driver models (#1341)
* fix bug size_t -> std::size_t (#1350)
* Fix test suite compile in Ubuntu 22.04 (#1353)
* onnxruntime renamed master to main (#1336)
Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>
Co-authored-by: Charlie Lin <charlie.lin@amd.com>

9a1ada1a

09 Aug, 2022 2 commits

Allow license_stamper.py to be ran from any directory (#1332) · 5bf4dee6

Paul Fultz II authored Aug 09, 2022



* Allow license_stamper.py to be ran from any directory

* Format
Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>

5bf4dee6

Explicitly set rocblas_pointer_mode in examples (#1331) · b37322ae

Umang Yadav authored Aug 09, 2022



* fix rocblas pointer mode

* fix formatting

* formatting

* revert header change
Co-authored-by: umangyadav <umang.yadav@amd.com>

b37322ae

08 Aug, 2022 1 commit

Imply type of literal returned based on input protobuff for zero elem… (#1326) · bb0e04ce

Ted Themistokleous authored Aug 08, 2022

* Imply type of literal returned based on input protobuff for zero element constant values.

This avoids the default behavior, in which the ONNX parser assumes that every zero-element value is float. This way we still grab the relevant type information from the protobuf and won't fail our data type checks for if then/else blocks from ONNX.

* Revert "Imply type of literal returned based on input protobuff for zero element constant values."

This reverts commit 390bb853.

* Add  test case to parse in empty constant int64 proto buffer

I think the previous test case was aliasing an issue where we default to float but need to actually read in int64 instead of int32

* fixup! Add  test case to parse in empty constant int64 proto buffer

* Add test for non empty int64 scalar

Add one item in the np array to use for the constant we're parsing in.

* Draft partial fix

* Fix test failures from previous change to read in protobuf data types correctly for empty constants.

Previously we assumed such constants were empty and defaulted to float. Reading in the correct types broke some assumptions that code was making about empty literals.

* Fix formatting and naming

* Fix naming with var in constant_one_val_int64_test
Co-authored-by: charlie <charlie.lin@amd.com>
Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>

bb0e04ce

04 Aug, 2022 2 commits

Dynamic ref convolution op (#1224) · 67f77ac1

Charlie Lin authored Aug 04, 2022



* Dynamic shape handling in shape object

* rewrite empty lens multibroadcast test

* Shape class changes to handle dynamic
* More throw errors for functions that don't make sense for dynamic shape
* Print output changes
* Serialization changes

* Fixing serialization errors

* Remove const on dyn_dim copy getters

* Dynamic shape tests

* Fix serialize errors

* Add dyn_data struct to avoid ambiguous constructor

* Tidy fix: emplace_back() over for loop

* Tidy fix: use move

* Use std::initializer_list in constructor
Reverts the dyn_data struct change
Should get around the ambiguous braced initialization list error

* avoid typedef

* element_space, min,max,opt _lens change

* formatting

* Comments fix

* dynamic bytes() test

* Serialize and reflect changes

* formatting

* Test the dynamic lens functions

* progress

* Formatting

* Dynamic conv draft progress

* Add operator<< tests for coverage

* Coverage update

* Add to conv dynamic batch test

* Dynamic image size test

* Dynamic weight handling

* Dyn image shape test change, fix dyn weight cond

* Comment update

* Dynamic weights shape test and fix

* Use ternary operator

* Tidy fixes

* Handle dynamic graph input shapes in ONNX parser

* Formatting

* Handle dynamic shape for convolution

* formatting

* cppcheck fixes

* Add onnx test files

* Fix typo

* Disable auto_pad for dynamic input shape

* check_shapes object checks for allowing dynamic shapes

* Fix any_of

* Change to maintain const objectness

* Formatting

* Check shapes allow dynamic

* Refactor compute_shape() call into op.compute()
Allows for per operator differences with handling dynamic shape
Fix operation.hpp change to use the generator

* Comment fix

* Refactor normalize_attributes() calls to use max_lens()

* Comment addition

* Update other normalize_attributes() calls

* Change to using constructor and add tests

* Use const member function

* Add more dynamic shape support

* Add tests for error code coverage

* Fix opt shape bug and add shape tests

* capture all by ref

* Fix typo with img shape calculation

* Add more tests

* dynamic auto pad attempt
Linker error with pad_calc.cpp

* Fix parse dyn auto_pad
Should only need to use dynamic auto pad when the image shape or kernel
shape are dynamic. For a dynamic batch size, the auto pad calculation is
the same.

* Fix linking error

* Fix auto_pad bug
Fixed input tensor with auto_pad setting on

* auto_pad onnx tests

* Fix auto_pad calculation, evaluate in ref_conv
add ref_ops tests

* Add shape tests, fix bugs

* Refactor first two output dynamic len calculation

* Conv MLIR test update

* i64 MLIR test fix

* Fix MLIR test typo
Co-authored-by: Chris Austen <causten@users.noreply.github.com>
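
The auto_pad note in this commit (dynamic auto pad is only needed when the image or kernel shape is dynamic) is consistent with the ONNX SAME padding arithmetic, which depends on spatial size, kernel size, stride, and dilation but not on batch size. Below is a minimal sketch of that arithmetic for one spatial axis. The function name `same_upper_padding` is hypothetical, and this is not the code from pad_calc.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper: ONNX SAME_UPPER padding for a single spatial axis.
// Returns {pad_begin, pad_end}. Note that the batch size is not an input.
inline std::pair<std::int64_t, std::int64_t> same_upper_padding(std::int64_t in,
                                                                std::int64_t kernel,
                                                                std::int64_t stride,
                                                                std::int64_t dilation)
{
    assert(in > 0 && kernel > 0 && stride > 0 && dilation > 0);
    // SAME padding keeps the output length at ceil(in / stride)
    const std::int64_t out = (in + stride - 1) / stride;
    // Extent covered by a dilated kernel
    const std::int64_t eff_kernel = (kernel - 1) * dilation + 1;
    // Total padding needed so that `out` windows fit; never negative
    const std::int64_t total =
        std::max<std::int64_t>((out - 1) * stride + eff_kernel - in, 0);
    // SAME_UPPER places the odd extra element at the end
    const std::int64_t begin = total / 2;
    return {begin, total - begin};
}
```

Because only the per-axis spatial values appear in the formula, a model whose batch dimension alone is dynamic can still have its padding computed once at parse time.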

67f77ac1

Update README.md (#1327) · 7dcae037
Umang Yadav authored Aug 03, 2022

7dcae037

02 Aug, 2022 2 commits
- Improve help and error reporting in driver (#1258) · ebdddf58
  Paul Fultz II authored Aug 02, 2022
```
* Improve type printing in driver
* Improve error with incorrect order for command
* Add spell checking of arguments
* Add validations and required checking
* Add required arguments and groups
```
  ebdddf58
- Add support for tuning db access in mlir kernel (#1307) · e2106d08
  jungpark-mlir authored Aug 02, 2022
  
  e2106d08
30 Jul, 2022 1 commit

Add accuracy checker tool (#1315) · 16cb8377

kahmed10 authored Jul 30, 2022

Added an accuracy checker to the tools directory. It currently compares ONNX FP32 models against the ONNX Runtime CPU execution provider (ORT CPUEP).

16cb8377

29 Jul, 2022 1 commit

Avoid registering host buffer ptr multiple times during hip copies (#1245) · 7596f3f1

Umang Yadav authored Jul 29, 2022

Currently, when copying a host buffer to the device, the copy first registers/maps the host buffer pointer into the address space of the device.

If the host buffer was allocated by hipHostMalloc, it is already implicitly registered in the device's address space and does not need to be registered again. This PR adds a check for that case.

7596f3f1

27 Jul, 2022 2 commits

Add node name to debug output of PARSE_IF (#1318) · afdc3051

Ted Themistokleous authored Jul 27, 2022

Gives better clarity to which argument is throwing an error, especially in cases with nested IF statements in the network.

afdc3051

Fix literal type in the instance_norm parsing (#1317) · 4918d769

Umang Yadav authored Jul 27, 2022

The instancenorm parser always creates a literal of type float, which fails the type check when creating binary ops if the model is fp16.

4918d769

25 Jul, 2022 3 commits

Add onnx mod operator (#1302) · 77e80b8e

Ted Themistokleous authored Jul 25, 2022

* Add in changes for onnx Mod operator

Initial operator for mod implementation and test cases for integer and floating based types.

Floating-point types need fmod from the standard library. Thankfully, judging by the half.hpp implementation, half_float::half is specified to work with the existing std::fmod() call.

fmod_flag should mirror the ONNX fmod attribute. For now, using a floating-point type without the user setting it to true results in an exception.

Ref ticket #1283
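
The two behaviors selected by the flag can be sketched as follows. This is an illustration of the ONNX Mod semantics described above, not the MIGraphX operator itself. The names `mod_int` and `mod_float` are hypothetical, and the divide-by-zero check is an added assumption.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical sketch of integer Mod.
// fmod_flag == false: the result takes the sign of the divisor (ONNX fmod=0).
// fmod_flag == true:  the result takes the sign of the dividend, like C's fmod.
inline int mod_int(int x, int y, bool fmod_flag)
{
    if(y == 0)
        throw std::domain_error("mod by zero");
    int r = x % y; // C++ % truncates toward zero, so the sign follows the dividend
    if(!fmod_flag && r != 0 && ((r < 0) != (y < 0)))
        r += y; // shift the remainder so its sign follows the divisor
    return r;
}

// Hypothetical sketch of floating-point Mod: only valid with fmod_flag set,
// mirroring the exception mentioned in the commit message.
inline double mod_float(double x, double y, bool fmod_flag)
{
    if(!fmod_flag)
        throw std::invalid_argument("floating-point mod requires fmod=1");
    return std::fmod(x, y);
}
```

For integers the flag only matters when the operands have different signs: with -7 and 3, the divisor-sign convention and the fmod convention give different remainders.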

77e80b8e

Add fpga target (#1304) · 8a30d698

varunsh authored Jul 25, 2022

* Add is_supported to the target
* Add get_target_assignments
* Rename assignment to target_assignments
* Add ref target header to test
* Add fpga target
* Make context const in compute

8a30d698

Add performance testing yamls (#1313) · 637d1a7b
Chris Austen authored Jul 24, 2022
```
* Add performance check per commit
```
637d1a7b

22 Jul, 2022 1 commit
- Improve error reporting in the API (#1274) · c722117d
  Umang Yadav authored Jul 22, 2022
```
The C++ API was not printing the string of a thrown exception. This change fixes that.
```
  c722117d
21 Jul, 2022 2 commits
- Change ownership to company email (#1310) · 6e6cb994
  Chris Austen authored Jul 21, 2022
```
Remove specific person name from deb created packages and move toward a general maintainer id/email
```
  6e6cb994
- Dynamic check_shapes (#1295) · c51cf531
  Charlie Lin authored Jul 21, 2022
```
Dynamic shape handling in shape object
```
  c51cf531
19 Jul, 2022 3 commits

Fix TF parsing for creating literals and Fix name lookups for input params (#1298) · 4d59b7c7

Umang Yadav authored Jul 19, 2022

Bug 1: create_literal used back_inserter to copy into a vector that was already allocated at full size, which doubled the size of the literal.
Fix 1: Copy without back_inserter.
Bug 2: An input param to the model can come from an operation that has multiple outputs; in that case the name of the input param contains a ":", e.g. input_1:0.
Fix 2: Look for the ":" and take the substring.
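
Both bugs are easy to reproduce in isolation. The sketch below is not the parser code; the function names are hypothetical, and the assumption that the wanted substring is the part before the ":" is mine.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Bug 1: the destination is already sized, so back_inserter appends
// after the existing elements and the result has twice the length.
inline std::vector<float> copy_with_back_inserter(const std::vector<float>& src)
{
    std::vector<float> dst(src.size());
    std::copy(src.begin(), src.end(), std::back_inserter(dst));
    return dst;
}

// Fix 1: write into the elements that were already allocated.
inline std::vector<float> copy_in_place(const std::vector<float>& src)
{
    std::vector<float> dst(src.size());
    std::copy(src.begin(), src.end(), dst.begin());
    return dst;
}

// Fix 2 (assumed form): drop the ":<index>" suffix from a name such as "input_1:0".
// find() returns npos when there is no ':', and substr(0, npos) keeps the whole string.
inline std::string strip_output_index(const std::string& name)
{
    return name.substr(0, name.find(':'));
}
```

An alternative fix for the first bug is to keep back_inserter and call reserve() instead of sizing the vector up front.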

4d59b7c7

Dynamic dimension input onnx parser (#1249) · 5a87fcbd

Charlie Lin authored Jul 19, 2022

Depends on #1199

Adds ONNX parser functionality for dynamic input shapes.
Uses options parameter in parse_onnx()

5a87fcbd

Fix op includes (#1308) · 39b307b2

Charlie Lin authored Jul 19, 2022

Changes to operator includes:

* Removed some includes that were not used
* Included argument.hpp where clang-tidy wanted it

39b307b2

15 Jul, 2022 1 commit

Fix test case for min & max operators (#1305) · dfc91e2c

Ted Themistokleous authored Jul 15, 2022

Fix min_test.onnx generation and add a proper check of the parsed program against the expected program.
This fixes test coverage for the min case.

dfc91e2c

12 Jul, 2022 5 commits
- Reduce header inclusion in op headers (#1271) · ba1b7850
  Paul Fultz II authored Jul 12, 2022
```
Reduce header inclusion in op headers
```
  ba1b7850
- Add tests for C API (#1266) · a7a32a9e
  Paul Fultz II authored Jul 12, 2022
```
This will ensure that migraphx.h can be included from a C compiler, and check that the C API can be called. This includes stdbool.h which is needed when using bool from C.
```
  a7a32a9e
- create the dev package (#1293) · 76022598
  Chris Austen authored Jul 12, 2022
```
Enable the migraphx-dev package when using make|rbuild package
```
  76022598
- change to a cached github repo for blaze prereq (#1291) · fefbe99d
  Chris Austen authored Jul 12, 2022
```
Bitbucket needs a port that some servers do not make available. Move the Blaze dependency from a Bitbucket to a GitHub source repo.
```
  fefbe99d
- Use current device when constructng context (#1294) · 68189043
  Paul Fultz II authored Jul 11, 2022
  
  68189043
11 Jul, 2022 2 commits
- Add __restrict__ to jit kernel params (#1300) · 2781ccd8
  turneram authored Jul 11, 2022
  
  2781ccd8
- Improve kernel code generation (#1285) · 2bbb50c4
  Paul Fultz II authored Jul 11, 2022
```
* Only run __syncthreads when there is data to preload
* Improve loops
* Add const attribute to improve optimizations
```
  2bbb50c4
08 Jul, 2022 4 commits

Update perf report to show the number of operators and per operator avg time in summary (#1287) · 05b13c9f

Paul Fultz II authored Jul 08, 2022

Show the number of operators and per operator avg time in summary...

Summary:
gpu::gemm: 8.738ms / 73 = 0.119699ms, 64%
gpu::triadd_layernorm: 0.831381ms / 24 = 0.0346409ms, 7%
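
Each summary line is the operator's total time divided by its instance count, followed by its share of the whole run. The sketch below reproduces that arithmetic. The function name and parameters are hypothetical, and rounding the percentage to the nearest integer is an assumption, since the commit does not say how it is computed.

```cpp
#include <cmath>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical formatter for one summary line:
//   <op>: <total>ms / <count> = <average>ms, <percent>%
// Precondition: count > 0 and program_total_ms > 0.
inline std::string summary_line(const std::string& op,
                                double op_total_ms,
                                std::size_t count,
                                double program_total_ms)
{
    const double avg = op_total_ms / static_cast<double>(count);
    // Assumption: the share is rounded to the nearest whole percent
    const long percent = std::lround(100.0 * op_total_ms / program_total_ms);
    std::ostringstream os; // default precision: 6 significant digits
    os << op << ": " << op_total_ms << "ms / " << count << " = " << avg << "ms, "
       << percent << "%";
    return os.str();
}
```

With the default stream precision, 8.738 / 73 is printed as 0.119699, which matches the gpu::gemm line above.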

05b13c9f

Add env var to enable debug symbols for gpu kernels (#1284) · adbafc06
Paul Fultz II authored Jul 08, 2022
```
Improve the assembly dump to track where certain instructions come from.
```
adbafc06

Add is_supported and get_target_assignments (#1269) · 8192f37f

varunsh authored Jul 07, 2022

Added is_supported and get_target_assignments methods to the target and program, respectively, to eventually support multi-target compilation and execution.

8192f37f

Dyn shape update (#1199) · 1c0b2a4a
Charlie Lin authored Jul 07, 2022
```
Initial sketch for changes to shape to handle dynamic dimensions
```
1c0b2a4a

07 Jul, 2022 1 commit

Add a step to unsqeeze axis (#1242) · bd503d89

Paul Fultz II authored Jul 07, 2022

Instead of always unsqueezing to an axis of size 1, a step can be set. So instead of unsqueezing {3, 12} to {3, 1, 12}, a step of 2 will unsqueeze to {3, 2, 6}.
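
The shape arithmetic in that example can be sketched as below. This is an illustration on a plain vector of dimensions, not the MIGraphX operator. The function name is hypothetical, and requiring the step to divide the split dimension is an assumption.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: insert a new axis of size `step` at position `axis`
// and shrink the dimension that follows it by the same factor.
// step == 1 is an ordinary unsqueeze.
inline std::vector<std::size_t>
unsqueeze_with_step(std::vector<std::size_t> lens, std::size_t axis, std::size_t step = 1)
{
    if(axis > lens.size() || step == 0)
        throw std::invalid_argument("bad axis or step");
    if(step > 1)
    {
        // Assumption: the split dimension must exist and be divisible by step
        if(axis == lens.size() || lens[axis] % step != 0)
            throw std::invalid_argument("step must divide the dimension being split");
        lens[axis] /= step;
    }
    lens.insert(lens.begin() + static_cast<std::ptrdiff_t>(axis), step);
    return lens;
}
```

Calling it with lens {3, 12}, axis 1, and step 2 splits the 12 into 2 x 6, giving the {3, 2, 6} shape from the description; the total element count is unchanged.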

bd503d89

06 Jul, 2022 1 commit

Verify load and save (#1265) · f2531606

Paul Fultz II authored Jul 05, 2022

In the verification tests, check that saving and reloading the program produces the same program. This also fixes serialization to always load instructions in the same order. There are also fixes for deconv and quant_conv, which didn't save the solution id and were broken for serialization.

f2531606

05 Jul, 2022 2 commits

Add jit softmax (#1243) · 8520e0b8
Paul Fultz II authored Jul 05, 2022
```
* Add softmax kernel
```
8520e0b8

Horizontally fuse contiguous operators (#1232) · 27e980c4

Paul Fultz II authored Jul 05, 2022

This reorders the transposes across slice to improve horizontal fusion for contiguous. This also improves eliminate_contiguous to remove contiguous better across splits.

27e980c4