Commits · 6e6cb9947c3cb5ea88daf02a4508a3506a50dc7d · gaoqiong / MIGraphX

21 Jul, 2022 2 commits
- Change ownership to company email (#1310) · 6e6cb994
  Chris Austen authored Jul 21, 2022
```
Remove specific person name from deb created packages and move toward a general maintainer id/email
```
  6e6cb994
- Dynamic check_shapes (#1295) · c51cf531
  Charlie Lin authored Jul 21, 2022
```
Dynamic shape handling in shape object
```
  c51cf531
19 Jul, 2022 3 commits

Fix TF parsing for creating literals and Fix name lookups for input params (#1298) · 4d59b7c7

Umang Yadav authored Jul 19, 2022

Bug 1: create_literal used back_inserter to copy into a vector that was already allocated at full size, which doubled the size of the literal.
Fix 1: Do not use back_inserter.
Bug 2: An input param to the model can come from an operation that has multiple outputs; in that case the name of the input param contains a ":", e.g. input_1:0.
Fix 2: Look for ":" and take the substring before it.

4d59b7c7
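
The two bugs described in #1298 can be illustrated with a minimal standalone sketch. This is not the code from the TF parser; the function names `copy_with_back_inserter`, `copy_in_place`, and `strip_output_index` are hypothetical and exist only for this illustration. It assumes the literal data is held in a `std::vector` and that the output suffix always follows the first `:` in the name.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Buggy pattern: the destination is already sized, so back_inserter appends
// after the existing elements and the result has twice the intended length.
std::vector<float> copy_with_back_inserter(const std::vector<float>& src)
{
    std::vector<float> dst(src.size());
    std::copy(src.begin(), src.end(), std::back_inserter(dst));
    return dst;
}

// Fixed pattern: write into the storage that was already allocated.
std::vector<float> copy_in_place(const std::vector<float>& src)
{
    std::vector<float> dst(src.size());
    std::copy(src.begin(), src.end(), dst.begin());
    // Nothing was appended, so the size still matches the source.
    assert(dst.size() == src.size());
    return dst;
}

// Hypothetical helper: drop the ":<index>" output suffix from a tensor name,
// e.g. "input_1:0" -> "input_1". Names without ':' are returned unchanged,
// because substr(0, npos) yields the whole string.
std::string strip_output_index(const std::string& name)
{
    return name.substr(0, name.find(':'));
}
```

Calling `copy_with_back_inserter` on a three-element vector yields six elements (three value-initialized zeros followed by the copied data), while `copy_in_place` yields exactly three.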

Dynamic dimension input onnx parser (#1249) · 5a87fcbd

Charlie Lin authored Jul 19, 2022

Depends on #1199

Adds ONNX parser functionality for dynamic input shapes.
Uses options parameter in parse_onnx()

5a87fcbd

Fix op includes (#1308) · 39b307b2

Charlie Lin authored Jul 19, 2022

Changes to operator includes:

removed some includes that were not used
included argument.hpp where clang-tidy wanted it

39b307b2

15 Jul, 2022 1 commit

Fix test case for min & max operators (#1305) · dfc91e2c

Ted Themistokleous authored Jul 15, 2022

Fix min_test.onnx generation and add a proper check of the parsed program against the expected program.
This fixes test coverage for the min case.

dfc91e2c

12 Jul, 2022 5 commits
- Reduce header inclusion in op headers (#1271) · ba1b7850
  Paul Fultz II authored Jul 12, 2022
```
Reduce header inclusion in op headers
```
  ba1b7850
- Add tests for C API (#1266) · a7a32a9e
  Paul Fultz II authored Jul 12, 2022
```
This will ensure that migraphx.h can be included from a C compiler, and check that the C API can be called. This includes stdbool.h which is needed when using bool from C.
```
  a7a32a9e
- create the dev package (#1293) · 76022598
  Chris Austen authored Jul 12, 2022
```
Enable the migraphx-dev package when using make|rbuild package
```
  76022598
- change to a cached github repo for blaze prereq (#1291) · fefbe99d
  Chris Austen authored Jul 12, 2022
```
Bitbucket needs a port that some servers do not make available. Move the Blaze dependency from a Bitbucket to a GitHub source repo.
```
  fefbe99d
- Use current device when constructing context (#1294) · 68189043
  Paul Fultz II authored Jul 11, 2022
  
  68189043
11 Jul, 2022 2 commits
- Add __restrict__ to jit kernel params (#1300) · 2781ccd8
  turneram authored Jul 11, 2022
  
  2781ccd8
- Improve kernel code generation (#1285) · 2bbb50c4
  Paul Fultz II authored Jul 11, 2022
```
* Only run __syncthreads when there is data to preload
* Improve loops
* Add const attribute to improve optimizations
```
  2bbb50c4
08 Jul, 2022 4 commits

Update perf report to show the number of operators and per operator avg time in summary (#1287) · 05b13c9f

Paul Fultz II authored Jul 08, 2022

Show the number of operators and per operator avg time in summary...

Summary:
gpu::gemm: 8.738ms / 73 = 0.119699ms, 64%
gpu::triadd_layernorm: 0.831381ms / 24 = 0.0346409ms, 7%

05b13c9f
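
The summary lines above follow the pattern total time / instruction count = average time, plus a share of the overall run. A minimal sketch of that arithmetic is shown below. It is not the perf report code from MIGraphX; the types `op_timing` and `op_summary` and the function `summarize` are hypothetical names introduced for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical record: one timed instruction from a profiling run.
struct op_timing
{
    std::string name;
    double ms;
};

// Hypothetical summary row: total time, instruction count, average, and share.
struct op_summary
{
    double total_ms   = 0.0;
    std::size_t count = 0;
    double avg_ms     = 0.0;
    double percent    = 0.0;
};

// Group timings by operator name, then derive the average and percentage.
std::map<std::string, op_summary> summarize(const std::vector<op_timing>& timings)
{
    std::map<std::string, op_summary> result;
    double overall_ms = 0.0;
    for(const auto& t : timings)
    {
        auto& s = result[t.name];
        s.total_ms += t.ms;
        s.count += 1;
        overall_ms += t.ms;
    }
    for(auto& entry : result)
    {
        auto& s = entry.second;
        // Every entry was created by at least one timing, so count is nonzero.
        assert(s.count > 0);
        s.avg_ms  = s.total_ms / static_cast<double>(s.count);
        s.percent = overall_ms > 0.0 ? 100.0 * s.total_ms / overall_ms : 0.0;
    }
    return result;
}
```

With the numbers from the commit message, 8.738 ms over 73 `gpu::gemm` instructions gives the reported average of about 0.1197 ms per instruction.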

Add env var to enable debug symbols for gpu kernels (#1284) · adbafc06
Paul Fultz II authored Jul 08, 2022
```
Improve the assembly dump to track where certain instructions come from.
```
adbafc06

Add is_supported and get_target_assignments (#1269) · 8192f37f

varunsh authored Jul 07, 2022

Added is_supported and get_target_assignments methods to the target and program, respectively, to eventually support multi-target compilation and execution.

8192f37f

Dyn shape update (#1199) · 1c0b2a4a
Charlie Lin authored Jul 07, 2022
```
Initial sketch for changes to shape to handle dynamic dimensions
```
1c0b2a4a

07 Jul, 2022 1 commit

Add a step to unsqueeze axis (#1242) · bd503d89

Paul Fultz II authored Jul 07, 2022

Instead of always unsqueezing to an axis of size 1, a step can be set to use instead. For example, rather than unsqueezing {3, 12} to {3, 1, 12}, a step of 2 will unsqueeze to {3, 2, 6}.

bd503d89
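
The shape arithmetic in the example above can be sketched as follows. This covers only the dimension calculation for a single axis, inferred from the {3, 12} to {3, 2, 6} example in the commit message; it is not the operator's implementation, and the function name `unsqueeze_with_step` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of the shape arithmetic only. `axis` is the position of the new
// dimension in the output; `step` is the size given to that new dimension.
// step == 1 reproduces the ordinary unsqueeze.
std::vector<std::size_t> unsqueeze_with_step(const std::vector<std::size_t>& dims,
                                             std::size_t axis,
                                             std::size_t step = 1)
{
    assert(axis <= dims.size());
    assert(step >= 1);
    std::vector<std::size_t> out(dims);
    out.insert(out.begin() + static_cast<std::ptrdiff_t>(axis), step);
    if(step > 1)
    {
        // The dimension that follows the new axis is split evenly by `step`.
        assert(axis < dims.size());
        assert(dims[axis] % step == 0);
        out[axis + 1] = dims[axis] / step;
    }
    return out;
}
```

Under these assumptions, `unsqueeze_with_step({3, 12}, 1)` gives {3, 1, 12} and `unsqueeze_with_step({3, 12}, 1, 2)` gives {3, 2, 6}. The total element count is unchanged in both cases.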

06 Jul, 2022 1 commit

Verify load and save (#1265) · f2531606

Paul Fultz II authored Jul 05, 2022

In the verification tests, check that saving and reloading the program produces the same program. This also fixes serialization to always load instructions in the same order. There are also fixes for deconv and quant_conv, which didn't save the solution id and were broken for serialization.

f2531606

05 Jul, 2022 2 commits

Add jit softmax (#1243) · 8520e0b8
Paul Fultz II authored Jul 05, 2022
```
* Add softmax kernel
```
8520e0b8

Horizontally fuse contiguous operators (#1232) · 27e980c4

Paul Fultz II authored Jul 05, 2022

This reorders the transposes across slice to improve horizontal fusion for contiguous. This also improves eliminate_contiguous to remove contiguous better across splits.

27e980c4

03 Jul, 2022 1 commit

Add mlir fusion (#1251) · ca8a54fe

Paul Fultz II authored Jul 03, 2022

* Add mlir c api

* Formatting

* Create a type attribute

* Formatting

* Parse module

* Formatting

* Add mlir dump function

* Add test case

* Formatting

* Fix tidy issues

* Update mlir version

* Update to newer mlir

* Format

* Move mlir to the gpu and update the test

* Formatting

* Fix bug when appending module

* Format

* Remove old cmake flag

* Update message

* Add return

* Format

* Add mlir_compile

* Format

* Register dialect

* Handle unsigned integers

* Dont provide output for return instruction

* Format

* Add code to insert memrefs

* Format

* Add mlir verification

* Formatting

* Enable pointwise_fusion

* Disable eliminate_data_type

* Set kernel name

* Format

* Fix device name

* Formatting

* Fix output arg

* Format

* Updates

* Update hash

* Add fuse_mlir pass

* Format

* Add fuse mlir

* Format

* Update mlir

* Sort parameter names

* Format

* Reenable disabled passes

* Remove old mlir conv

* Remove asym default padding

* Add more verbose tracing

* Format

* Fix compilation errors

* Format

* Whitelist operators

* Format

* Add namespace

* Format

* Update triple

* Format

* Use func dialect

* Format

* Use func.return

* Format

* Upgrade mlir version

* Add comment

* Handle symmetrical padding

* Format

* Cleanup debug output

* Format

* List failed tests

* Move mlir compile to jit pipeline

* Format

* Update version

* Add source locations

* Format

* Correctly add module

* Format

* Update failed tests

* Fix failures when mlir is disabled

* Format

* Update mlir version

* Check type for fp32

* Format

* Remove failed test

* Update mlir in driver

* Tidy fixes

* Format

* Tidy fixes

* Format

* Fix const

* Remove from requirements

* Fix cmake version

* Fix tidy warning

* Use another ifdef

* Fix tidy

* Other tidy fix

* Format

* Update hash

* Add missing license files

* Format

* Format

* Fix function name

ca8a54fe

30 Jun, 2022 1 commit

Add method to insert multiple instructions (#1178) · 2783c649

Paul Fultz II authored Jun 29, 2022

This is an extension to insert_module_instructions, but instead of just inserting from a module, it can insert a range or a vector of instructions.

2783c649

29 Jun, 2022 4 commits
- Invalid parameter for yolov4 example (#1275) · 9f74dded
  Chris Austen authored Jun 29, 2022
```
Should be --fp16, not --fp16ref
```
  9f74dded
- NMS refactor, enable nonstandard shape (#1257) · ad73abbc
  Charlie Lin authored Jun 29, 2022
```
Allows PyTorch converted version of SSD-resnet34 to work
```
  ad73abbc
- Update driver models to use json strings (#1244) · ad27d0d6
  Paul Fultz II authored Jun 29, 2022
```
Compiles significantly faster than constructing all the objects. It also reduces recompiles.
```
  ad27d0d6
- Custom Op example using MIOpen calls (#1208) · 56440c4a
  Umang Yadav authored Jun 28, 2022
```
This PR only adds an example using MIOpen Calls.
```
  56440c4a
28 Jun, 2022 2 commits
- Custom Op example using rocBLAS calls (#1211) · e914254c
  Umang Yadav authored Jun 28, 2022
```
Add an example using rocBLAS Calls
```
  e914254c
- Custom Op example using HIP kernel (#1200) · cb18b0b5
  Umang Yadav authored Jun 28, 2022
```
This PR only adds an example using HIP kernel.
```
  cb18b0b5
26 Jun, 2022 1 commit
- Get parent module in the pass manager (#1181) · 3a5c4306
  Paul Fultz II authored Jun 26, 2022
```
* Add function to get a module tree
* Get parent module in the pass manager
```
  3a5c4306
25 Jun, 2022 2 commits
- bug fix: register the miopen_fusion op. (#1267) · 3b0a9116
  Brian Pickrell authored Jun 24, 2022
```
One-line fix to register the op miopen_fusion. This error was causing loading of compiled model files (*.mxr) to fail.
```
  3b0a9116
- Use jit for contiguous operator (#1217) · b75c83d8
  Paul Fultz II authored Jun 24, 2022
```
* Jit contiguous
```
  b75c83d8
24 Jun, 2022 2 commits

Adding in check_stamped.py to tools/ (#1255) · 8c35fa94

Ted Themistokleous authored Jun 24, 2022

Used to determine which files contain a license and are stamped. If any are not, we exit and return an error code that can later be ingested by another script, along with a list of the outstanding files in question.

The list of files that should or should not carry a license is currently baked in, as well as some entries to quickly ignore.

8c35fa94

Add compute_method for the experimental custom op (#1194) · edc7be5c

Umang Yadav authored Jun 24, 2022

Adds compute_method for the experimental custom ops.
Adds a test for the same using HIP APIs.
Depends on #1183
Solves #1101

edc7be5c

23 Jun, 2022 2 commits
- remove eliminate_workspace pass (#1254) · f5760e21
  kahmed10 authored Jun 23, 2022
```
* remove eliminate workspace
* remove sync device and other tags
```
  f5760e21
- Fix code block issue with .ipynb files. (#1263) · e95b875f
  Ted Themistokleous authored Jun 22, 2022
```
Regenerate notebook header for licensing
```
  e95b875f
22 Jun, 2022 1 commit
- Update license files (#1248) · e44cecbc
  Ted Themistokleous authored Jun 22, 2022
```
Updated each source file in the repo with the existing license.
```
  e44cecbc
20 Jun, 2022 1 commit
- Fixing misspelled macro to enable MIOpen hidden find mode API (#1250) · c0398ded
  Zhuoran Yin authored Jun 20, 2022
```
* Fixing misspelled macro
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
```
  c0398ded
17 Jun, 2022 2 commits

Update lowering of Dot operator (#1247) · c99be32c

Umang Yadav authored Jun 17, 2022



* remove code for allocation of C param in dot lowering

* formatting
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

c99be32c

Update tf_parser to have add_common_op() for parse_relu6 (#1241) · 421a5621

Ted Themistokleous authored Jun 17, 2022



* [#935] Update tf_parser to have add_common_op() for parse_relu6

Similar to onnx_parser.cpp, add an add_common_op template and functionality to support clip-based operations. This is done so clip operations can be guaranteed to have the same dimensions.

* fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6

* fixup! fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6

* fixup! fixup! fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6

* fixup! fixup! fixup! fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6

* Formatting

* fixup! Formatting
Co-authored-by: Umang Yadav <29876643+umangyadav@users.noreply.github.com>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

421a5621