Commits · 7b2a5ccf9f5b545f1b9190d14fe823014e05a0b1 · gaoqiong / MIGraphX

13 Apr, 2023 1 commit
- [mlir] Adding quantizelinear, dequantizelinear and quant_convolution support (#1675) · 7b2a5ccf
  Zhuoran Yin authored Apr 13, 2023
  
  7b2a5ccf
12 Apr, 2023 1 commit
- Print out pass name when tracing passes (#1667) · 551b927c
  Paul Fultz II authored Apr 12, 2023
  
  551b927c
11 Apr, 2023 1 commit
- Enable tidy on gpu driver (#1659) · 3385dcc8
  Paul Fultz II authored Apr 11, 2023
  
  3385dcc8
10 Apr, 2023 2 commits

Always build ref target when building MIGraphX (#1636) · cce35871
Umang Yadav authored Apr 10, 2023

cce35871

Fix 2 input broadcast bug for dynamic batch and output parameter ordering (#1669) · d3eb5609

Charlie Lin authored Apr 10, 2023

Adds a matcher to split_single_dyn_dim that finds every broadcast or multibroadcast with two static-shape inputs and replaces the instruction with the one-input version.
Sorts the get_output_parameters() list to ensure the correct ordering (some models were hitting an error without this).

d3eb5609

09 Apr, 2023 1 commit
- Enable hiprtc by default (#1658) · db6c75e7
  Paul Fultz II authored Apr 09, 2023
```
* Enable hiprtc by default
```
  db6c75e7
07 Apr, 2023 1 commit

Require the same type for the inputs and scales for QuantizeLinear (#1642) · f6e22d56

Paul Fultz II authored Apr 06, 2023

Converts can be inserted when the scales and input differ in the ONNX file (we are already doing this implicit conversion in the ref implementation). This will also improve the compile time of quantizelinear.hpp, since we can remove the nested visit method.

f6e22d56
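
As a rough illustration of the change above, the sketch below emits a convert when the scale type differs from the input type, then applies the ONNX-style quantization formula (round half to even, add the zero point, saturate). The function names, the instruction tuples, and the choice to convert the scale to the input type are assumptions made for this example, not MIGraphX code.

```python
def quantize_linear(x, scale, zero_point=0, qmin=-128, qmax=127):
    """ONNX-style quantization: round half to even, add zero point, saturate."""
    # Python's round() rounds halves to the nearest even integer.
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in x]


def lower_quantize_linear(input_type, scale_type):
    """Hypothetical parser step: emit a convert when the two types differ."""
    instructions = []
    if scale_type != input_type:
        # Assumption: the scale is converted to the input type.
        instructions.append(("convert", "scale", input_type))
    instructions.append(("quantizelinear", input_type))
    return instructions


mixed = lower_quantize_linear("float", "half")   # convert is inserted
same = lower_quantize_linear("float", "float")   # no convert needed
quantized = quantize_linear([0.0, 1.0, 2.5, 1000.0], scale=1.0)
```

With a single type guaranteed at the operator, the kernel only has to dispatch on one type, which is the reason the commit gives for dropping the nested visit.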

06 Apr, 2023 2 commits

Driver dynamic batch update (#1652) · adccec52

Charlie Lin authored Apr 06, 2023

Examples:

bin/driver verify /codes/onnx_models/resnet50-v1-7/resnet50-v1-7.onnx --split-single-dyn-dim --batch 3 --dyn-input-dim @data "[{min:1, max:4}, 3, 224, 224]"

bin/driver compile /codes/onnx_models/resnet50-v1-7/resnet50-v1-7.onnx --split-single-dyn-dim --default-dyn-dim "{min:1, max:10}" --output resnet50_batch1-10.mxr

bin/driver perf resnet50_batch1-10.mxr --batch 4

adccec52

Add reduction fusion (#1614) · f201285c
Paul Fultz II authored Apr 05, 2023
```
Automatically fuse multiple reductions and pointwise operations.
```
f201285c
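
The general idea behind fusing reductions with pointwise operations can be sketched as follows: instead of one pass per reduction plus a separate pointwise pass, a single traversal accumulates every result. This is only an illustration in plain Python with made-up function names; it says nothing about how the MIGraphX pass is implemented.

```python
def unfused(xs):
    """Three separate passes: reduce, pointwise square, reduce again."""
    total = sum(xs)                  # reduction 1
    squares = [x * x for x in xs]    # pointwise op, materializes a temporary
    total_sq = sum(squares)          # reduction 2
    return total, total_sq


def fused(xs):
    """One pass that computes both reductions and the pointwise op together."""
    total = 0
    total_sq = 0
    for x in xs:
        total += x
        total_sq += x * x            # pointwise op applied on the fly
    return total, total_sq


data = [1, 2, 3, 4]
assert unfused(data) == fused(data)
```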

05 Apr, 2023 3 commits
- Add MIGRAPHX_VALIDATE_MATCHES env variable to validate each matcher (#1372) · a123cb2e
  Paul Fultz II authored Apr 05, 2023
```
* Add MIGRAPHX_VALIDATE_MATCHES env variable to validate each matcher
```
  a123cb2e
- Optimize add convolution (#1549) · df32040d
  Paul Fultz II authored Apr 05, 2023
```
This will replace conv(x+a, w) with conv(x, w) + conv(a, w) where a is a constant so conv(a, w) can be replaced with a constant.
```
  df32040d
- Add missing header for sles and centos (#1665) · 8beb6680
  Paul Fultz II authored Apr 04, 2023
  
  8beb6680
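
The "Optimize add convolution" rewrite (#1549) above relies on convolution being linear in its input. A minimal check of that identity, using a hand-written 1-D valid-mode cross-correlation rather than any MIGraphX operator:

```python
def conv1d(x, w):
    """Valid-mode 1-D cross-correlation of signal x with kernel w."""
    n = len(x) - len(w) + 1
    return [sum(x[i + k] * w[k] for k in range(len(w))) for i in range(n)]


def add(u, v):
    """Element-wise sum of two equal-length lists."""
    return [p + q for p, q in zip(u, v)]


x = [1, 2, 3, 4, 5]        # runtime input
a = [10, 20, 30, 40, 50]   # constant added to the input
w = [1, 0, -1]             # constant weights

fused = conv1d(add(x, a), w)
# conv1d(a, w) depends only on constants, so it can be folded ahead of time.
split = add(conv1d(x, w), conv1d(a, w))
assert fused == split
```

Integer data is used here so the comparison is exact; with floating-point values the two forms may differ by rounding error.
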
04 Apr, 2023 2 commits

fix bug in transpose_slice simplification (#1660) · 30af1697

shivadbhavsar authored Apr 04, 2023

Bug found due to failing torch benchmark. Added test case to reproduce issue causing the model to error out on compile.
Original logic results in the following error:
AMDMIGraphX/src/include/migraphx/op/unsqueeze.hpp:128: normalize_compute_shape: UNSQUEEZE: Axis dimenstion is not divisible by step

30af1697

Refactor dynamic_dimension to have multiple optimals (#1625) · e7ec374f

Charlie Lin authored Apr 04, 2023

Makes the optimals into a std::set<std::size_t>
Changes shape object functions to handle the opts change
Changes convolution, flatten, and pooling so that they no longer calculate the output optimal dimensions and instead return empty opts. This will need to change in the future if we want to support dynamic shapes fully.
Many changes to tests and shape calls with respect to the new optimals

e7ec374f
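
A small sketch of the data shape this refactor describes: a dimension with a min, a max, and a set of optimal values that may be empty. The class below is a hypothetical Python stand-in for the C++ type, and its validation rules are assumptions of this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DynamicDimension:
    """Hypothetical stand-in: a size range plus a set of preferred sizes."""
    min: int
    max: int
    optimals: frozenset = frozenset()  # empty means "no optimals known"

    def __post_init__(self):
        # Assumed invariants for this sketch only.
        if self.min > self.max:
            raise ValueError("min must not exceed max")
        if any(not (self.min <= o <= self.max) for o in self.optimals):
            raise ValueError("optimals must lie within [min, max]")

    def is_fixed(self):
        """A dimension is fixed when its range contains a single value."""
        return self.min == self.max


batch = DynamicDimension(1, 4, frozenset({1, 4}))  # several optimals
conv_out = DynamicDimension(1, 4)                  # op output: empty optimals
```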

03 Apr, 2023 2 commits

fix stable diffusion decoder non standard shape issue (#1594) · 1329b9be
shivadbhavsar authored Apr 03, 2023

1329b9be

promote_literals pass (#1593) · e3fb3a0d

Charlie Lin authored Apr 03, 2023

Adds the promote_literals compiler pass that moves literals from the submodules to the main module.
With the eliminate_common_subexpression pass, it will remove copies of literals created during split_single_dyn_dim.
Pass is enabled with the split_single_dyn_dim compile option.

e3fb3a0d
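
The interaction between the two passes can be sketched as below: literals are first hoisted from each submodule into the main module, and a deduplication step then collapses the identical copies. The list and tuple layout of modules and instructions is invented for this example and does not mirror MIGraphX data structures.

```python
def promote_literals(main, submodules):
    """Move ("literal", value) entries from submodules into the main list.

    Returns the new main list and submodules that refer to literals by index.
    """
    new_main = list(main)
    new_subs = {}
    for name, instructions in submodules.items():
        rewritten = []
        for kind, payload in instructions:
            if kind == "literal":
                new_main.append(payload)
                rewritten.append(("main_ref", len(new_main) - 1))
            else:
                rewritten.append((kind, payload))
        new_subs[name] = rewritten
    return new_main, new_subs


def eliminate_common_literals(main, submodules):
    """Deduplicate equal literals in main and remap references to them."""
    unique, remap = [], {}
    for index, value in enumerate(main):
        if value not in unique:
            unique.append(value)
        remap[index] = unique.index(value)
    new_subs = {
        name: [(k, remap[p]) if k == "main_ref" else (k, p) for k, p in ins]
        for name, ins in submodules.items()
    }
    return unique, new_subs


subs = {
    "batch_1": [("literal", (0.5, 0.25)), ("op", "mul")],
    "batch_2": [("literal", (0.5, 0.25)), ("op", "mul")],
}
promoted_main, promoted_subs = promote_literals([], subs)
deduped_main, deduped_subs = eliminate_common_literals(promoted_main, promoted_subs)
```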

01 Apr, 2023 1 commit
- Enable header tests for FPGA and CPU backend (#1634) · 6a0a5ffe
  Umang Yadav authored Apr 01, 2023
  
  6a0a5ffe
31 Mar, 2023 1 commit

Split single dynamic dimension compiler pass (#1580) · e9e3eacc

Charlie Lin authored Mar 30, 2023

Adds a new GPU compiler pass, split_single_dyn_dim, that handles the case where one input parameter has a single non-fixed dynamic_dimension.
This commonly occurs for dynamic batch or BERT sequence length.
Splits the dynamic shape into several submodules with static input parameters to handle all of the cases in the dynamic_dimension range.
Essentially automates what was previously done by hand for the select_module verify tests.
Adds a compile option split_single_dyn_dim that toggles the pass on/off. Defaults to false.
Updates verify_program.hpp and run_verify.cpp to allow for the tests to change the compile_options

e9e3eacc
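
The shape expansion this pass performs can be sketched as follows: given a parameter shape with exactly one non-fixed dimension, produce one static shape per value in that dimension's range. The encoding of a shape as a list of ints and (min, max) tuples is an assumption of this example, as is returning None when the pass does not apply.

```python
def split_single_dyn_dim(param_shape):
    """Expand the single dynamic dimension into one static shape per size.

    Entries of param_shape are ints (fixed) or (min, max) tuples (dynamic).
    Returns None unless exactly one entry has min != max.
    """
    dynamic = [
        i for i, d in enumerate(param_shape)
        if isinstance(d, tuple) and d[0] != d[1]
    ]
    if len(dynamic) != 1:
        return None  # the pass only handles a single non-fixed dimension
    axis = dynamic[0]
    lo, hi = param_shape[axis]
    shapes = []
    for size in range(lo, hi + 1):
        # Collapse fixed (n, n) tuples to n, then pin the dynamic axis.
        static = [d[0] if isinstance(d, tuple) else d for d in param_shape]
        static[axis] = size
        shapes.append(tuple(static))
    return shapes


static_shapes = split_single_dyn_dim([(1, 4), 3, 224, 224])
```

Each resulting static shape would correspond to one submodule, so the number of submodules grows with the width of the range.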

30 Mar, 2023 1 commit
- Enable parallel compilation with hiprtc (#1647) · 32b9fd08
  Paul Fultz II authored Mar 30, 2023
```
* Add hiprtc driver
```
  32b9fd08
29 Mar, 2023 1 commit
- Fix bug when concatting with the vectorization axis (#1653) · b1506c73
  Paul Fultz II authored Mar 29, 2023
  
  b1506c73
28 Mar, 2023 1 commit
- Remove version name from check_context (#1639) · 49fc6138
  Umang Yadav authored Mar 28, 2023
```
* Remove version from check_context and bump program version
```
  49fc6138
27 Mar, 2023 1 commit

[MLIR] add dot offloads with manual tuning support (#1631) · 7c4dc99a

Manupa Karunaratne authored Mar 27, 2023

* [MLIR] add dot offloads with manual tuning support
* This commit adds dot + pointwise fusion support along with manual tuning using rocMLIR.

7c4dc99a

25 Mar, 2023 1 commit
- remove /opt/rocm (#1623) · 018e5318
  Umang Yadav authored Mar 24, 2023
```
Co-authored-by: Chris Austen <causten@users.noreply.github.com>
```
  018e5318
22 Mar, 2023 1 commit
- Use version number as part of internal namespace symbol (#1633) · 09aaa63e
  Umang Yadav authored Mar 21, 2023
```
Prevents dynamically loading a target library that was not compiled with the same version of the MIGraphX core lib.
```
  09aaa63e
21 Mar, 2023 2 commits

select_module refactor (#1615) · 94a7f6ee

Charlie Lin authored Mar 21, 2023

Refactor to have select_module use output parameters
Disable select_module verify tests on cpu

94a7f6ee

Fix default target in driver (#1635) · 11e2451f

Umang Yadav authored Mar 20, 2023

A recent change (#1608) removed the migraphx_all_target lib from the driver, which led to missing compile-time definitions.
The missing compile definitions in turn changed the default target in the driver.

11e2451f

18 Mar, 2023 1 commit
- Dynamically plug-in backend target libs (#1608) · 7a7040aa
  Umang Yadav authored Mar 18, 2023
```
Fixes #1595
```
  7a7040aa
17 Mar, 2023 1 commit

Fold const on last instruction (#1626) · 450c5e84

Paul Fultz II authored Mar 17, 2023

This is the original test case that exposed the error caused by missing const folding. Pushing changes up to this branch and closing out PR #1622.

450c5e84

13 Mar, 2023 2 commits
- [Cleanup] Suppress warnings for msgpack-cxx and remove old memory coloring passes (#1611) · 1741708f
  Umang Yadav authored Mar 13, 2023
  
  1741708f
- [MLIR] Adds a runtime switch to trigger MLIR (#1610) · 2db587ea
  Manupa Karunaratne authored Mar 13, 2023
```
* [MLIR] Adds a runtime switch to trigger MLIR
```
  2db587ea
10 Mar, 2023 2 commits
- Fix make_inner_storage function (#1607) · 5e132673
  Paul Fultz II authored Mar 10, 2023
  
  5e132673
- Fix static_assert in large reduction (#1604) · 206b9a51
  Paul Fultz II authored Mar 09, 2023
  
  206b9a51
09 Mar, 2023 1 commit
- update call to msgpackc-cxx (#1603) · 69799cb8
  Akash Patel authored Mar 09, 2023
```
fallback to msgpack for older msgpack versions
```
  69799cb8
07 Mar, 2023 1 commit
- Prune candidates in NMS (#1601) · e4975990
  Umang Yadav authored Mar 07, 2023
```
* NMS improvements
```
  e4975990
04 Mar, 2023 1 commit
- Update rocm-cmake and half package URL (#1581) · f33ffb91
  Umang Yadav authored Mar 03, 2023
```
resolve half info messages while building 
```
  f33ffb91
01 Mar, 2023 1 commit
- Initial (#1586) · d1b5f332
  Charlie Lin authored Mar 01, 2023
```
Add additional documentation to explain the passes.
```
  d1b5f332
28 Feb, 2023 1 commit

Select module op (#1569) · a63ee2e0

Charlie Lin authored Feb 28, 2023

Creates the select_module operator, which selects one of the submodules passed to it to run based on the submodule parameters. The submodule selected is the one whose parameters have exactly the same static shapes as the arguments passed to select_module.

a63ee2e0
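
The selection rule described above amounts to an exact match on shapes. A minimal sketch, assuming submodules are kept in a dict from name to a list of parameter shapes (a layout chosen for this example, not taken from MIGraphX):

```python
def select_module(submodules, arg_shapes):
    """Return the name of the submodule whose parameter shapes equal arg_shapes."""
    for name, param_shapes in submodules.items():
        if param_shapes == arg_shapes:
            return name
    raise LookupError(f"no submodule matches shapes {arg_shapes}")


submodules = {
    "batch_1": [(1, 3, 224, 224)],
    "batch_2": [(2, 3, 224, 224)],
    "batch_3": [(3, 3, 224, 224)],
}
chosen = select_module(submodules, [(2, 3, 224, 224)])
```

Raising on a miss is a choice made for this sketch; the commit message does not say what happens when no submodule matches.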

23 Feb, 2023 1 commit
- Modify layernorm to allow higher overflow limit for lower precision (#1534) · 3c67e66f
  shivadbhavsar authored Feb 22, 2023
  
  3c67e66f
16 Feb, 2023 2 commits

Copy into registers first when doing reductions with layernorm and softmax (#1489) · ac531d99

Paul Fultz II authored Feb 16, 2023

Avoids double global loads. Strided loops are unrolled, which lets the results be stored in an array that the compiler will keep in registers, since the index access is constant. Updated to handle large reductions as well, which gives a better stable diffusion result.

ac531d99

Remove HCC (#1546) · bfd77388
Umang Yadav authored Feb 16, 2023
```
* deprecate HCC
```
bfd77388