Commits · 1530ec241fcea60cc766a7c6d09d1a1e1a8d0d23 · gaoqiong / MIGraphX

16 Sep, 2023 1 commit
- Enable mlir quick tuning (#2183) · 205306ac
  Paul Fultz II authored Sep 15, 2023
  
  205306ac
13 Sep, 2023 1 commit
- Disable unsafe buffer usage warning when it's available (#2168) · e05c94b4
  Paul Fultz II authored Sep 13, 2023
  
  e05c94b4
12 Sep, 2023 1 commit
- Ensure same layout is used for miopen convolution (#2102) · 64b306ab
  Paul Fultz II authored Sep 12, 2023
  
  64b306ab
10 Aug, 2023 1 commit

[MLIR] Changes for tuning API v2, MLIR grid layout changes. (#1961) · 065d06af

Krzysztof Drewniak authored Aug 09, 2023

This PR constitutes the MIGraphX-side changes needed to not break the build in the presence of
ROCmSoftwarePlatform/rocMLIR#1136, and updates what data is sent to MLIR during the kernel generation and tuning process.

065d06af

09 Aug, 2023 1 commit
- Remove gcnArch usage (#2053) · dd219d6e
  Paul Fultz II authored Aug 09, 2023
  
  dd219d6e
08 Aug, 2023 1 commit
- Update to Cppcheck 2.11 (#1914) · a359d2c8
  Paul Fultz II authored Aug 08, 2023
  
  a359d2c8
30 Jul, 2023 1 commit

Enable tuning for MLIR (#1965) · be6ecff6

Paul Fultz II authored Jul 30, 2023

* Add initial tuning support

* Format

* Add extra param

* Format

* Use exhaustive flag

* Format

* Set expected shapes

* Format

* Format

* Fix missing symbol

* Format

* Add missing license header

* Format

* Update src/targets/gpu/include/migraphx/gpu/mlir.hpp

be6ecff6

19 Jul, 2023 1 commit
- Attach Preallocated buffers to MIOpen for the Find process (#1469) · 0f95c57d
  Umang Yadav authored Jul 19, 2023
  
  0f95c57d
18 Jul, 2023 1 commit
- return canonicalized standard form for GPU (#1947) · 2d5568d8
  Umang Yadav authored Jul 18, 2023
```
Fixes #1946
```
  2d5568d8
13 Jul, 2023 1 commit

Update deconvolution -> convolution_backwards and Dynamic Shape Support (#1801) · 4edf1195

Charlie Lin authored Jul 13, 2023

- Renames deconvolution -> convolution_backwards to be more consistent with the literature.
  - Note: this is not the cross-correlation operator (which is the adjoint of convolution). Technically, it is a standard convolution operator combined with an upsampling operator rather than a downsampling operator.
- Adds unit tests for the padding, strides, dilations, and other op attributes.
- Throws on the auto_pad attribute since it has not been implemented.
  - Previously it read the attribute and set it but then did nothing with it.
- Extended for dynamic shapes.
  - Does not support using asymmetric padding (padding_L != padding_R) and output_shape with dynamic shapes.
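The note about upsampling can be checked on a small 1-D case. The sketch below is illustrative only and is not the MIGraphX implementation: the helper names are hypothetical, and it assumes a non-empty input, no padding, and no dilation. It compares a direct scatter form of the operator with zero-insertion upsampling followed by a full standard convolution.

```python
def conv_full(u, w):
    """Full 1-D convolution: y[n] = sum_k u[n - k] * w[k]."""
    y = [0] * (len(u) + len(w) - 1)
    for i, ui in enumerate(u):
        for k, wk in enumerate(w):
            y[i + k] += ui * wk
    return y


def upsample_zeros(x, stride):
    """Insert stride - 1 zeros between consecutive samples of x."""
    u = [0] * ((len(x) - 1) * stride + 1)
    for i, xi in enumerate(x):
        u[i * stride] = xi
    return u


def convolution_backwards_1d(x, w, stride):
    """Direct scatter form: each input sample adds a scaled copy of w."""
    y = [0] * ((len(x) - 1) * stride + len(w))
    for i, xi in enumerate(x):
        for k, wk in enumerate(w):
            y[i * stride + k] += xi * wk
    return y


x = [1, 2, 3]
w = [1, 0, -1]
direct = convolution_backwards_1d(x, w, stride=2)
via_upsampling = conv_full(upsample_zeros(x, 2), w)
print(direct == via_upsampling)
```

With stride 1 the upsampling step leaves the input unchanged, so the operator reduces to an ordinary full convolution.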

4edf1195

08 Jul, 2023 1 commit

export API symbols from dynamic libraries (#1892) · c04fbc92

Artur Wojcik authored Jul 08, 2023

Export API symbols for migraphx, migraphx_ref, migraphx_cpu, migraphx_gpu, migraphx_device, migraphx_tf, and migraphx_onnx. There is a separate PR for migraphx_c.

API symbol exporting affects only Windows. It is transparent on Linux.

c04fbc92

05 Jul, 2023 1 commit

Update passes to use offload_copy based on root module (#1875) · e7471141

Umang Yadav authored Jul 05, 2023

Needed to run a multi-target program in which "main" is not the only root module. There can be many root modules other than main.

e7471141

02 Jul, 2023 1 commit

Improvement to ck integration (#1859) · 3c9df3b4

Paul Fultz II authored Jul 02, 2023

- Add a CI job to test CK
- Add MIGRAPHX_TUNE_CK env variable to only do tuning for CK
- Continue tuning even when there are invalid configs
- Fix a bug with parallel compilation not using all available threads
- Add additional tests for gemms using half types
- Remove int32 as a supported type since it doesn't pass our test suite

3c9df3b4

08 Jun, 2023 1 commit
- Add initial CK integration plus auto-tuning for kernels (#1791) · 25af8710
  Paul Fultz II authored Jun 08, 2023
```
Enable with MIGRAPHX_ENABLE_CK=1 and the --exhaustive-tune flag
```
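A minimal sketch of how the two switches named above might be combined when launching the driver from a script. The environment variable and the flag come from the commit message. The helper name, the model path, and the use of the driver's compile subcommand are assumptions made for illustration. Nothing is executed here; the function only builds the argument list and environment.

```python
import os


def ck_tuning_invocation(model_path, base_env=None):
    """Build (argv, env) for a CK-enabled tuning run (hypothetical helper)."""
    # Copy the environment so the caller's process environment is not modified.
    env = dict(os.environ if base_env is None else base_env)
    env["MIGRAPHX_ENABLE_CK"] = "1"  # from the commit message
    # Assumption: the driver is invoked as "migraphx-driver compile <model>".
    argv = ["migraphx-driver", "compile", model_path, "--exhaustive-tune"]
    return argv, env


argv, env = ck_tuning_invocation("model.onnx", base_env={})
print(" ".join(argv))
```

On a machine with the driver installed, the pair could be passed to subprocess.run(argv, env=env).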
  25af8710
24 May, 2023 1 commit
- Change compiler_replace to a class that stores the code objects directly (#1739) · 37f5df20
  Paul Fultz II authored May 24, 2023
```
Enable retrieving the code object to do tuning in the future.
```
  37f5df20
17 May, 2023 1 commit
- adjust docker files to support new rocm 5.5 (#1729) · 5e35957b
  Chris Austen authored May 17, 2023
```
Move CI to support the rocm5.5 release
```
  5e35957b
04 May, 2023 1 commit

[mlir] Adding quant convolution fusion as anchor op (#1683) · 7f105952

Zhuoran Yin authored May 03, 2023

- Exposed the mlir_enabled() call to decide whether the lowering pipeline is enabled
- Disabled the rewrite quantization pipeline in mlir compilation
- Added quant convolution as an anchor op
- Fixed the return type expectations
- Added the fallback hip implementation for quantizelinear and dequantizelinear
- Will need advice to improve the implementation for quantizelinear

7f105952

24 Apr, 2023 1 commit

Dynamic shape hip::copy_to_gpu and hip::copy_from_gpu (#1694) · 84acaea0

Charlie Lin authored Apr 24, 2023

Updates the hip::copy_to_gpu and hip::copy_from_gpu operators to work with dynamic shapes

Allows for offload_copy to be used with dynamic batch

Changed the assert in select_module because the argument may now be smaller, given how offload_copy works with dynamic batch (the maximum buffer size is used).

84acaea0

06 Apr, 2023 1 commit
- Add reduction fusion (#1614) · f201285c
  Paul Fultz II authored Apr 05, 2023
```
Automatically fuse multiple reductions and pointwise operations.
```
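As a rough illustration of what fusing reductions with pointwise operations buys, the sketch below computes a mean and a variance two ways: once with two separate reductions and a materialized pointwise intermediate, and once in a single pass. It is a conceptual example only, with hypothetical function names, and says nothing about how the MIGraphX pass is implemented.

```python
def unfused_mean_var(xs):
    """Two reductions plus a materialized pointwise intermediate."""
    n = len(xs)
    mean = sum(xs) / n                 # reduction 1
    squares = [x * x for x in xs]      # pointwise op, stored in full
    mean_sq = sum(squares) / n         # reduction 2
    return mean, mean_sq - mean * mean


def fused_mean_var(xs):
    """Same result from one pass: both reductions share a single loop."""
    n = len(xs)
    total = 0.0
    total_sq = 0.0
    for x in xs:
        total += x
        total_sq += x * x              # pointwise square folded into the reduction
    mean = total / n
    return mean, total_sq / n - mean * mean


data = [1.0, 2.0, 3.0, 4.0]
print(fused_mean_var(data) == unfused_mean_var(data))
```

The fused form reads the input once and never stores the squared values, which is the kind of saving such a fusion aims for.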
  f201285c
05 Apr, 2023 1 commit

Optimize add convolution (#1549) · df32040d

Paul Fultz II authored Apr 05, 2023

This will replace conv(x+a, w) with conv(x, w) + conv(a, w) where a is a constant so conv(a, w) can be replaced with a constant.
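The rewrite relies on convolution being linear in its input. A small 1-D check of that identity is sketched below. The conv1d helper is hypothetical and uses the deep-learning convention (no kernel flip, no padding), and the example assumes w is also known ahead of time so that conv(a, w) can be folded.

```python
def conv1d(x, w):
    """Valid 1-D convolution in the deep-learning sense (no kernel flip)."""
    n = len(x) - len(w) + 1
    return [sum(x[i + k] * w[k] for k in range(len(w))) for i in range(n)]


x = [1, 2, 3, 4, 5]        # runtime input
a = [10, 20, 30, 40, 50]   # constant added to the input
w = [1, -2, 3]             # weights

# Original form: conv(x + a, w)
lhs = conv1d([xi + ai for xi, ai in zip(x, a)], w)

# Rewritten form: conv(x, w) + conv(a, w), where conv(a, w) is a constant
folded = conv1d(a, w)
rhs = [p + q for p, q in zip(conv1d(x, w), folded)]

print(lhs == rhs)
```

After the rewrite only conv(x, w) and one elementwise add remain at run time, since folded does not depend on x.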

df32040d

30 Mar, 2023 1 commit
- Enable parallel compilation with hiprtc (#1647) · 32b9fd08
  Paul Fultz II authored Mar 30, 2023
```
* Add hiprtc driver
```
  32b9fd08
28 Mar, 2023 1 commit
- Remove version name from check_context (#1639) · 49fc6138
  Umang Yadav authored Mar 28, 2023
```
* Remove version from check_context and bump program version
```
  49fc6138
21 Mar, 2023 1 commit

select_module refactor (#1615) · 94a7f6ee

Charlie Lin authored Mar 21, 2023

Refactor to have select_module use output parameters
Disable select_module verify tests on cpu

94a7f6ee

18 Mar, 2023 1 commit
- Dynamically plug-in backend target libs (#1608) · 7a7040aa
  Umang Yadav authored Mar 18, 2023
```
Fixes #1595
```
  7a7040aa
01 Mar, 2023 1 commit
- Initial (#1586) · d1b5f332
  Charlie Lin authored Mar 01, 2023
```
Add additional documentation to explain the passes.
```
  d1b5f332
16 Feb, 2023 1 commit

Add flag for tuning in migraphx-driver (#1519) · cc098f4d

Umang Yadav authored Feb 15, 2023

* Add driver flag "--exhaustive-tune" to enable tuning, and add support for it in the C/C++ and Python APIs

cc098f4d

14 Feb, 2023 1 commit

Set device to current hip device when loading programs (#1561) · 4e11431d

shivadbhavsar authored Feb 14, 2023

Currently, we default to device 0 when loading programs. This change uses hipGetDevice to set the device for the loaded program.

4e11431d

06 Feb, 2023 1 commit

Update matcher to better match layernorm fusions in other models (#1548) · 3e58b1e4

Paul Fultz II authored Feb 06, 2023



* Fuse layernorm with different patterns
* Only match when using the last axis
Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>

3e58b1e4

07 Dec, 2022 1 commit
- Fix conversion issue in layernorm fusion (#1483) · 37c3c4a9
  Paul Fultz II authored Dec 07, 2022
```
* Add implicit_conversion
```
  37c3c4a9
07 Nov, 2022 1 commit
- Update rocblas header include path (#1444) · df2e7635
  arvindcheru authored Nov 07, 2022
  
  df2e7635
02 Nov, 2022 1 commit
- Add nhwc layout to gpu backend (#1391) · 1820198e
  Paul Fultz II authored Nov 02, 2022
```
Can be enabled via environment variable MIGRAPHX_ENABLE_NHWC
```
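For readers unfamiliar with the layout, the sketch below shows how element strides differ between NCHW and NHWC storage for the same logical dimensions. The helper is hypothetical and does not reflect how MIGraphX represents shapes internally.

```python
def strides_for_order(dims, order):
    """Element strides for dims stored with order[-1] varying fastest."""
    strides = [0] * len(dims)
    step = 1
    for axis in reversed(order):
        strides[axis] = step
        step *= dims[axis]
    return strides


dims = [2, 3, 4, 5]  # logical N, C, H, W

nchw = strides_for_order(dims, [0, 1, 2, 3])  # W is contiguous
nhwc = strides_for_order(dims, [0, 2, 3, 1])  # C is contiguous

print(nchw, nhwc)
```

In both cases the strides are reported in logical N, C, H, W order. Only the memory order changes, which is why the channel stride becomes 1 under NHWC.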
  1820198e
19 Oct, 2022 2 commits

Refactor dynamic compute; Dynamic ref unary functions (#1407) · 693cb5d8

Charlie Lin authored Oct 19, 2022

Refactor dynamic compute
- add a compute_output_shape object that implicitly converts to a new dyn_output or shape object
- the dyn_output object can compute the static output shape of an operator from the shapes of its input arguments
- to use the dyn_output object, change an operator's compute function to argument compute(const dyn_output& dyn_out, std::vector<argument> args)

Dynamic ref unary functions
- Included these changes to have an example of the refactored dynamic compute being used
- Changes to unary base class to handle dynamic shapes
- Changed elu and leaky_relu to use unary base class and pointwise JIT

693cb5d8

Find2.0 changes for the Quant and De-Convolution (#1408) · 5fa42993

Umang Yadav authored Oct 19, 2022



* use find2.0 for the convolution
Co-authored-by: Vasilii Filippov <DrizztDoUrden@users.noreply.github.com>
Co-authored-by: Chris Austen <causten@users.noreply.github.com>

5fa42993

18 Oct, 2022 1 commit

Add support in mlir for transposed and broadcasted shapes (#1378) · c3e02b18

Paul Fultz II authored Oct 18, 2022



* Enable non-standard shape
* Use perfdb for non xdlops
* Fix transpose+broadcast strides
Co-authored-by: jungpark-mlir <jungwook.park@amd.com>

c3e02b18

13 Oct, 2022 1 commit

Rewrite TF batch norm; remove batch_norm_inference (#1371) · be309bfb

Charlie Lin authored Oct 13, 2022

Rewrites the TF batch-norm-like operators to other MIGX operators
Removes the code related to batch_norm_inference

be309bfb

04 Oct, 2022 1 commit
- Stream sync Changeset (#1358) · f7d987ba
  Ted Themistokleous authored Oct 04, 2022
```
Stream sync changes and associated API level changes
```
  f7d987ba
29 Sep, 2022 1 commit

Use find_2.0 API for the convolution (#1346) · e19f78ae

Umang Yadav authored Sep 29, 2022

Improvements/Additions to be made:

- changes for the quant_convolution
- changes for the deconvolution
- macros for MIOpen status checks

e19f78ae

28 Sep, 2022 1 commit

Add compute_fp32 flag for quant_gemm tests (#1360) · 70e63960

Umang Yadav authored Sep 28, 2022

test_gpu_pack_int8_args fails on gfx908 machines because it doesn't set the compute_fp32 flag correctly. This PR fixes the test so that it checks the device name and rocBLAS version and sets the flag accordingly.

70e63960

26 Sep, 2022 1 commit
- Use larger vector size instead of preloading for broadcasted inputs (#1389) · 492c4a6c
  Paul Fultz II authored Sep 26, 2022
  
  492c4a6c
23 Sep, 2022 1 commit
- Remove unused device functions (#1394) · 8ea8473d
  Paul Fultz II authored Sep 23, 2022
```
* Remove device functions
* Update tests
```
  8ea8473d