Commits · bc7db1044a246ab68ece004eeac47efc0fa383f0 · gaoqiong / MIGraphX

11 Oct, 2023 2 commits

Fix scatter operator for nonstandard shapes (#2314) · bc7db104

Ted Themistokleous authored Oct 11, 2023



* Fix scatter operator for nonstandard shapes

Remove the standard() shape check for scatter inputs.

* Add nonstandard input tests for scatter

---------
Co-authored-by: Chris Austen <causten@users.noreply.github.com>

bc7db104

a few c++ fixes to allow compilation on Windows (#2282) · a50cb302
Artur Wojcik authored Oct 11, 2023

a50cb302

28 Sep, 2023 1 commit

Add options to set tolerances inside MIGraphX driver (#2213) · 69d8d789

Umang Yadav authored Sep 28, 2023

By default, MIGraphX verification uses normalized RMS error as its basis. This change adds logic that lets MIGraphX perform "np.allclose"-style elementwise verification using atol and rtol.

The commit also changes "verify_range()" calls to consistently pass the "gold" or "expected" results as the second argument. The default RMS tolerance inside the driver is set to 0.001, which IMO is high for FP32 compared to what we had earlier. We need better defaults.
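The difference between the two verification styles can be sketched in plain Python. This is an illustration only, not the driver's code: the helper names `rms_error` and `allclose` are hypothetical, the normalization (dividing by the RMS of the gold values) is an assumption, and the `atol`/`rtol` defaults are NumPy's. The gold values are passed as the second argument, matching the convention described above.

```python
import math

def rms_error(result, gold):
    """RMS of the difference, normalized by the RMS of the gold values.
    The normalization used here is an assumption for illustration."""
    n = len(gold)
    diff = math.sqrt(sum((r - g) ** 2 for r, g in zip(result, gold)) / n)
    ref = math.sqrt(sum(g * g for g in gold) / n)
    return diff / ref if ref else diff

def allclose(result, gold, atol=1e-8, rtol=1e-5):
    """Elementwise check in the style of np.allclose:
    every element must satisfy |r - g| <= atol + rtol * |g|."""
    return all(abs(r - g) <= atol + rtol * abs(g) for r, g in zip(result, gold))

# One small element is badly wrong, but the large elements dominate the RMS.
gold = [1000.0, 1000.0, 1000.0, 0.001]
result = [1000.0, 1000.0, 1000.0, 0.5]

print(rms_error(result, gold) < 0.001)  # the aggregate check passes
print(allclose(result, gold))           # the elementwise check fails
```

An aggregate metric can hide a single bad element when the other values are much larger, which is the kind of case an elementwise check is meant to catch.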

69d8d789

27 Sep, 2023 1 commit
- Don't insert reshapes when converting pooling to reduce (#2149) · 7c8f6c25
  Paul Fultz II authored Sep 27, 2023
  
  7c8f6c25
21 Sep, 2023 1 commit
- Enable tests package (#2166) · 52eb36fb
  Paul Fultz II authored Sep 21, 2023
  
  52eb36fb
15 Sep, 2023 1 commit
- Preserve layout of fused kernel for `layernorm+pointwise` (#2185) · 15acaee9
  Umang Yadav authored Sep 15, 2023
  
  15acaee9
18 Aug, 2023 1 commit
- Remove operators.hpp includes (#2086) · e6290061
  Paul Fultz II authored Aug 18, 2023
  
  e6290061
28 Jul, 2023 1 commit

Improve performance of pointwise/reduction kernels when using NHWC layouts (#1955) · f33f2298

Paul Fultz II authored Jul 28, 2023

* Improve performance of pointwise/reduction kernels when using NHWC layouts

* Format

* Add nhwc test

* Format

* Remove inline namespace

* Add reduce test

f33f2298

22 Jul, 2023 1 commit
- Throw on calling `shape.lens()` or `shape.strides()` on a dynamic shape and vice versa (#1937) · 4e24e65a
  Charlie Lin authored Jul 22, 2023
```
Throwing on these calls catches dynamic shape errors earlier rather than having to backpedal from a bad call
```
  4e24e65a
13 Jul, 2023 1 commit

Update deconvolution -> convolution_backwards and Dynamic Shape Support (#1801) · 4edf1195

Charlie Lin authored Jul 13, 2023

Renames deconvolution -> convolution_backwards to be more consistent with the literature.
Note: this is not the cross-correlation operator (which is the adjoint of convolution). This is technically a standard convolution operator combined with an upsampling operator rather than a downsampling operator.
Adds unit tests for the padding, strides, dilations, and other op attributes.
Throws on the auto_pad attribute since it has not been implemented.
Previously it read the attribute and set it but then did nothing with it.
Extended for dynamic shapes.
Does not support using asymmetric padding (padding_L != padding_R) and output_shape with dynamic shapes.
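
The "convolution combined with an upsampling operator" description can be checked with a small 1-D sketch. This is not MIGraphX code: both function names are hypothetical, and the scatter-add form is the textbook definition of a strided transposed convolution, assumed here to correspond to what the operator computes when there is no padding or dilation.

```python
def conv_transpose_1d(x, w, stride):
    """Textbook transposed convolution: each input element scatters a scaled
    copy of the kernel into the output at offset i * stride."""
    out = [0.0] * ((len(x) - 1) * stride + len(w))
    for i, xi in enumerate(x):
        for k, wk in enumerate(w):
            out[i * stride + k] += xi * wk
    return out

def upsample_then_conv_1d(x, w, stride):
    """Insert stride - 1 zeros between inputs, then apply a full convolution."""
    up = [0.0] * ((len(x) - 1) * stride + 1)
    for i, xi in enumerate(x):
        up[i * stride] = xi
    out = [0.0] * (len(up) + len(w) - 1)
    for n in range(len(out)):
        for k, wk in enumerate(w):
            if 0 <= n - k < len(up):
                out[n] += up[n - k] * wk
    return out

x = [1.0, 2.0, 3.0]
w = [1.0, 0.5, 0.25]
print(conv_transpose_1d(x, w, 2) == upsample_then_conv_1d(x, w, 2))
```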

4edf1195

10 Jul, 2023 1 commit

Pooling op. calculation changes (#1823) · bb06dbf5

Brian Pickrell authored Jul 09, 2023

Changes the way the Pooling operation computes its result when there is padding. The old code clipped off any padding values before computing; for instance, if an Average pooling window contained 0 1 2, where the 0 is padding, the result was 1.5 instead of 1.0. See Issue 1766.
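
The arithmetic in the example above can be reproduced with a short sketch. The function `avg_pool_window` and its `count_padding` flag are hypothetical names used only for illustration, and padding is assumed to contribute the value 0.

```python
def avg_pool_window(window, is_padding, count_padding):
    """Average one pooling window.

    count_padding=False mimics the old behavior (padding clipped off);
    count_padding=True includes the padded positions in the divisor.
    """
    if count_padding:
        return sum(window) / len(window)
    kept = [v for v, pad in zip(window, is_padding) if not pad]
    return sum(kept) / len(kept)

window = [0.0, 1.0, 2.0]
is_padding = [True, False, False]

print(avg_pool_window(window, is_padding, count_padding=False))  # 1.5
print(avg_pool_window(window, is_padding, count_padding=True))   # 1.0
```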

bb06dbf5

08 Jul, 2023 1 commit
- bump up CMake minimum version required to 3.15 (#1888) · 0e144f05
  Artur Wojcik authored Jul 09, 2023
  
  0e144f05
06 Jul, 2023 1 commit
- fix compilation warnings causing build failures (-Werror) (#1889) · a83371ca
  Artur Wojcik authored Jul 07, 2023
  
  a83371ca
02 Jul, 2023 1 commit

Improvement to ck integration (#1859) · 3c9df3b4

Paul Fultz II authored Jul 02, 2023

Add a CI job to test CK
Add MIGRAPHX_TUNE_CK env variable to only do tuning for CK
Continue tuning even when there are invalid configs
Fix a bug with parallel compilation not using all available threads
Add additional test for gemms using half types
Removed int32 as a supported type since it doesn't pass our test suite

3c9df3b4

23 Jun, 2023 1 commit
- Remove clamping for converts (#1853) · e794a63c
  Umang Yadav authored Jun 23, 2023
```
Fixes #1852  Fixes #1847
```
  e794a63c
01 Jun, 2023 1 commit

Convert Fp16 instance-norm to FP32 temporarily (#1779) · 49b341d3

Umang Yadav authored Jun 01, 2023

Converting to FP32: the FP16 3d-unet model's accuracy comes out the same as the FP32 accuracy.

Using the reduce_sum method in FP16: accuracy comes out ~0.9% lower than FP32 while keeping the entire model in FP16.

49b341d3

25 May, 2023 1 commit
- Update cpp generator to handle inf from float (#1758) · 763dd1da
  Ted Themistokleous authored May 25, 2023
```
Use std::numeric_limits::min/max() functions plus the appropriate value to encode -inf/inf 
```
  763dd1da
20 May, 2023 1 commit
- Use half HIP APIs to compute max and min (#1764) · 88fb551c
  Umang Yadav authored May 19, 2023
```
* use half hip functions to compute max and min
* add verify test for min and max
```
  88fb551c
04 May, 2023 1 commit

Rewrite multiplies with dot operator (#1685) · 457703a8

Paul Fultz II authored May 04, 2023

When either the input or the output of a dot is multiplied across the K dimension, the multiply can instead be applied to the constant, which can then be folded with propagate_const.
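
A minimal sketch of the identity behind this rewrite, for the input-side case only. It assumes the scale varies along K and that `w` is the constant operand; `matmul` is a hypothetical pure-Python helper, not a MIGraphX function.

```python
def matmul(a, b):
    """Plain (M x K) by (K x N) matrix product on nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

x = [[1.0, 2.0], [3.0, 4.0]]            # runtime input, M=2, K=2
w = [[1.0, 0.0, 2.0], [0.5, 1.0, 0.0]]  # constant weights, K=2, N=3
s = [2.0, 4.0]                          # constant scale along K

# Original graph: multiply the input, then dot with the constant.
scaled_x = [[x[i][k] * s[k] for k in range(2)] for i in range(2)]
before = matmul(scaled_x, w)

# Rewritten graph: fold the scale into the constant once, then dot.
folded_w = [[s[k] * w[k][j] for j in range(3)] for k in range(2)]
after = matmul(x, folded_w)

print(before == after)
```

Because `folded_w` depends only on constants, it can be computed once at compile time, which removes the multiply from the runtime graph.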

457703a8

28 Apr, 2023 1 commit
- Removed split_single_dyn_dim compile flag (#1711) · bcc1f64a
  Charlie Lin authored Apr 28, 2023
  
  bcc1f64a
24 Apr, 2023 2 commits
- Fix compile failure in reduction fusion of instance norm (#1702) · 08360e83
  Paul Fultz II authored Apr 24, 2023
```
This fixes #1700
```
  08360e83
- Fix incorrect assertion in vec_packed_at (#1704) · 4339af75
  Paul Fultz II authored Apr 23, 2023
  
  4339af75
07 Apr, 2023 1 commit

Require the same type for the inputs and scales for QuantizeLinear (#1642) · f6e22d56

Paul Fultz II authored Apr 06, 2023

Converts can be inserted when the scales and input types differ in the ONNX file (we are already doing this implicit conversion in the ref implementation). This will also improve the compile time of quantizelinear.hpp since we can remove the nested visit method.

f6e22d56

05 Apr, 2023 1 commit

Optimize add convolution (#1549) · df32040d

Paul Fultz II authored Apr 05, 2023

This will replace conv(x+a, w) with conv(x, w) + conv(a, w), where a is a constant, so conv(a, w) can be replaced with a constant.
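
The rewrite relies on convolution being linear in its input. A small 1-D check, where `conv1d` is a hypothetical helper (valid cross-correlation, no padding) rather than the MIGraphX operator:

```python
def conv1d(x, w):
    """Valid 1-D cross-correlation: slide w over x without padding."""
    n = len(x) - len(w) + 1
    return [sum(x[i + k] * w[k] for k in range(len(w))) for i in range(n)]

x = [1.0, 2.0, 3.0, 4.0]   # runtime input
a = [0.5, 0.5, 1.0, 1.0]   # constant added to the input
w = [1.0, -1.0]            # constant weights

# conv(x + a, w)
fused = conv1d([xi + ai for xi, ai in zip(x, a)], w)

# conv(x, w) + conv(a, w); the second term involves only constants
split = [p + q for p, q in zip(conv1d(x, w), conv1d(a, w))]

print(fused == split)
```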

df32040d

31 Mar, 2023 1 commit

Split single dynamic dimension compiler pass (#1580) · e9e3eacc

Charlie Lin authored Mar 30, 2023

Adds a new GPU compiler pass, split_single_dyn_dim, that handles the case where one input parameter has a single non-fixed dynamic_dimension.
This commonly occurs for a dynamic batch size or BERT sequence length.
Splits the dynamic shape into several submodules with static input parameters to handle all of the cases in the dynamic_dimension range.
Essentially does what I manually did for the select_module verify tests.
Adds a compile option, split_single_dyn_dim, that toggles the pass on/off. Defaults to false.
Updates verify_program.hpp and run_verify.cpp to allow the tests to change the compile_options.
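
The splitting step can be pictured with a short sketch. This is not the pass itself: `enumerate_static_shapes` is a hypothetical helper, and representing a dynamic dimension as a `(min, max)` tuple is an assumption made for illustration.

```python
def enumerate_static_shapes(dims):
    """Expand a shape with exactly one non-fixed dynamic dimension into
    one static shape per value in that dimension's range.

    dims holds ints (fixed) or (min, max) tuples (dynamic).
    Returns None when the shape does not have exactly one non-fixed dimension.
    """
    dynamic = [i for i, d in enumerate(dims)
               if isinstance(d, tuple) and d[0] != d[1]]
    if len(dynamic) != 1:
        return None
    idx = dynamic[0]
    lo, hi = dims[idx]
    shapes = []
    for value in range(lo, hi + 1):
        static = [d[0] if isinstance(d, tuple) else d for d in dims]
        static[idx] = value
        shapes.append(tuple(static))
    return shapes

# A dynamic batch of 1..4 yields four static shapes, one per submodule.
print(enumerate_static_shapes([(1, 4), 3, 224, 224]))
```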

e9e3eacc

29 Mar, 2023 1 commit
- Fix bug when concatting with the vectorization axis (#1653) · b1506c73
  Paul Fultz II authored Mar 29, 2023
  
  b1506c73
21 Mar, 2023 1 commit

select_module refactor (#1615) · 94a7f6ee

Charlie Lin authored Mar 21, 2023

Refactor to have select_module use output parameters
Disable select_module verify tests on cpu

94a7f6ee

18 Mar, 2023 1 commit
- Dynamically plug-in backend target libs (#1608) · 7a7040aa
  Umang Yadav authored Mar 18, 2023
```
Fixes #1595
```
  7a7040aa
17 Mar, 2023 2 commits
- Remove test_gather_literal_inputs test (#1628) · 9ef6801e
  Paul Fultz II authored Mar 17, 2023
  
  9ef6801e
- Fold const on last instruction (#1626) · 450c5e84
  Paul Fultz II authored Mar 17, 2023
```
This is the original testcase that sparked the error with missing proper const
folding. Pushing changes up to this branch and closing out the PR #1622
```
  450c5e84
10 Mar, 2023 2 commits
- Fix make_inner_storage function (#1607) · 5e132673
  Paul Fultz II authored Mar 10, 2023
  
  5e132673
- Fix static_assert in large reduction (#1604) · 206b9a51
  Paul Fultz II authored Mar 09, 2023
  
  206b9a51
28 Feb, 2023 1 commit

Select module op (#1569) · a63ee2e0

Charlie Lin authored Feb 28, 2023

Creates the select_module operator, which selects one of the submodules passed to it to run based on the submodule parameters. The selected submodule is the one whose parameters have exactly the same static shapes as the arguments passed to select_module.
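
The selection rule can be sketched as a lookup keyed on argument shapes. Everything here is illustrative: the Python function `select_module`, the helper `shape_of`, and the dictionary of callables are hypothetical stand-ins, not the operator's actual interface.

```python
def shape_of(value):
    """Shape of a rectangular nested list, e.g. [[1, 2, 3]] -> (1, 3)."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)

def select_module(submodules, args):
    """Run the submodule whose parameter shapes exactly match the arguments.

    submodules maps a tuple of parameter shapes to a callable.
    """
    key = tuple(shape_of(a) for a in args)
    if key not in submodules:
        raise KeyError(f"no submodule with parameter shapes {key}")
    return submodules[key](*args)

submodules = {
    ((1, 3),): lambda x: "ran the batch-1 submodule",
    ((2, 3),): lambda x: "ran the batch-2 submodule",
}

print(select_module(submodules, [[[1, 2, 3], [4, 5, 6]]]))
```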

a63ee2e0

23 Feb, 2023 1 commit
- Modify layernorm to allow higher overflow limit for lower precision (#1534) · 3c67e66f
  shivadbhavsar authored Feb 22, 2023
  
  3c67e66f
16 Feb, 2023 1 commit

Copy into registers first when doing reductions with layernorm and softmax (#1489) · ac531d99

Paul Fultz II authored Feb 16, 2023

Avoids double global loads. Strided loops are unrolled, which lets us store results in an array that the compiler will keep in registers since the index access is constant. Updated to handle large reductions as well, which yields a better stable diffusion result.

ac531d99

17 Jan, 2023 1 commit
- Use float accumulator when reduction size is too large for half (#1515) · 3af50e07
  Paul Fultz II authored Jan 17, 2023
  
  3af50e07
13 Jan, 2023 1 commit
- Transpose slice fix (#1499) · 2c8149f6
  shivadbhavsar authored Jan 13, 2023
```
This PR resolves the bug addressed in #1496. 
```
  2c8149f6
11 Jan, 2023 1 commit
- Use cosine to compute half sin (#1508) · 3fb5c0ef
  Paul Fultz II authored Jan 11, 2023
```
* Use cosine to compute half sin
```
  3fb5c0ef
09 Jan, 2023 1 commit

Add JIT Gather Operator (#1492) · 054364cd

Ted Themistokleous authored Jan 09, 2023

JIT implementation of the gather operator
Added a few more unit tests to this one as well since I saw some odd behavior during bring up.

054364cd

02 Nov, 2022 1 commit
- Concat pointwise fusions (#1388) · 2f48b11a
  Paul Fultz II authored Nov 02, 2022
  
  2f48b11a