Commits · 4926f035586e5c604da3a1fa6646a16bbbe06822 · gaoqiong / MIGraphX

06 Dec, 2023 2 commits
- Device kernels using FP8 (#2510) · 98ef0abb
  Umang Yadav authored Dec 05, 2023
  
- Add FP8 rocblas gemm support (#2473) · 6d0b6bcf
  Umang Yadav authored Dec 05, 2023
  
05 Dec, 2023 4 commits
- add quant_dot as unsupported fp8 op · 39bf5dc5
  Umang Yadav authored Dec 05, 2023
  
- add changes for the eliminate_data_type pass · cf91c2b1
  Umang Yadav authored Dec 05, 2023
  
- remove header · b099a7da
  Umang Yadav authored Dec 05, 2023
  
- use eliminate_data_type pass instead of eliminate_fp8 pass · 402c66ab
  Umang Yadav authored Dec 05, 2023
  
03 Dec, 2023 4 commits
- fix naming · 51ac4fdd
  Umang Yadav authored Dec 03, 2023
  
- use helper function to determine gfx940 · 994d24b6
  Umang Yadav authored Dec 03, 2023
  
- Disable FP8 tests for the non-gfx940 arches · fe585d42
  Umang Yadav authored Dec 03, 2023
  
- add pooling to unsupported ops · f18418be
  Umang Yadav authored Nov 28, 2023
  
01 Dec, 2023 1 commit
- FP8 GPU implementation (#2455) · eafd55de
  Umang Yadav authored Dec 01, 2023
  
26 Nov, 2023 2 commits
- Move pass before optimize module · 83ce487a
  Umang Yadav authored Nov 26, 2023
  
- add eliminate_fp8 pass · ad9c25ea
  Umang Yadav authored Nov 26, 2023
  
17 Nov, 2023 1 commit
- port gpu changes · a9dd42f7
  Umang Yadav authored Nov 17, 2023
  
30 Oct, 2023 1 commit
- Remove int8x4 format completely (#2373) · 22bb777f
  Umang Yadav authored Oct 30, 2023
  
24 Sep, 2023 1 commit

create simplify_dyn_ops pass and add variable slice optimization (#2161) · c54167c7

Charlie Lin authored Sep 24, 2023

New compiler pass that simplifies dynamic-shape-related operators to their static versions where possible.
Normally used after the split_single_dyn_dim pass.


28 Jul, 2023 1 commit
- Fix inverted logic (#2010) · eb82e8b5
  turneram authored Jul 27, 2023
  
25 Jul, 2023 1 commit
- temporarily disable CK on Windows (#1996) · 49280e51
  Artur Wojcik authored Jul 26, 2023
  
21 Jun, 2023 1 commit
- use fast_math flag instead of ENV flag for GELU (#1855) · 0802c19e
  Umang Yadav authored Jun 21, 2023
```
Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>
```
08 Jun, 2023 1 commit
- Add initial CK integration plus auto-tuning for kernels (#1791) · 25af8710
  Paul Fultz II authored Jun 08, 2023
```
Enable with MIGRAPHX_ENABLE_CK=1 and the --exhaustive-tune flag
```
06 Jun, 2023 1 commit

Conditionally enable GeLU approximation (#1810) · c5d0c5b6

Umang Yadav authored Jun 05, 2023

Sigmoid approximation for GeLU was introduced in #1299 for FP16. The sigmoid approximation is known to give better performance at the cost of lower accuracy. https://arxiv.org/pdf/1606.08415.pdf


19 May, 2023 1 commit
- Enabling native int32 type support (#1721) · 8d9d5d1c
  Zhuoran Yin authored May 19, 2023
```
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
```
04 May, 2023 1 commit

[mlir] Adding quant convolution fusion as anchor op (#1683) · 7f105952

Zhuoran Yin authored May 03, 2023

Exposed the mlir_enabled() call to decide whether the lowering pipeline is enabled
Disabled the rewrite quantization pipeline in mlir compilation
Added quant convolution as an anchor op
Fixed the return type expectations
Added the fallback hip implementation for quantizelinear and dequantizelinear
Will need advice to improve the implementation for quantizelinear


28 Apr, 2023 1 commit
- Removed split_single_dyn_dim compile flag (#1711) · bcc1f64a
  Charlie Lin authored Apr 28, 2023
  
21 Apr, 2023 1 commit
- disable fusion only but create pointwise modules (#1706) · 2a44dfe9
  Umang Yadav authored Apr 21, 2023
  
06 Apr, 2023 1 commit
- Add reduction fusion (#1614) · f201285c
  Paul Fultz II authored Apr 05, 2023
```
Automatically fuse multiple reductions and pointwise operations.
```
03 Apr, 2023 1 commit

promote_literals pass (#1593) · e3fb3a0d

Charlie Lin authored Apr 03, 2023

Adds the promote_literals compiler pass that moves literals from the submodules to the main module.
With the eliminate_common_subexpression pass, it will remove copies of literals created during split_single_dyn_dim.
Pass is enabled with the split_single_dyn_dim compile option.
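
The effect described above can be pictured with a small Python sketch. This is not MIGraphX code: the data layout (submodules as lists of literal tuples) and the function name `promote_literals` are hypothetical, and the sketch folds the promotion step and the common-subexpression step into one function for brevity.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a literal: an immutable tuple of values.
Literal = Tuple[float, ...]


def promote_literals(
    submodules: Dict[str, List[Literal]],
) -> Tuple[List[Literal], Dict[str, List[int]]]:
    """Move literals into a single main-module list and deduplicate them.

    Returns the main-module literal list and, for each submodule, the
    indices of the main-module literals it now refers to.
    """
    main: List[Literal] = []
    index_of: Dict[Literal, int] = {}
    refs: Dict[str, List[int]] = {}
    for name, literals in submodules.items():
        refs[name] = []
        for lit in literals:
            if lit not in index_of:
                # First time this value is seen: store it once in main.
                index_of[lit] = len(main)
                main.append(lit)
            # Identical copies in other submodules reuse the same entry.
            refs[name].append(index_of[lit])
    return main, refs


if __name__ == "__main__":
    # Three static submodules that each carry their own copy of a weight.
    subs = {
        "batch_1": [(0.5, 1.5)],
        "batch_2": [(0.5, 1.5)],
        "batch_3": [(0.5, 1.5), (2.0,)],
    }
    main, refs = promote_literals(subs)
    print(main)
    print(refs)
```

In this toy model, the three copies of the shared weight collapse to one entry in the main list, which is the kind of duplication the commit message says arises during split_single_dyn_dim.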


31 Mar, 2023 1 commit

Split single dynamic dimension compiler pass (#1580) · e9e3eacc

Charlie Lin authored Mar 30, 2023

Adds a new GPU compiler pass split_single_dyn_dim that handles the case where one input parameter has a single non-fixed dynamic_dimension.
This commonly occurs for dynamic batch or BERT sequence length.
Splits the dynamic shape into several submodules with static input parameters to handle all of the cases in the dynamic_dimension range.
Essentially does what I manually did for the select_module verify tests.
Adds a compile option split_single_dyn_dim that toggles the pass on/off. Defaults to false.
Updates verify_program.hpp and run_verify.cpp to allow the tests to change the compile_options.
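
As a rough illustration of the idea only, the Python sketch below builds one static variant per value in a dimension range and dispatches on the runtime size. It is not the MIGraphX implementation: `build_static_variants`, `make_static_kernel`, and `run_dynamic` are hypothetical names, and the "kernel" is a stand-in that sums each row.

```python
from typing import Callable, Dict, List

Rows = List[List[int]]
Kernel = Callable[[Rows], List[int]]


def make_static_kernel(batch: int) -> Kernel:
    """Stand-in for compiling a submodule with a fixed (static) batch size."""
    def run(rows: Rows) -> List[int]:
        if len(rows) != batch:
            raise ValueError(f"kernel compiled for batch {batch}, got {len(rows)}")
        return [sum(r) for r in rows]
    return run


def build_static_variants(
    min_dim: int, max_dim: int, compile_for: Callable[[int], Kernel]
) -> Dict[int, Kernel]:
    """Create one static variant for every size in [min_dim, max_dim]."""
    return {d: compile_for(d) for d in range(min_dim, max_dim + 1)}


def run_dynamic(variants: Dict[int, Kernel], rows: Rows) -> List[int]:
    """Select the variant matching the runtime size and run it."""
    batch = len(rows)
    if batch not in variants:
        raise ValueError(f"batch {batch} is outside the compiled range")
    return variants[batch](rows)


if __name__ == "__main__":
    variants = build_static_variants(1, 4, make_static_kernel)
    print(run_dynamic(variants, [[1, 2], [3, 4]]))
```

The trade-off this sketch makes visible is that every size in the range gets its own compiled variant, so compile time and memory grow with the width of the range.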


18 Mar, 2023 1 commit
- Dynamically plug-in backend target libs (#1608) · 7a7040aa
  Umang Yadav authored Mar 18, 2023
```
Fixes #1595
```
16 Feb, 2023 1 commit

Add flag for tuning in migraphx-driver (#1519) · cc098f4d

Umang Yadav authored Feb 15, 2023

* Add driver flag "--exhaustive-tune" to enable tuning, add support for the same in C/C++ and python API


31 Jan, 2023 1 commit
- Add a general optimize pass (#1491) · a4b82653
  Paul Fultz II authored Jan 30, 2023
```
* Add general optimize pass
* Fuse gemm multiplies by scalar
* Handle zero epsilon
```
29 Nov, 2022 1 commit

remove extra adjust allocation pass (#1477) · 5a2a83a4

kahmed10 authored Nov 30, 2022

Merging #1391 caused an extra adjust allocation pass for GPU targets. This removes that merge error.


02 Nov, 2022 1 commit
- Add nhwc layout to gpu backend (#1391) · 1820198e
  Paul Fultz II authored Nov 02, 2022
```
Can be enabled via environment variable MIGRAPHX_ENABLE_NHWC
```
26 Oct, 2022 1 commit
- rearrange default pass list; adjust_allocation must be run after rep… (#1418) · 7b9ce460
  Brian Pickrell authored Oct 26, 2022
```
Fixes an observed regression error on certain Frozen Protobuf models due to PR 1280
```
13 Oct, 2022 1 commit

Rewrite TF batch norm; remove batch_norm_inference (#1371) · be309bfb

Charlie Lin authored Oct 13, 2022

Rewrites the TF batch norm like operators to other MIGX operators
Removes the code related to batch_norm_inference


31 Aug, 2022 1 commit

Add pass to rewrite gelu as fast gelu (#1299) · 794a4335

turneram authored Aug 31, 2022

Rewrite_gelu pass replaces the gelu formula of x * (1/2) * (1 + erf(x/sqrt(2))) with the sigmoid approximation of x * Sigmoid(x * 1.702)
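
To make the two formulas concrete, here is a minimal standalone Python sketch that evaluates both and prints the absolute difference at a few points. The function names `gelu_exact` and `gelu_sigmoid` are illustrative only and are not MIGraphX identifiers.

```python
import math


def gelu_exact(x: float) -> float:
    """erf-based GELU: x * (1/2) * (1 + erf(x / sqrt(2)))."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_sigmoid(x: float) -> float:
    """Sigmoid approximation: x * sigmoid(1.702 * x)."""
    return x / (1.0 + math.exp(-1.702 * x))


if __name__ == "__main__":
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        exact = gelu_exact(x)
        approx = gelu_sigmoid(x)
        print(f"x={x:+.1f} exact={exact:+.5f} approx={approx:+.5f} "
              f"abs_err={abs(exact - approx):.5f}")
```

In double precision, the two curves differ by roughly 0.02 at most, with the largest gap near |x| of about 2. Reduced-precision behavior is not modeled here.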


27 Aug, 2022 1 commit

Improvements to handling and add constant passed to dot operator (#1280) · 8752875a

Paul Fultz II authored Aug 26, 2022

This rewrites dot operators like X(Y + b) to XY + Xb when b is constant, since the add can then be folded away.
It also improves handling of pointwise operators with broadcasts, which helps const propagation.
Improve gemm fusion with a mul_add
Improve support for broadcast shapes in gemm
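
The dot rewrite relies on matrix multiplication distributing over addition. The Python sketch below checks that identity on small integer matrices. The helper names are illustrative, and treating X as a value known ahead of time (so that Xb can be computed once) is an assumption of this sketch rather than something the commit message states.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain triple-loop matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def add(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def demo() -> Tuple[Matrix, Matrix]:
    x = [[1, 2], [3, 4]]   # assumed known ahead of time (e.g. a weight)
    y = [[5, 6], [7, 8]]   # runtime input
    b = [[1, 0], [0, 1]]   # constant added to y

    original = matmul(x, add(y, b))   # X(Y + b)
    xb = matmul(x, b)                 # can be precomputed once under the assumption above
    rewritten = add(matmul(x, y), xb) # XY + Xb
    return original, rewritten


if __name__ == "__main__":
    original, rewritten = demo()
    print(original == rewritten)
```

Integer inputs keep the comparison exact. With floating-point data the two forms can differ by rounding error.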


12 Jul, 2022 1 commit
- Use current device when constructing context (#1294) · 68189043
  Paul Fultz II authored Jul 11, 2022
  
03 Jul, 2022 1 commit

Add mlir fusion (#1251) · ca8a54fe

Paul Fultz II authored Jul 03, 2022

* Add mlir c api

* Formatting

* Create a type attribute

* Formatting

* Parse module

* Formatting

* Add mlir dump function

* Add test case

* Formatting

* Fix tidy issues

* Update mlir version

* Update to newer mlir

* Format

* Move mlir to the gpu and update the test

* Formatting

* Fix bug when appending module

* Format

* Remove old cmake flag

* Update message

* Add return

* Format

* Add mlir_compile

* Format

* Register dialect

* Handle unsigned integers

* Dont provide output for return instruction

* Format

* Add code to insert memrefs

* Format

* Add mlir verification

* Formatting

* Enable pointwise_fusion

* Disable eliminate_data_type

* Set kernel name

* Format

* Fix device name

* Formatting

* Fix output arg

* Format

* Updates

* Update hash

* Add fuse_mlir pass

* Format

* Add fuse mlir

* Format

* Update mlir

* Sort parameter names

* Format

* Reenable disabled passes

* Remove old mlir conv

* Remove asym default padding

* Add more verbose tracing

* Format

* Fix compilation errors

* Format

* Whitelist operators

* Format

* Add namespace

* Format

* Update triple

* Format

* Use func dialect

* Format

* Use func.return

* Format

* Upgrade mlir version

* Add comment

* Handle symmetrical padding

* Format

* Cleanup debug output

* Format

* List failed tests

* Move mlir compile to jit pipeline

* Format

* Update version

* Add source locations

* Format

* Correctly add module

* Format

* Update failed tests

* Fix failures when mlir is disabled

* Format

* Update mlir version

* Check type for fp32

* Format

* Remove failed test

* Update mlir in driver

* Tidy fixes

* Format

* Tidy fixes

* Format

* Fix const

* Remove from requirements

* Fix cmake version

* Fix tidy warning

* Use another ifdef

* Fix tidy

* Other tidy fix

* Format

* Update hash

* Add missing license files

* Format

* Format

* Fix function name


23 Jun, 2022 1 commit
- remove eliminate_workspace pass (#1254) · f5760e21
  kahmed10 authored Jun 23, 2022
```
* remove eliminate workspace
* remove sync device and other tags
```