Commits · ab59c95ce617425fe1cde51180ce6dd5e6c876f2 · gaoqiong / MIGraphX

22 Jul, 2023 1 commit
- Add relu to no_bool_ops (#1983) · ab59c95c
  kahmed10 authored Jul 21, 2023
  
  ab59c95c
21 Jul, 2023 1 commit

Make global workitems multiple of local workitems (#1976) · 3216fe52

Umang Yadav authored Jul 20, 2023

HIP requires the number of global work items to be a multiple of the number of local work items. If it is not, correct results are not guaranteed.
Fixes #1977
Fixes #1644
MIGraphX CI has moved to rocm-5.6 which doesn't require hipRTC workarounds
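
The rounding described in this commit can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper name `round_up_global` is hypothetical, and it only shows the usual "round up to the next multiple" arithmetic.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not MIGraphX's actual implementation): round the
// requested global work-item count up to the next multiple of the local
// (workgroup) size so the launch geometry divides evenly.
std::size_t round_up_global(std::size_t global, std::size_t local)
{
    assert(local > 0); // a zero local size would divide by zero below
    return ((global + local - 1) / local) * local;
}
```

For example, `round_up_global(1000, 256)` gives 1024. A kernel launched with the padded size would normally need a bounds check so that the extra work items do nothing.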

3216fe52

19 Jul, 2023 1 commit
- Attach Preallocated buffers to MIOpen for the Find process (#1469) · 0f95c57d
  Umang Yadav authored Jul 19, 2023
  
  0f95c57d
18 Jul, 2023 1 commit
- return canonicalized standard form for GPU (#1947) · 2d5568d8
  Umang Yadav authored Jul 18, 2023
```
Fixes #1946
```
  2d5568d8
17 Jul, 2023 2 commits

add support for rocm 5.6 in CI (#1902) · 15b6e9a0

Chris Austen authored Jul 17, 2023

* add support for rocm 5.6 in CI
* Disable anonymous namespace check
* add default c'tors to avoid warnings

15b6e9a0

Enable threading in MLIR (#1899) · 5f5356cc

Krzysztof Drewniak authored Jul 17, 2023

This commit removes the build options to disable threading and removes the mutex in compile_mlir.
The commit being tested is a draft PR on rocMLIR that'll get merged if this passes

5f5356cc

13 Jul, 2023 2 commits

[NFC] Update MLIR usage to account for upstream merge (#1924) · c4765a6d
Krzysztof Drewniak authored Jul 13, 2023
```
Allows the rocMLIR CI (which builds rocMLIR tip against MIGraphX tip) to pass.
```
c4765a6d

Update deconvolution -> convolution_backwards and Dynamic Shape Support (#1801) · 4edf1195

Charlie Lin authored Jul 13, 2023

Renames deconvolution -> convolution_backwards to be more consistent with the literature
Note: this is not the cross-correlation operator (which is the adjoint of convolution). This is technically a standard convolution operator combined with an upsampling operator rather than a downsampling operator.
Adds unit tests for the padding, strides, dilations, and other op attributes.
Throws on auto_pad attribute since it has not been implemented
Previously it read the attribute and set it but then did nothing with it
Extended for dynamic shapes
Does not support using asymmetric padding (padding_L != padding_R) and output_shape with dynamic shapes.
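
To make the role of the padding, stride, and dilation attributes concrete, here is a small sketch of the output-length arithmetic for one spatial axis, following the formula given in the ONNX ConvTranspose specification. The function name and parameter names are hypothetical, and whether MIGraphX's convolution_backwards computes its shape in exactly this form is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: output length of a transposed convolution along one
// axis, per the ONNX ConvTranspose formula (output_padding taken as 0):
//   out = stride * (in - 1) + ((kernel - 1) * dilation + 1) - pad_l - pad_r
std::size_t conv_backwards_out_len(std::size_t in,
                                   std::size_t kernel,
                                   std::size_t stride,
                                   std::size_t dilation,
                                   std::size_t pad_l,
                                   std::size_t pad_r)
{
    assert(in > 0 and kernel > 0);
    std::size_t full = stride * (in - 1) + (kernel - 1) * dilation + 1;
    assert(full >= pad_l + pad_r); // padding may not exceed the unpadded length
    return full - pad_l - pad_r;
}
```

With `in = 4`, `kernel = 3`, `stride = 2`, `dilation = 1`, and symmetric padding of 1, this yields 7. Asymmetric padding simply means `pad_l != pad_r` in the call above.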

4edf1195

11 Jul, 2023 1 commit
- Remove use of int8x4 format for the rocblas (#1920) · e1039a1c
  Umang Yadav authored Jul 11, 2023
```
* do not use int8x4 format for the rocblas
```
  e1039a1c
08 Jul, 2023 2 commits

bump up CMake minimum version required to 3.15 (#1888) · 0e144f05
Artur Wojcik authored Jul 09, 2023

0e144f05

export API symbols from dynamic libraries (#1892) · c04fbc92

Artur Wojcik authored Jul 08, 2023

Export API symbols for migraphx, migraphx_ref, migraphx_cpu, migraphx_gpu, migraphx_device, migraphx_tf, and migraphx_onnx. There is a separate PR for migraphx_c.

API symbol exporting affects only Windows. It is transparent on Linux.

c04fbc92

06 Jul, 2023 1 commit

Use MIGRAPHX_GLOBAL (#1918) · c45b34c3

Paul Fultz II authored Jul 06, 2023

This will also annotate the function with the block size so the compiler can do a better job of optimizing.

c45b34c3

05 Jul, 2023 1 commit

Update passes to use offload_copy based on root module (#1875) · e7471141

Umang Yadav authored Jul 05, 2023

Needed to run multi-target programs where "main" isn't the only root module. There can be many root modules besides main.

e7471141

02 Jul, 2023 1 commit

Improvement to ck integration (#1859) · 3c9df3b4

Paul Fultz II authored Jul 02, 2023

Add a CI job to test CK
Add MIGRAPHX_TUNE_CK env variable to only do tuning for CK
Continue tuning even when there are invalid configs
Fix a bug with parallel compilation not using all available threads
Add additional test for gemms using half types
Removed int32 as a supported type since it doesn't pass our test suite

3c9df3b4

29 Jun, 2023 1 commit

Update list of pointwise operators supported by MLIR (#1848) · b3a610df

Krzysztof Drewniak authored Jun 29, 2023

Bump MLIR commit to include latest supported pointwise ops.
Expand the MLIR approve list 
Ensure that operations such as tanh() that don't have integer implementations (at least in MLIR) aren't used within MLIR modules.
Add additional tests.

b3a610df

28 Jun, 2023 2 commits

fix formatting (#1898) · aa55aac5
Umang Yadav authored Jun 28, 2023

aa55aac5

[mlir] Improve context handling to potentially solve threading bugs (#1867) · ac15425a

Krzysztof Drewniak authored Jun 28, 2023

Update `mlir_program` to only create one dialect registry, and to call
registerRocMLIRPasses() (which is needed and may not be thread-safe)
exactly once. 

In addition, use a single thread pool across all contexts. This is
recommended practice upstream for libraries that perform a lot of
compile jobs, and saves on the overhead of creating and destroying a
lot of threads.

ac15425a

22 Jun, 2023 1 commit
- [mlir] Adding mlir quant_dot operator support (#1816) · 01342ae1
  Zhuoran Yin authored Jun 22, 2023
```
Add mlir quant_dot operator support
```
  01342ae1
21 Jun, 2023 1 commit
- use fast_math flag instead of ENV flag for GELU (#1855) · 0802c19e
  Umang Yadav authored Jun 21, 2023
```
Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>
```
  0802c19e
17 Jun, 2023 1 commit

Update CK commit hash and add gfx940 to supported archs (#1842) · b8898d7e

turneram authored Jun 17, 2023

* Add initial ck_gemm code

* Format

* Add additional src files

* Format

* Add include

* Simplify fuse_ck

* Format

* Rename var

* Enable pass

* Update ck version

* Fix include

* Add group stride

* Disable warnings for ck headers

* Format

* Add unpack array

* Add interface to enable tuning

* Format

* Update compile_ops to handle tuning config

* Format

* Add some comments

* Move time_op to migraphx_gpu

* Add benchmarking

* Refactor

* Format

* Add lift class macro

* Use device name

* Format

* Generate configs

* Format

* Pass tuning parameter

* Move data type to is_ck_gemm matcher

* Format

* Add problem_cache to avoid retuning same configs

* Format

* Format

* Mark the problems

* Format

* Use is_null

* Format

* Resize vector

* Only tune with exhaustive tuning

* Format

* Use assert

* Format

* Tidy fixes

* More tidy fixes

* Format

* Add license to missing files

* Format

* Use transform

* Format

* Fix tidy

* Format

* Fix cppcheck issues

* Format

* Add static_assert

* Add ops header

* Add assertion in batcher

* Format

* Improve the batch fold check

* Format

* Add where op workaround for CK

* Skip if any input is not a supported ck type

* Format

* Check batch is standard

* Format

* Remove redundant static keyword

* Update commit hash

* Fix error when running without --exhaustive-tune

* Formatting

* Formatting

* Remove fuse_ck_gemm_softmax_gemm

* Update ck hash

* Correct spelling mistake

* Remove commented out logic from fuse_ck

* Remove unused include and add comment

* Formatting

* Remove redundant get_shape and remove ck_gemm from names

* Formatting

* Allow for mixed types with int8 gemms

* Formatting

* Add back find_package from merge

* Update CK commit hash and add gfx940 to fuse_ops supported archs

* Formatting

* Update CK hash

b8898d7e

15 Jun, 2023 1 commit
- use __hmax, __hmin (#1813) · d208adfc
  Umang Yadav authored Jun 15, 2023
  
  d208adfc
14 Jun, 2023 1 commit

Fix TRACE_EVAL > 1 (#1835) · 5bf067ed

Umang Yadav authored Jun 14, 2023



* add fix for the trace_eval

* Add throw for the debug builds

* Formatting

---------
Co-authored-by: Chris Austen <causten@users.noreply.github.com>

5bf067ed

09 Jun, 2023 2 commits

Enable hipRTC (#1827) · c900e382
Chris Austen authored Jun 09, 2023

c900e382

Add missing specialization for the `nullptr` for the hash function (#1824) · 26aabd2a

Umang Yadav authored Jun 09, 2023

#1791 added a hash function for the value class. It uses the Visit function and has specializations for the bool_type and <vector> types, but was missing a specialization for nullptr. The missing nullptr case caused compilation issues on RHEL, SLES and CentOS.
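
The kind of gap this commit closes can be illustrated with a toy visitor. The sketch below uses `std::variant` and `std::visit` as stand-ins. `toy_value`, `toy_hasher`, and `hash_value` are hypothetical names and do not reflect the real MIGraphX value class. The point is only that a visitor-based hash needs a case for the null alternative as well.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <variant>

// Hypothetical stand-in for a value type that can also hold "null".
using toy_value = std::variant<std::nullptr_t, bool, std::int64_t, std::string>;

struct toy_hasher
{
    // Explicit case for null: without it, std::visit fails to compile
    // because the visitor cannot be called for every alternative.
    std::size_t operator()(std::nullptr_t) const { return 0; }
    std::size_t operator()(bool b) const { return std::hash<bool>{}(b); }
    std::size_t operator()(std::int64_t i) const { return std::hash<std::int64_t>{}(i); }
    std::size_t operator()(const std::string& s) const
    {
        return std::hash<std::string>{}(s);
    }
};

// Dispatch on the active alternative and hash it.
std::size_t hash_value(const toy_value& v) { return std::visit(toy_hasher{}, v); }
```

Here every null value hashes to the same constant, which is one reasonable choice for a type with a single possible state.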

26aabd2a

08 Jun, 2023 2 commits
- Add initial CK integration plus auto-tuning for kernels (#1791) · 25af8710
  Paul Fultz II authored Jun 08, 2023
```
Enable with MIGRAPHX_ENABLE_CK=1 and the --exhaustive-tune flag
```
  25af8710
- disable hipRTC temporarily (#1817) · e5a33aad
  Chris Austen authored Jun 07, 2023
  
  e5a33aad
06 Jun, 2023 2 commits

re-enable hiprtc (#1812) · 85ff4f85
Umang Yadav authored Jun 06, 2023

85ff4f85

Conditionally enable GeLU approximation (#1810) · c5d0c5b6

Umang Yadav authored Jun 05, 2023

Sigmoid approximation for GeLU was introduced in #1299 for Fp16. The sigmoid approximation is known to get better perf but lower accuracy. https://arxiv.org/pdf/1606.08415.pdf
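
For reference, the two forms discussed in the linked paper can be compared with a short sketch. The exact form is x·Φ(x) and the sigmoid approximation is x·σ(1.702x), both taken from the paper. The function names are hypothetical, and this is plain scalar C++ in double precision rather than the fp16 GPU kernel code in MIGraphX.

```cpp
#include <cmath>

// Exact GeLU: x * Phi(x), with Phi the standard normal CDF.
double gelu_exact(double x)
{
    return 0.5 * x * (1.0 + std::erf(x / std::sqrt(2.0)));
}

// Sigmoid approximation from the GeLU paper: x * sigmoid(1.702 * x).
// Cheaper to evaluate (no erf), at the cost of some accuracy.
double gelu_sigmoid_approx(double x)
{
    return x / (1.0 + std::exp(-1.702 * x));
}
```

Both functions agree at `x = 0`, and the gap between them is small but nonzero elsewhere, which is the accuracy trade-off the commit message refers to.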

c5d0c5b6

31 May, 2023 1 commit
- Update pass manager to handle multi-target compilation (#1672) · 9473e3a2
  Umang Yadav authored May 31, 2023
```
partially solves #1656
This PR only handles compilation part of multitarget.
```
  9473e3a2
24 May, 2023 2 commits
- Change compiler_replace to a class that stores the code objects directly (#1739) · 37f5df20
  Paul Fultz II authored May 24, 2023
```
Enable retrieving the code object to do tuning in the future.
```
  37f5df20
- Update xdlops/rocblas fp32 arch (#1752) · 77042e30
  kahmed10 authored May 24, 2023
```
Refactor supported gfx archs
```
  77042e30
23 May, 2023 1 commit
- Backout fp16 max/min HIP API change (#1771) · 42772fd6
  Umang Yadav authored May 23, 2023
```
back out changes for rocm-5.5
```
  42772fd6
20 May, 2023 1 commit
- Use half HIP APIs to compute max and min (#1764) · 88fb551c
  Umang Yadav authored May 19, 2023
```
* use half hip functions to compute max and min
* add verify test for min and max
```
  88fb551c
19 May, 2023 1 commit
- Enabling native int32 type support (#1721) · 8d9d5d1c
  Zhuoran Yin authored May 19, 2023
```
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
```
  8d9d5d1c
17 May, 2023 1 commit
- adjust docker files to support new rocm 5.5 (#1729) · 5e35957b
  Chris Austen authored May 17, 2023
```
Move CI to support the rocm5.5 release
```
  5e35957b
08 May, 2023 1 commit
- Remove workaround for Sin (#1701) · 89f7ac0d
  Umang Yadav authored May 08, 2023
  
  89f7ac0d
05 May, 2023 1 commit
- [MLIR][5.7] add input fusion support for view ops (#1705) · 4996c6d7
  Manupa Karunaratne authored May 05, 2023
```
Adds support for fusing slice, transpose, contiguous and reshape ops on input tensors into a fused MLIR kernel.
```
  4996c6d7
04 May, 2023 1 commit

[mlir] Adding quant convolution fusion as anchor op (#1683) · 7f105952

Zhuoran Yin authored May 03, 2023

Exposed the mlir_enabled() call to decide whether the lowering pipeline is enabled
Disabled the rewrite quantization pipeline in mlir compilation
Added quant convolution as anchor ops
Fixed the return type expectations
Added the fallback HIP implementation for quantizelinear and dequantizelinear
Will need advice to improve the implementation for quantizelinear

7f105952

28 Apr, 2023 1 commit
- Removed split_single_dyn_dim compile flag (#1711) · bcc1f64a
  Charlie Lin authored Apr 28, 2023
  
  bcc1f64a
25 Apr, 2023 1 commit
- update rocBLAS version check to support 3.0 and above (#1716) · ed6542ee
  kahmed10 authored Apr 25, 2023
```
update rocBLAS version check to support 3.0 and above with simplified logic
```
  ed6542ee