Commits · e05c94b45cd7d7db80c340fc1690ab62e4b9fa7b · gaoqiong / MIGraphX

13 Sep, 2023 1 commit
- Disable unsafe buffer usage warning when it's available (#2168) · e05c94b4
  Paul Fultz II authored Sep 13, 2023
  
  e05c94b4
29 Jul, 2023 1 commit
- Updates to add_embed_library (#2009) · 6ca5abd9
  Paul Fultz II authored Jul 29, 2023
```
* Updates to add_embed_library

* Fix warnings for extern arrays
```
  6ca5abd9
21 Jul, 2023 1 commit

Make global workitems multiple of local workitems (#1976) · 3216fe52

Umang Yadav authored Jul 20, 2023

HIP requires the number of global work items to be a multiple of the number of local work items. If it is not, correct results are not guaranteed.
Fixes #1977
Fixes #1644
MIGraphX CI has moved to rocm-5.6 which doesn't require hipRTC workarounds
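
The rounding that this requirement implies can be sketched as follows. This is a minimal illustration, not the code from the commit: `round_up_global` and `block_count` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not MIGraphX's API): round the global work-item count
// up to the next multiple of the local (block) size.
inline std::size_t round_up_global(std::size_t global, std::size_t local)
{
    assert(local > 0);
    return ((global + local - 1) / local) * local;
}

// Hypothetical helper: number of blocks launched for the rounded-up global size.
inline std::size_t block_count(std::size_t global, std::size_t local)
{
    return round_up_global(global, local) / local;
}
```

With 1000 elements and a local size of 256, the rounded global size is 1024, so the last 24 work items have no element to process. A kernel launched this way would typically need a bounds check on the element index.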

3216fe52

08 Jun, 2023 1 commit
- Add initial CK integration plus auto-tuning for kernels (#1791) · 25af8710
  Paul Fultz II authored Jun 08, 2023
```
Enable with MIGRAPHX_ENABLE_CK=1 and the --exhaustive-tune flag
```
  25af8710
17 May, 2023 1 commit
- adjust docker files to support new rocm 5.5 (#1729) · 5e35957b
  Chris Austen authored May 17, 2023
```
Move CI to support the rocm5.5 release
```
  5e35957b
31 Jan, 2023 1 commit

hipRTC fixes (#1531) · 91cc7242

Umang Yadav authored Jan 31, 2023

Added a CMake flag for hipRTC: MIGRAPHX_USE_HIPRTC.
Added stages in Jenkins for hipRTC.
Fixes for some of the pending issues from hipRTC.

91cc7242

28 Oct, 2022 1 commit

Use minimum block size of 64 threads (#1427) · 25a0e433

Umang Yadav authored Oct 28, 2022

Local thread counts that are multiples of 32 were introduced in #1348,
but local thread counts that are not a multiple of 64 cause correctness issues.
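
One way to pick a block size under this constraint is sketched below. `compute_block_size` and the 1024-thread upper bound are assumptions made for illustration, not the function or limit used in the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: choose a block size that is a multiple of 64,
// at least 64, and at most max_block_size (1024 is an assumed limit).
inline std::size_t compute_block_size(std::size_t n, std::size_t max_block_size = 1024)
{
    const std::size_t min_block_size = 64;
    assert(max_block_size >= min_block_size && max_block_size % min_block_size == 0);
    // Round n up to the next multiple of 64, then clamp to [64, max_block_size].
    const std::size_t rounded = ((n + min_block_size - 1) / min_block_size) * min_block_size;
    return std::min(std::max(rounded, min_block_size), max_block_size);
}
```

For example, 100 elements give a block of 128 threads, while a single element still gets the minimum of 64.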

25a0e433

19 Sep, 2022 1 commit

Improve layernorm and reductions performance (#1348) · 97a1ed2d

Paul Fultz II authored Sep 19, 2022

Compute mean and variance in same reduction
Set block size to numbers divisible by 32 instead of powers of 2
Global size is also set exactly instead of being divisible by the block size
More exact matching of global/local can help get rid of branching/loops
Reduce vectors first before doing dpp_reduce
Explicitly vectorize array operators since the compiler doesn't always vectorize them
Still uses the old for loop when computing at compile time, since neither reinterpret_cast nor all of the vector types are supported there
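
The commit message does not say how the mean and variance are combined into one reduction. A common approach, assumed here, is to accumulate the sum and the sum of squares in the same pass and use Var(x) = E[x^2] - E[x]^2. The sketch below is a scalar CPU version with a hypothetical name, `mean_variance`, and is not the GPU kernel itself.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns {mean, population variance} from a single pass
// that accumulates both the sum and the sum of squares.
inline std::pair<double, double> mean_variance(const std::vector<double>& x)
{
    assert(!x.empty());
    double sum    = 0.0;
    double sum_sq = 0.0;
    for(double v : x)
    {
        sum += v;
        sum_sq += v * v;
    }
    const double n    = static_cast<double>(x.size());
    const double mean = sum / n;
    // Var(x) = E[x^2] - E[x]^2
    return std::make_pair(mean, sum_sq / n - mean * mean);
}
```

This formula can lose precision when the mean is large relative to the spread of the data, so a production kernel may use a different formulation.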

97a1ed2d

11 Jul, 2022 1 commit
- Add __restrict__ to jit kernel params (#1300) · 2781ccd8
  turneram authored Jul 11, 2022
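
  As a rough illustration of what the qualifier expresses, the sketch below marks pointer parameters with `__restrict__`, a GCC/Clang extension that promises the pointers do not alias. `add_arrays` is a hypothetical host-side function, not one of the JIT kernels.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example: __restrict__ tells the compiler that a, b and out
// never overlap, which can allow more aggressive load/store scheduling.
inline void add_arrays(const float* __restrict__ a,
                       const float* __restrict__ b,
                       float* __restrict__ out,
                       std::size_t n)
{
    assert(a != nullptr && b != nullptr && out != nullptr);
    for(std::size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

  The caller is responsible for keeping the promise: passing overlapping buffers to such a function is undefined behavior.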
  
  2781ccd8
22 Jun, 2022 1 commit
- Update license files (#1248) · e44cecbc
  Ted Themistokleous authored Jun 22, 2022
```
Updated each source file in the repo with the existing license.
```
  e44cecbc
10 Jun, 2022 1 commit

Add vectorized reduce (#1202) · aa7ff911

Paul Fultz II authored Jun 09, 2022



Consolidate the vectorize and preload
Add vectorization to reduction
Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>

aa7ff911

29 Mar, 2022 1 commit

Refactor runtime compiled kernels to use the same compile_ops pipeline (#1125) · 661046c6

Paul Fultz II authored Mar 29, 2022

This adds the infrastructure so we can compile everything in parallel, whereas before only pointwise kernels were compiled in parallel. This will also directly integrate with lowering and the gpu-driver. The kernels for pointwise and roialign use this infrastructure. The scatternd kernel does not, since it requires a standard shape.

This also makes it easier to add new runtime compiled kernels in the future.

661046c6

28 Jan, 2022 1 commit

Add auto-vectorization of pointwise operators (#1047) · 78a3c9b7

Paul Fultz II authored Jan 28, 2022

* Enable auto vectorization
* Handle vector types with convert function
* Don't vectorize when it will cause problems with preload
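
A simplified view of choosing a vector width is sketched below. It assumes the innermost axis is contiguous and only checks divisibility. `pick_vector_size` is a hypothetical name, and the real pass also has to consider strides, data types and the preload interaction mentioned above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: largest power-of-two vector width (up to max_size)
// that evenly divides the length of the innermost, contiguous axis.
inline std::size_t pick_vector_size(std::size_t len, std::size_t max_size = 4)
{
    assert(max_size > 0);
    for(std::size_t n = max_size; n > 1; n /= 2)
    {
        if(len % n == 0)
            return n;
    }
    return 1; // fall back to scalar access
}
```

For instance, a length of 8 allows a width of 4, a length of 6 allows 2, and a length of 7 falls back to scalar access.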

78a3c9b7

07 Dec, 2021 1 commit
- Rename reduce_inputs to virtual_inputs (#1021) · 1793cc54
  Paul Fultz II authored Dec 07, 2021
```
simple variable rename
```
  1793cc54
19 Aug, 2021 1 commit
- Enable warnings when jit compiling (#913) · ccff6beb
  Paul Fultz II authored Aug 19, 2021
```
* Enable warnings when jit compiling

* Formatting
```
  ccff6beb
10 Aug, 2021 1 commit

Add option to compile with hiprtc (#892) · 91c9ebbc

Paul Fultz II authored Aug 10, 2021

* Add hiprtc compile option
* Add cross compile test
* Update error reporting
* Add tests for errors and warnings
* Fix tidy warning
* Add comment to ifdefs
* Skip null character at end of log
* Assert there is null at the end
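
The last two items can be pictured with the sketch below, which assumes the compiler log arrives in a buffer whose reported size includes the terminating null character. `trim_log` is a hypothetical name, not the function in the commit.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: check that the raw log ends with a null character,
// then drop it so the null is not printed as part of the message.
inline std::string trim_log(std::string log)
{
    assert(!log.empty() && log.back() == '\0');
    log.pop_back();
    return log;
}
```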

91c9ebbc

05 Aug, 2021 1 commit

Add gpu driver and improvements to pointwise codegen (#851) · 29fa2666

Paul Fultz II authored Aug 05, 2021



* Add method to compile pointwise

* Formatting

* Add lambda

* Add semicolon

* Rename variable

* Add driver to run jit kernels

* Formatting

* Add context

* Formatting

* Make separate driver folder

* Add more general gpu driver

* Formatting

* Print out wall time

* Formatting

* Run multiple times and skip first run

* Formatting

* Separate time_op

* Run an op for comparison

* Formatting

* Add debug asserts

* Formatting

* Change parameter name

* Formatting

* Fix argument order

* Formatting

* Add preloading

* Formatting

* Allow a different data type

* Formatting

* Pipeline transformations

* Formatting

* Add vectorization

* Formatting

* Reduce dims

* Formatting

* Compile with launch params as constant

* Formatting

* Make sure buffer can be vectorized

* Formatting

* Enable vectorization and preloading

* Formatting

* Add print header

* Formatting

* Avoid allocating too much LDS

* Formatting

* Add some vec functions to a separate header

* Formatting

* Add stride loops

* Formatting

* Improve the transform pipeline

* Formatting

* Add const

* Fix shape check

* Formatting

* Just check stride axis is zero

* Remove extra finc_vector_axis overload

* Simplify some more functions

* Formatting

* Remove some more extra functions

* Formatting

* Simplify more decltypes

* Add another const

* Fix test

* Get buffer pointer different for older compilers
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: Chris Austen <causten@users.noreply.github.com>
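
The "Run multiple times and skip first run" step above follows a common benchmarking pattern, sketched here on the host. `average_time_ms` is a hypothetical name and is not the driver's `time_op`.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper: call f once as an untimed warm-up, then return the
// average wall time in milliseconds over `runs` timed calls.
template <class F>
double average_time_ms(F f, std::size_t runs)
{
    assert(runs > 0);
    f(); // warm-up run, excluded from the measurement
    const auto start = std::chrono::steady_clock::now();
    for(std::size_t i = 0; i < runs; ++i)
        f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count() /
           static_cast<double>(runs);
}
```

The first run is skipped because it often includes one-time costs such as loading the code object. When timing GPU work this way, the callable would also need to wait for the device to finish before returning, otherwise only the launch overhead is measured.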

29fa2666

14 Jul, 2021 1 commit

Use the same device name function in the unit tests (#881) · 0b04fc80

Paul Fultz II authored Jul 14, 2021



* Unify device_name function

* Formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

0b04fc80

27 Apr, 2021 1 commit

Add tuple type to shape (#800) · 66aa4cc8

Paul Fultz II authored Apr 27, 2021



* Add definitions for all pointwise operators

* Formatting

* Add cpp generator class

* Formatting

* Move compilation to core

* Formatting

* Add clock to tmp name

* Add dynamic loader

* Formatting

* Add tests for code gen

* Formatting

* Add test for literals

* Formatting

* Use with_char

* Add missing header

* Fix mismerge

* Ignore tidy warning

* Fix gcc 5 errors

* Apply fixits

* Skip signed bitwise of status

* Remove unused parameters

* Explicitly add c++14 flag

* Fix tidy warning

* Add tuple type to shape class

* Formatting

* Make data member private

* Formatting

* Add sub arguments

* Formatting

* Turn clang format off

* Disable clang-format

* Improve visiting tuples

* Formatting

* Add more argument tests

* Formatting

* Handle tuple in load

* Formatting

* Remove .o files

* Add tuple type to api

* Formatting

* Fix tidy warnings

* Fix tidy warnings

* Add a test for share method

* Formatting

* Add a test cpp_type

* Suppress tidy warning
Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>

66aa4cc8

26 Mar, 2021 1 commit

Add initial code generation (#762) · 581d31b0

Paul Fultz II authored Mar 26, 2021



* Add code object op

* Formatting

* Add more value tests

* Formatting

* Fix from_value conversion from binary

* Formatting

* Don't use offload copy

* Remove iostream header

* Fix compilation errors

* Formatting

* Rename var

* Add missing files

* Formatting

* Remove duplicate variable

* Remove comment

* Template the function so sfinae will work

* Formatting

* Use template specialization since ADL is broken on hcc

* Formatting

* Annotate the constructor with HD for hcc

* Make variable const
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

581d31b0