- 22 Mar, 2023 1 commit
-
-
Paul authored
-
- 21 Mar, 2023 1 commit
-
-
Charlie Lin authored
Refactor to have select_module use output parameters. Disable select_module verify tests on cpu.
-
- 18 Mar, 2023 1 commit
-
-
Umang Yadav authored
Fixes #1595
-
- 13 Mar, 2023 1 commit
-
-
Manupa Karunaratne authored
* [MLIR] Adds a runtime switch to trigger MLIR
-
- 10 Mar, 2023 2 commits
-
-
Paul Fultz II authored
-
Paul Fultz II authored
-
- 01 Mar, 2023 1 commit
-
-
Charlie Lin authored
Add additional documentation to explain the passes.
-
- 28 Feb, 2023 1 commit
-
-
Charlie Lin authored
Creates the select_module operator, which selects one of the submodules passed to it to run based on the submodule parameters. A submodule is selected when its parameters have exactly the same static shapes as the arguments to select_module.
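The selection rule described in this commit message can be sketched in a few lines. This is only an illustration of exact static-shape matching, not the MIGraphX implementation: the function signature, the `Shape` alias, and the `batch_1`/`batch_4` submodule names are all hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical stand-in for a static shape: a tuple of dimension sizes.
Shape = Tuple[int, ...]


def select_module(arg_shapes: Sequence[Shape],
                  submodules: Dict[str, Sequence[Shape]]) -> Optional[str]:
    """Return the name of the first submodule whose parameter shapes
    are exactly equal to the argument shapes, or None if none match."""
    wanted = tuple(tuple(s) for s in arg_shapes)
    for name, param_shapes in submodules.items():
        if tuple(tuple(s) for s in param_shapes) == wanted:
            return name
    return None


# Hypothetical submodules, each compiled for one fixed batch size.
submodules = {
    "batch_1": [(1, 3, 224, 224)],
    "batch_4": [(4, 3, 224, 224)],
}

chosen = select_module([(4, 3, 224, 224)], submodules)
print(chosen)
```

Matching is exact here: an argument shape that no submodule was built for returns `None` rather than falling back to the nearest shape.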
-
- 23 Feb, 2023 1 commit
-
-
shivadbhavsar authored
-
- 16 Feb, 2023 3 commits
-
-
Paul Fultz II authored
Avoids double global loads. Strided loops are unrolled, which lets results be stored in an array that the compiler will keep in registers, since the index access is constant. Updated to handle large reductions, which gives a better result for stable diffusion.
-
Umang Yadav authored
* deprecate HCC
-
Umang Yadav authored
* Add driver flag "--exhaustive-tune" to enable tuning, and add support for the same in the C/C++ and Python APIs
-
- 14 Feb, 2023 1 commit
-
-
shivadbhavsar authored
Currently, we default to device 0 when loading programs. Updating this to use hipGetDevice to set the device for the loaded program.
-
- 10 Feb, 2023 1 commit
-
-
Umang Yadav authored
-
- 06 Feb, 2023 1 commit
-
-
Paul Fultz II authored
* Fuse layernorm with different patterns * Only match when using the last axis Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>
-
- 31 Jan, 2023 2 commits
-
-
Umang Yadav authored
Added CMake flag MIGRAPHX_USE_HIPRTC for hipRTC. Added stages in Jenkins for hipRTC. Fixes for some of the pending issues from hipRTC.
-
Paul Fultz II authored
* Add general optimize pass * Fuse gemm multiplies by scalar * Handle zero epsilon
-
- 19 Jan, 2023 1 commit
-
-
Paul Fultz II authored
This prevents multiple adds.
-
- 17 Jan, 2023 1 commit
-
-
Paul Fultz II authored
-
- 11 Jan, 2023 1 commit
-
-
Paul Fultz II authored
* Use cosine to compute half sin
-
- 09 Jan, 2023 1 commit
-
-
Ted Themistokleous authored
JIT implementation of the gather operator. Added a few more unit tests to this one as well, since I saw some odd behavior during bring-up.
-
- 11 Dec, 2022 1 commit
-
-
Umang Yadav authored
HIP changed in previous ROCm releases to use --offload-arch instead of --cuda-gpu-arch. This should be backwards compatible. hipRTC also supports --offload-arch.
-
- 07 Dec, 2022 2 commits
-
-
Paul Fultz II authored
* Add implicit_conversion
-
Paul authored
-
- 06 Dec, 2022 5 commits
-
-
Paul authored
-
Paul authored
-
Ted Themistokleous authored
Need this for when we debug and use MIGRAPHX_TRACE_EVAL() to show tuples. Without this we break when reading our buffer due to the use of visit(). This came up as part of #1283 debugging.
-
Paul authored
-
jungpark-mlir authored
Update dialect registration interface. Update 2nd build pipeline call and use full arch name.
-
- 05 Dec, 2022 3 commits
- 03 Dec, 2022 7 commits
- 01 Dec, 2022 1 commit
-
-
Paul authored
-