- 11 Oct, 2023 1 commit
-
-
Alan Turner authored
-
- 06 Oct, 2023 1 commit
-
-
Artur Wojcik authored
-
- 04 Oct, 2023 2 commits
-
-
Alan Turner authored
-
Alan Turner authored
-
- 03 Oct, 2023 2 commits
-
-
Alan Turner authored
-
Alan Turner authored
-
- 27 Sep, 2023 1 commit
-
-
Alan Turner authored
-
- 21 Sep, 2023 3 commits
-
-
Alan Turner authored
-
Alan Turner authored
-
Chris Austen authored
* Rectify flipped coordinate_transformation_mode logic in ROIAlign
* Handle both opset 10 and 16 versions
* Fix version check and clang-tidy warning

Co-authored-by: Dino Musić <dino.music@htecgroup.com>
-
- 16 Sep, 2023 1 commit
-
-
Paul Fultz II authored
-
- 30 Aug, 2023 2 commits
-
-
Alan Turner authored
-
Alan Turner authored
-
- 10 Aug, 2023 1 commit
-
-
Krzysztof Drewniak authored
This PR contains the MIGraphX-side changes needed to keep the build working in the presence of ROCmSoftwarePlatform/rocMLIR#1136, and updates what data is sent to MLIR during the kernel generation and tuning process.
-
- 08 Aug, 2023 1 commit
-
-
Paul Fultz II authored
-
- 30 Jul, 2023 1 commit
-
-
Paul Fultz II authored
* Add initial tuning support
* Format
* Add extra param
* Format
* Use exhaustive flag
* Format
* Set expected shapes
* Format
* Format
* Fix missing symbol
* Format
* Add missing license header
* Format
* Update src/targets/gpu/include/migraphx/gpu/mlir.hpp
-
- 28 Jul, 2023 1 commit
-
-
Paul Fultz II authored
* Improve performance of pointwise/reduction kernels when using NHWC layouts
* Format
* Add nhwc test
* Format
* Remove inline namespace
* Add reduce test
-
- 06 Jul, 2023 1 commit
-
-
Paul Fultz II authored
This will also annotate the function with the block size so the compiler can do a better job of optimizing.
-
- 02 Jul, 2023 1 commit
-
-
Paul Fultz II authored
* Add a CI job to test CK
* Add MIGRAPHX_TUNE_CK env variable to only do tuning for CK
* Continue tuning even when there are invalid configs
* Fix a bug with parallel compilation not using all available threads
* Add additional test for gemms using half types
* Remove int32 as a supported type since it doesn't pass our test suite
-
- 17 Jun, 2023 1 commit
-
-
turneram authored
* Add initial ck_gemm code
* Format
* Add additional src files
* Format
* Add include
* Simplify fuse_ck
* Format
* Rename var
* Enable pass
* Update ck version
* Fix include
* Add group stride
* Disable warnings for ck headers
* Format
* Add unpack array
* Add interface to enable tuning
* Format
* Update compile_ops to handle tuning config
* Format
* Add some comments
* Move time_op to migraphx_gpu
* Add benchmarking
* Refactor
* Format
* Add lift class macro
* Use device name
* Format
* Generate configs
* Format
* Pass tuning parameter
* Move data type to is_ck_gemm matcher
* Format
* Add problem_cache to avoid retuning same configs
* Format
* Format
* Mark the problems
* Format
* Use is_null
* Format
* Resize vector
* Only tune with exhaustive tuning
* Format
* Use assert
* Format
* Tidy fixes
* More tidy fixes
* Format
* Add license to missing files
* Format
* Use transform
* Format
* Fix tidy
* Format
* Fix cppcheck issues
* Format
* Add static_assert
* Add ops header
* Add assertion in batcher
* Format
* Improve the batch fold check
* Format
* Add where op workaround for CK
* Skip if any input is not a supported ck type
* Format
* Check batch is standard
* Format
* Remove redundant static keyword
* Update commit hash
* Fix error when running without --exhaustive-tune
* Formatting
* Formatting
* Remove fuse_ck_gemm_softmax_gemm
* Update ck hash
* Correct spelling mistake
* Remove commented out logic from fuse_ck
* Remove unused include and add comment
* Formatting
* Remove redundant get_shape and remove ck_gemm from names
* Formatting
* Allow for mixed types with int8 gemms
* Formatting
* Add back find_package from merge
* Update CK commit hash and add gfx940 to fuse_ops supported archs
* Formatting
* Update CK hash
-
- 09 Jun, 2023 1 commit
-
-
Umang Yadav authored
#1791 added a hash function for the value class. It uses the visit function and has specializations for bool_type and the <vector> type, but was missing a specialization for nullptr. The missing nullptr case caused compilation issues on RHEL, SLES and CentOS.
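For illustration only, here is a minimal sketch of the general pattern described above: a visitor-based hash needs an overload for the null alternative as well as for the other types. It uses `std::variant` as a stand-in; `toy_value`, `toy_hasher`, `toy_hash` and `toy_hash_demo` are hypothetical names, not the MIGraphX value class or its visit function, and the hash mixing is an arbitrary choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <variant>
#include <vector>

// Hypothetical stand-in for a variant-like value class (not the MIGraphX type).
using toy_value = std::variant<std::nullptr_t, bool, std::int64_t, std::vector<std::int64_t>>;

// One overload per alternative. std::visit requires the visitor to be callable
// with every alternative, so the null case needs its own overload.
struct toy_hasher
{
    std::size_t operator()(std::nullptr_t) const { return 0; }
    std::size_t operator()(bool b) const { return std::hash<bool>{}(b); }
    std::size_t operator()(std::int64_t i) const { return std::hash<std::int64_t>{}(i); }
    std::size_t operator()(const std::vector<std::int64_t>& v) const
    {
        // Arbitrary mixing step (assumption): fold each element into the running hash.
        std::size_t h = v.size();
        for(auto x : v)
            h ^= std::hash<std::int64_t>{}(x) + 0x9e3779b9 + (h << 6) + (h >> 2);
        return h;
    }
};

// Dispatch to the overload that matches the alternative currently held.
inline std::size_t toy_hash(const toy_value& v) { return std::visit(toy_hasher{}, v); }

// Small usage check: a default-constructed toy_value holds the null alternative.
inline void toy_hash_demo() { assert(toy_hash(toy_value{}) == 0); }
```

Returning a fixed value for the null alternative is just one possible choice; the point of the sketch is only that the null case has to be handled explicitly.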
-
- 08 Jun, 2023 1 commit
-
-
Paul Fultz II authored
Enable with MIGRAPHX_ENABLE_CK=1 and the --exhaustive-tune flag.
-
- 24 May, 2023 1 commit
-
-
Paul Fultz II authored
Enable retrieving the code object to do tuning in the future.
-
- 06 Apr, 2023 1 commit
-
-
Paul Fultz II authored
Automatically fuse multiple reductions and pointwise operations.
-
- 29 Mar, 2023 1 commit
-
-
Paul Fultz II authored
-
- 27 Mar, 2023 1 commit
-
-
Manupa Karunaratne authored
* [MLIR] add dot offloads with manual tuning support
* This commit adds dot + pointwise fusion support along with manual tuning using rocMLIR.
-
- 10 Mar, 2023 1 commit
-
-
Paul Fultz II authored
-
- 16 Feb, 2023 1 commit
-
-
Paul Fultz II authored
Avoids double global loads. Strided loops are unrolled, which lets the results be stored in an array that the compiler can keep in registers, since the index accesses are constant. Also updated to handle large reductions, which gives a better Stable Diffusion result.
-
- 17 Jan, 2023 1 commit
-
-
Paul Fultz II authored
-
- 09 Jan, 2023 1 commit
-
-
Ted Themistokleous authored
JIT implementation of the gather operator. Added a few more unit tests as well, since I saw some odd behavior during bring-up.
-
- 06 Dec, 2022 1 commit
-
-
jungpark-mlir authored
* Update dialect registration interface
* Update 2nd build pipeline call and use full arch name
-
- 02 Nov, 2022 2 commits
-
-
Paul Fultz II authored
Can be enabled via environment variable MIGRAPHX_ENABLE_NHWC
-
Paul Fultz II authored
-
- 27 Oct, 2022 1 commit
-
-
kahmed10 authored
Updated GPU pad to use the JIT version. Added range functions for JIT kernels.
-
- 19 Oct, 2022 1 commit
-
-
Charlie Lin authored
Refactor dynamic compute
  - add a compute_output_shape object that implicitly converts to a new dyn_output or shape object
  - dyn_output object can handle computing the static output shape of an operator given the input arguments shapes
  - change an operator's compute function to argument compute(const dyn_output& dyn_out, std::vector<argument> args) to use dyn_output object

Dynamic ref unary functions
  - Included these changes to have an example of the refactored dynamic compute being used
  - Changes to unary base class to handle dynamic shapes
  - Changed elu and leaky_relu to use unary base class and pointwise JIT
-
- 18 Oct, 2022 1 commit
-
-
Paul Fultz II authored
* Enable non-standard shape
* Use perfdb for non xdlops
* Fix transpose+broadcast strides

Co-authored-by: jungpark-mlir <jungwook.park@amd.com>
-
- 04 Oct, 2022 1 commit
-
-
Paul Fultz II authored
optimize the softmax operator
-
- 29 Sep, 2022 1 commit
-
-
Umang Yadav authored
Improvements/Additions to be made: changes for the quant_convolution, changes for the deconvolution, Macros for MIOpen status checks
-
- 26 Sep, 2022 1 commit
-
-
Paul Fultz II authored
-
- 21 Sep, 2022 1 commit
-
-
kahmed10 authored
This PR allows for other values of epsilon to be matched when finding layernorm. Similarly, the calculation now uses the variable for epsilon.
-