- 21 Sep, 2023 2 commits
-
-
Alan Turner authored
-
Alan Turner authored
-
- 30 Aug, 2023 2 commits
-
-
Alan Turner authored
-
Alan Turner authored
-
- 10 Aug, 2023 1 commit
-
-
Krzysztof Drewniak authored
This PR contains the MIGraphX-side changes needed to keep the build working with ROCmSoftwarePlatform/rocMLIR#1136, and updates the data sent to MLIR during kernel generation and tuning.
-
- 08 Aug, 2023 1 commit
-
-
Paul Fultz II authored
-
- 30 Jul, 2023 1 commit
-
-
Paul Fultz II authored
* Add initial tuning support * Format * Add extra param * Format * Use exhaustive flag * Format * Set expected shapes * Format * Format * Fix missing symbol * Format * Add missing license header * Format * Update src/targets/gpu/include/migraphx/gpu/mlir.hpp
-
- 28 Jul, 2023 1 commit
-
-
Paul Fultz II authored
* Improve performance of pointwise/reduction kernels when using NHWC layouts * Format * Add nhwc test * Format * Remove inline namespace * Add reduce test
-
- 06 Jul, 2023 1 commit
-
-
Paul Fultz II authored
This will also annotate the function with the block size so the compiler can do a better job of optimizing.
-
- 02 Jul, 2023 1 commit
-
-
Paul Fultz II authored
Add a CI job to test CK. Add MIGRAPHX_TUNE_CK env variable to only do tuning for CK. Continue tuning even when there are invalid configs. Fix a bug with parallel compilation not using all available threads. Add additional test for gemms using half types. Remove int32 as a supported type since it doesn't pass our test suite.
-
- 17 Jun, 2023 1 commit
-
-
turneram authored
* Add initial ck_gemm code * Format * Add additional src files * Format * Add include * Simplify fuse_ck * Format * Rename var * Enable pass * Update ck version * Fix include * Add group stride * Disable warnings for ck headers * Format * Add unpack array * Add interface to enable tuning * Format * Update compile_ops to handle tuning config * Format * Add some comments * Move time_op to migraphx_gpu * Add benchmarking * Refactor * Format * Add lift class macro * Use device name * Format * Generate configs * Format * Pass tuning parameter * Move data type to is_ck_gemm matcher * Format * Add problem_cache to avoid retuning same configs * Format * Format * Mark the problems * Format * Use is_null * Format * Resize vector * Only tune with exhaustive tuning * Format * Use assert * Format * Tidy fixes * More tidy fixes * Format * Add license to missing files * Format * Use transform * Format * Fix tidy * Format * Fix cppcheck issues * Format * Add static_assert * Add ops header * Add assertion in batcher * Format * Improve the batch fold check * Format * Add where op workaround for CK * Skip if any input is not a supported ck type * Format * Check batch is standard * Format * Remove redundant static keyword * Update commit hash * Fix error when running without --exhaustive-tune * Formatting * Formatting * Remove fuse_ck_gemm_softmax_gemm * Update ck hash * Correct spelling mistake * Remove commented out logic from fuse_ck * Remove unused include and add comment * Formatting * Remove redundant get_shape and remove ck_gemm from names * Formatting * Allow for mixed types with int8 gemms * Formatting * Add back find_package from merge * Update CK commit hash and add gfx940 to fuse_ops supported archs * Formatting * Update CK hash
-
- 09 Jun, 2023 1 commit
-
-
Umang Yadav authored
#1791 added a hash function for the value class. It uses the Visit function and has specializations for the bool_type and <vector> types, but was missing a specialization for nullptr. The missing nullptr case caused compilation issues on RHEL, SLES and CentOS.
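The idea of a visitor-based hash with an explicit null case can be sketched roughly as follows. This is only an illustration under stated assumptions: `toy_value`, `toy_hasher`, `toy_hash` and `hash_combine` are hypothetical names built on `std::variant`, not the MIGraphX value class or its Visit function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for a value class; this is NOT migraphx::value.
using toy_value = std::variant<std::nullptr_t,
                               bool,
                               std::int64_t,
                               std::string,
                               std::vector<std::int64_t>>;

// Mix one hash into a running seed (boost-style combiner).
inline std::size_t hash_combine(std::size_t seed, std::size_t h)
{
    return seed ^ (h + 0x9e3779b9u + (seed << 6u) + (seed >> 2u));
}

struct toy_hasher
{
    // Explicit null case, so hashing never depends on std::hash<std::nullptr_t>.
    std::size_t operator()(std::nullptr_t) const { return 0; }

    // Explicit bool case.
    std::size_t operator()(bool b) const { return b ? 1 : 0; }

    // Explicit vector case: fold the element hashes together.
    std::size_t operator()(const std::vector<std::int64_t>& v) const
    {
        std::size_t seed = v.size();
        for(auto x : v)
            seed = hash_combine(seed, std::hash<std::int64_t>{}(x));
        return seed;
    }

    // Everything else falls back to std::hash.
    template <class T>
    std::size_t operator()(const T& x) const
    {
        return std::hash<T>{}(x);
    }
};

// Hash the active alternative and mix in its index so that
// different alternatives with the same payload hash differ.
inline std::size_t toy_hash(const toy_value& v)
{
    assert(!v.valueless_by_exception());
    return hash_combine(v.index(), std::visit(toy_hasher{}, v));
}
```

The non-template overloads are preferred over the generic template during overload resolution, which is why the null, bool and vector cases never reach the `std::hash` fallback.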
-
- 08 Jun, 2023 1 commit
-
-
Paul Fultz II authored
Enable with MIGRAPHX_ENABLE_CK=1 and the --exhaustive-tune flag
-
- 24 May, 2023 1 commit
-
-
Paul Fultz II authored
Enable retrieving the code object to do tuning in the future.
-
- 06 Apr, 2023 1 commit
-
-
Paul Fultz II authored
Automatically fuse multiple reductions and pointwise operations.
-
- 29 Mar, 2023 1 commit
-
-
Paul Fultz II authored
-
- 27 Mar, 2023 1 commit
-
-
Manupa Karunaratne authored
* [MLIR] add dot offloads with manual tuning support * This commit adds dot + pointwise fusion support along with manual tuning using rocMLIR.
-
- 10 Mar, 2023 1 commit
-
-
Paul Fultz II authored
-
- 16 Feb, 2023 1 commit
-
-
Paul Fultz II authored
Avoids double global loads. Strided loops are unrolled, which lets results be stored in an array that the compiler can keep in registers, since the index access is constant. Updated to handle large reductions, which gives a better Stable Diffusion result.
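A rough CPU-side sketch of the technique described here, assuming a two-pass reduction (mean, then sum of squared deviations) as the example. The names `unroll` and `strided_sum_sq_dev` are hypothetical and this is not the MIGraphX kernel code: each "thread" loads its strided elements once into a fixed-size array through a compile-time unrolled loop, then both passes read from that array instead of loading from memory a second time.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Call f(integral_constant<0>), f(integral_constant<1>), ... at compile time,
// so every index is a constant expression.
template <class F, std::size_t... Is>
constexpr void unroll_impl(F f, std::index_sequence<Is...>)
{
    (f(std::integral_constant<std::size_t, Is>{}), ...);
}

template <std::size_t N, class F>
constexpr void unroll(F f)
{
    unroll_impl(f, std::make_index_sequence<N>{});
}

// One simulated thread `tid` out of `nthreads` handles PerThread strided elements.
template <std::size_t PerThread>
float strided_sum_sq_dev(const float* data, std::size_t n, std::size_t tid, std::size_t nthreads)
{
    static_assert(PerThread > 0, "need at least one element per thread");
    assert(tid + (PerThread - 1) * nthreads < n);

    // Single load pass: constant indices let a compiler keep `cache` in registers.
    std::array<float, PerThread> cache{};
    unroll<PerThread>([&](auto i) { cache[i.value] = data[tid + i.value * nthreads]; });

    // First pass over the cached values: mean.
    float sum = 0;
    unroll<PerThread>([&](auto i) { sum += cache[i.value]; });
    const float mean = sum / static_cast<float>(PerThread);

    // Second pass reuses the cache, so there is no second load from `data`.
    float acc = 0;
    unroll<PerThread>([&](auto i) {
        const float d = cache[i.value] - mean;
        acc += d * d;
    });
    return acc;
}
```

For example, with `data = {1, 2, 3, 4, 5, 6, 7, 8}`, `strided_sum_sq_dev<4>(data, 8, 0, 2)` reads elements 1, 3, 5 and 7 once and returns 20.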
-
- 17 Jan, 2023 1 commit
-
-
Paul Fultz II authored
-
- 09 Jan, 2023 1 commit
-
-
Ted Themistokleous authored
JIT implementation of the gather operator. Added a few more unit tests to this one as well since I saw some odd behavior during bring-up.
-
- 06 Dec, 2022 1 commit
-
-
jungpark-mlir authored
Update dialect registration interface. Update 2nd build pipeline call and use full arch name.
-
- 02 Nov, 2022 2 commits
-
-
Paul Fultz II authored
Can be enabled via environment variable MIGRAPHX_ENABLE_NHWC
-
Paul Fultz II authored
-
- 27 Oct, 2022 1 commit
-
-
kahmed10 authored
Updated GPU pad to now use the JIT version. Added range functions for JIT kernels.
-
- 19 Oct, 2022 1 commit
-
-
Charlie Lin authored
Refactor dynamic compute
- add a compute_output_shape object that implicitly converts to a new dyn_output or shape object
- dyn_output object can handle computing the static output shape of an operator given the input arguments' shapes
- change an operator's compute function to argument compute(const dyn_output& dyn_out, std::vector<argument> args) to use the dyn_output object
Dynamic ref unary functions
- Included these changes to have an example of the refactored dynamic compute being used
- Changes to unary base class to handle dynamic shapes
- Changed elu and leaky_relu to use unary base class and pointwise JIT
-
- 18 Oct, 2022 1 commit
-
-
Paul Fultz II authored
* Enable non-standard shape * Use perfdb for non xdlops * Fix transpose+broadcast strides Co-authored-by: jungpark-mlir <jungwook.park@amd.com>
-
- 04 Oct, 2022 1 commit
-
-
Paul Fultz II authored
Optimize the softmax operator
-
- 29 Sep, 2022 1 commit
-
-
Umang Yadav authored
Improvements/Additions to be made: changes for the quant_convolution, changes for the deconvolution, Macros for MIOpen status checks
-
- 26 Sep, 2022 1 commit
-
-
Paul Fultz II authored
-
- 21 Sep, 2022 1 commit
-
-
kahmed10 authored
This PR allows for other values of epsilon to be matched when finding layernorm. Similarly, the calculation now uses the variable for epsilon.
-
- 14 Sep, 2022 1 commit
-
-
Paul Fultz II authored
* Implement concat using jit compilation
-
- 08 Sep, 2022 1 commit
-
-
Paul Fultz II authored
* Remove unused headers
-
- 17 Aug, 2022 1 commit
-
-
Paul Fultz II authored
-
- 25 Jul, 2022 1 commit
-
-
Ted Themistokleous authored
* Add in changes for onnx Mod operator. Initial operator for mod implementation and test cases for integer and floating-point types. Need to use fmod from stdlib for floating-point types. half_float::half thankfully is specified to use the existing std::fmod() call, judging by the half.hpp implementation. fmod_flag should mirror the onnx fmod attribute. Right now, using a floating-point type without setting that flag to true on the user side will result in an exception. Ref ticket #1283
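The behavior described above can be sketched as a small standalone function. This is an assumption-laden illustration, not the MIGraphX operator: `onnx_style_mod` is a hypothetical name, the integer branch follows my reading of the ONNX Mod specification (with fmod unset the result takes the sign of the divisor, with fmod set it takes the sign of the dividend), and half types are not covered.

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>
#include <type_traits>

// Hypothetical helper mirroring the fmod attribute described in the commit message.
template <class T>
T onnx_style_mod(T a, T b, bool fmod_flag)
{
    if constexpr(std::is_floating_point_v<T>)
    {
        // As in the commit message: floating-point input without the flag is an error.
        if(!fmod_flag)
            throw std::invalid_argument("fmod flag must be set for floating-point types");
        return std::fmod(a, b); // sign follows the dividend
    }
    else
    {
        assert(b != 0);
        T r = a % b; // C++ % truncates toward zero: sign follows the dividend
        if(fmod_flag)
            return r;
        // Integer mod: shift the remainder so its sign follows the divisor.
        if(r != 0 && ((r < 0) != (b < 0)))
            r += b;
        return r;
    }
}
```

For example, `onnx_style_mod(-7, 3, false)` gives 2, while `onnx_style_mod(-7, 3, true)` gives -1.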
-
- 05 Jul, 2022 1 commit
-
-
Paul Fultz II authored
* Add softmax kernel
-
- 03 Jul, 2022 1 commit
-
-
Paul Fultz II authored
* Add mlir c api * Formatting * Create a type attribute * Formatting * Parse module * Formatting * Add mlir dump function * Add test case * Formatting * Fix tidy issues * Update mlir version * Update to newer mlir * Format * Move mlir to the gpu and update the test * Formatting * Fix bug when appending module * Format * Remove old cmake flag * Update message * Add return * Format * Add mlir_compile * Format * Register dialect * Handle unsigned integers * Don't provide output for return instruction * Format * Add code to insert memrefs * Format * Add mlir verification * Formatting * Enable pointwise_fusion * Disable eliminate_data_type * Set kernel name * Format * Fix device name * Formatting * Fix output arg * Format * Updates * Update hash * Add fuse_mlir pass * Format * Add fuse mlir * Format * Update mlir * Sort parameter names * Format * Reenable disabled passes * Remove old mlir conv * Remove asym default padding * Add more verbose tracing * Format * Fix compilation errors * Format * Whitelist operators * Format * Add namespace * Format * Update triple * Format * Use func dialect * Format * Use func.return * Format * Upgrade mlir version * Add comment * Handle symmetrical padding * Format * Cleanup debug output * Format * List failed tests * Move mlir compile to jit pipeline * Format * Update version * Add source locations * Format * Correctly add module * Format * Update failed tests * Fix failures when mlir is disabled * Format * Update mlir version * Check type for fp32 * Format * Remove failed test * Update mlir in driver * Tidy fixes * Format * Tidy fixes * Format * Fix const * Remove from requirements * Fix cmake version * Fix tidy warning * Use another ifdef * Fix tidy * Other tidy fix * Format * Update hash * Add missing license files * Format * Format * Fix function name
-
- 25 Jun, 2022 1 commit
-
-
Paul Fultz II authored
* Jit contiguous
-
- 22 Jun, 2022 1 commit
-
-
Ted Themistokleous authored
Updated each source file in the repo with the existing license.
-
- 10 Jun, 2022 1 commit
-
-
Paul Fultz II authored
Consolidate the vectorize and preload. Add vectorization to reduction. Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>
-