- 22 Jul, 2023 1 commit
-
-
kahmed10 authored
-
- 21 Jul, 2023 1 commit
-
-
Umang Yadav authored
HIP requires the number of global work items to be a multiple of the number of local work items. If it is not, correct results are not guaranteed. Fixes #1977. Fixes #1644. MIGraphX CI has moved to rocm-5.6, which doesn't require hipRTC workarounds.
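The rounding this message describes can be sketched as follows. This is a minimal illustration, not the code from the commit: `round_up_global` is a hypothetical helper name, and the bounds guard mentioned in the comment is the usual companion to such rounding, not something the message states.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (name assumed for illustration): round the global
// work-item count up to the next multiple of the local (block) size.
// The kernel then launches a few extra work items, so it is assumed to
// guard its body with a check such as `if(i < n)`.
inline std::size_t round_up_global(std::size_t global, std::size_t local)
{
    assert(local > 0);
    return ((global + local - 1) / local) * local;
}
```

For example, 1000 elements with a local size of 256 would launch 1024 work items, with the last 24 doing nothing.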
-
- 19 Jul, 2023 1 commit
-
-
Umang Yadav authored
-
- 18 Jul, 2023 1 commit
-
-
Umang Yadav authored
Fixes #1946
-
- 17 Jul, 2023 2 commits
-
-
Chris Austen authored
* add support for rocm 5.6 in CI
* Disable anonymous namespace check
* add default c'tors to avoid warnings
-
Krzysztof Drewniak authored
This commit removes the build options to disable threading and removes the mutex in compile_mlir. The commit being tested is a draft PR on rocMLIR that will be merged if this passes.
-
- 13 Jul, 2023 2 commits
-
-
Krzysztof Drewniak authored
Allows the rocMLIR CI (which builds rocMLIR tip against MIGraphX tip) to pass.
-
Charlie Lin authored
Renames deconvolution -> convolution_backwards to be more consistent with the literature.
Note: this is not the cross-correlation operator (which is the adjoint of convolution). This is technically a standard convolution operator combined with an upsampling operator rather than a downsampling operator.
Adds unit tests for the padding, strides, dilations, and other op attributes.
Throws on the auto_pad attribute since it has not been implemented. Previously it read the attribute and set it but then did nothing with it.
Extended for dynamic shapes.
Does not support using asymmetric padding (padding_L != padding_R) and output_shape with dynamic shapes.
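The "convolution combined with upsampling" view can be sketched in one dimension. This is a toy reference, not the MIGraphX operator: `conv_backwards_1d` is a hypothetical name, it assumes a single channel and symmetric padding, and it uses the common output-length formula (n - 1) * stride + (k - 1) * dilation + 1 - 2 * padding.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy 1-D reference (hypothetical, single channel, symmetric padding).
// Each input sample is placed `stride` apart (zero-insertion upsampling)
// and spread through the kernel, which equals a full convolution of the
// upsampled signal with `w`. `padding` samples are cropped from each end.
inline std::vector<float> conv_backwards_1d(const std::vector<float>& x,
                                            const std::vector<float>& w,
                                            std::size_t stride,
                                            std::size_t padding,
                                            std::size_t dilation)
{
    assert(!x.empty() && !w.empty() && stride > 0 && dilation > 0);
    const std::size_t full = (x.size() - 1) * stride + (w.size() - 1) * dilation + 1;
    assert(full > 2 * padding);
    std::vector<float> out(full, 0.0f);
    for(std::size_t i = 0; i < x.size(); ++i)
        for(std::size_t k = 0; k < w.size(); ++k)
            out[i * stride + k * dilation] += x[i] * w[k];
    // Crop the symmetric padding from both ends.
    return std::vector<float>(out.begin() + padding, out.end() - padding);
}
```

With x = {1, 2}, w = {1, 1, 1}, and stride 2, the uncropped result has 5 samples, which is longer than the input, as expected for an upsampling operator.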
-
- 11 Jul, 2023 1 commit
-
-
Umang Yadav authored
* do not use int8x4 format for the rocblas
-
- 08 Jul, 2023 2 commits
-
-
Artur Wojcik authored
-
Artur Wojcik authored
Export API symbols for migraphx, migraphx_ref, migraphx_cpu, migraphx_gpu, migraphx_device, migraphx_tf, and migraphx_onnx. There is a separate PR for migraphx_c. API symbol exporting affects only Windows. It is transparent on Linux.
-
- 06 Jul, 2023 1 commit
-
-
Paul Fultz II authored
This will also annotate the function with the block size so the compiler can do a better job of optimizing.
-
- 05 Jul, 2023 1 commit
-
-
Umang Yadav authored
Needed to run multi-target programs where "main" isn't the only root module. There can be many root modules other than main.
-
- 02 Jul, 2023 1 commit
-
-
Paul Fultz II authored
Add a CI job to test CK.
Add MIGRAPHX_TUNE_CK env variable to only do tuning for CK.
Continue tuning even when there are invalid configs.
Fix a bug with parallel compilation not using all available threads.
Add additional test for gemms using half types.
Removed int32 as a supported type since it doesn't pass our test suite.
-
- 29 Jun, 2023 1 commit
-
-
Krzysztof Drewniak authored
Bump MLIR commit to include latest supported pointwise ops.
Expand the MLIR approve list.
Ensure that operations such as tanh() that don't have integer implementations (at least in MLIR) aren't used within MLIR modules.
Add additional tests.
-
- 28 Jun, 2023 2 commits
-
-
Umang Yadav authored
-
Krzysztof Drewniak authored
Update `mlir_program` to only create one dialect registry, and to call registerRocMLIRPasses() (which is needed and may not be thread-safe) exactly once. In addition, use a single thread pool across all contexts. This is recommended practice upstream for libraries that perform a lot of compile jobs, and saves on the overhead of creating and destroying a lot of threads
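The "exactly once" part of this change can be sketched with a function-local static, whose initialization C++11 guarantees to be thread-safe. This is an illustration only: `register_passes_stub` is a stand-in that counts calls, not the real registerRocMLIRPasses(), and the commit may use a different mechanism.

```cpp
#include <cassert>

// Stand-in for a global registration routine that may not be thread-safe.
// It only counts how often it runs so the once-only behaviour is visible.
inline int& registration_count()
{
    static int count = 0;
    return count;
}

inline void register_passes_stub() { ++registration_count(); }

// Initialization of a function-local static runs exactly once, even if
// several threads reach this point concurrently (guaranteed since C++11).
inline bool ensure_registered()
{
    static const bool done = [] {
        register_passes_stub();
        return true;
    }();
    return done;
}

// Hypothetical entry point: every compile job calls ensure_registered()
// first, but the registration itself still happens only once.
inline int run_jobs(int jobs)
{
    assert(jobs >= 0);
    for(int i = 0; i < jobs; ++i)
        ensure_registered();
    return registration_count();
}
```

The same pattern applies to a shared registry or thread pool: construct it once behind a static accessor and hand out references, instead of creating one per context.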
-
- 22 Jun, 2023 1 commit
-
-
Zhuoran Yin authored
Add mlir quant_dot operator support
-
- 21 Jun, 2023 1 commit
-
-
Umang Yadav authored
Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>
-
- 17 Jun, 2023 1 commit
-
-
turneram authored
* Add initial ck_gemm code
* Format
* Add additional src files
* Format
* Add include
* Simplify fuse_ck
* Format
* Rename var
* Enable pass
* Update ck version
* Fix include
* Add group stride
* Disable warnings for ck headers
* Format
* Add unpack array
* Add interface to enable tuning
* Format
* Update compile_ops to handle tuning config
* Format
* Add some comments
* Move time_op to migraphx_gpu
* Add benchmarking
* Refactor
* Format
* Add lift class macro
* Use device name
* Format
* Generate configs
* Format
* Pass tuning parameter
* Move data type to is_ck_gemm matcher
* Format
* Add problem_cache to avoid retuning same configs
* Format
* Format
* Mark the problems
* Format
* Use is_null
* Format
* Resize vector
* Only tune with exhaustive tuning
* Format
* Use assert
* Format
* Tidy fixes
* More tidy fixes
* Format
* Add license to missing files
* Format
* Use transform
* Format
* Fix tidy
* Format
* Fix cppcheck issues
* Format
* Add static_assert
* Add ops header
* Add assertion in batcher
* Format
* Improve the batch fold check
* Format
* Add where op workaround for CK
* Skip if any input is not a supported ck type
* Format
* Check batch is standard
* Format
* Remove redundant static keyword
* Update commit hash
* Fix error when running without --exhaustive-tune
* Formatting
* Formatting
* Remove fuse_ck_gemm_softmax_gemm
* Update ck hash
* Correct spelling mistake
* Remove commented out logic from fuse_ck
* Remove unused include and add comment
* Formatting
* Remove redundant get_shape and remove ck_gemm from names
* Formatting
* Allow for mixed types with int8 gemms
* Formatting
* Add back find_package from merge
* Update CK commit hash and add gfx940 to fuse_ops supported archs
* Formatting
* Update CK hash
-
- 15 Jun, 2023 1 commit
-
-
Umang Yadav authored
-
- 14 Jun, 2023 1 commit
-
-
Umang Yadav authored
* add fix for the trace_eval
* Add throw for the debug builds
* Formatting

Co-authored-by: Chris Austen <causten@users.noreply.github.com>
-
- 09 Jun, 2023 2 commits
-
-
Chris Austen authored
-
Umang Yadav authored
#1791 added a hash function for the value class. It uses the Visit function and has specializations for the bool_type and <vector> types but was missing a specialization for nullptr. The missing nullptr case caused compilation issues on RHEL, SLES and CentOS.
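The kind of gap described here can be sketched with a visitor-based hash over a toy variant. Everything below is hypothetical and requires C++17: `toy_value`, `toy_hasher`, and `hash_value` are illustration names built on `std::variant`, not MIGraphX's value class or its visit API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <variant>
#include <vector>

// Toy stand-in for a dynamically typed value (not MIGraphX's value class).
using toy_value = std::variant<std::nullptr_t, bool, long, std::string, std::vector<long>>;

struct toy_hasher
{
    // Explicit null case: a generic hash routed through std::hash<T> may
    // not compile for nullptr on every toolchain, so handle it directly.
    std::size_t operator()(std::nullptr_t) const { return 0; }
    std::size_t operator()(bool b) const { return b ? 1 : 2; }
    std::size_t operator()(const std::vector<long>& v) const
    {
        std::size_t h = v.size();
        for(long x : v)
            h ^= std::hash<long>{}(x) + 0x9e3779b9 + (h << 6) + (h >> 2);
        return h;
    }
    // Fallback for every other alternative (long, std::string).
    template <class T>
    std::size_t operator()(const T& x) const
    {
        return std::hash<T>{}(x);
    }
};

inline std::size_t hash_value(const toy_value& v) { return std::visit(toy_hasher{}, v); }
```

The non-template overloads win overload resolution for their exact types, so the null alternative never reaches the generic `std::hash<T>` fallback.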
-
- 08 Jun, 2023 2 commits
-
-
Paul Fultz II authored
Enable with MIGRAPHX_ENABLE_CK=1 and the --exhaustive-tune flag
-
Chris Austen authored
-
- 06 Jun, 2023 2 commits
-
-
Umang Yadav authored
-
Umang Yadav authored
Sigmoid approximation for GeLU was introduced in #1299 for Fp16. The sigmoid approximation is known to get better perf but lower accuracy. https://arxiv.org/pdf/1606.08415.pdf
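The accuracy trade-off can be checked numerically. The sketch below compares the erf-based GELU with the sigmoid form x * sigmoid(1.702 * x) given in the linked paper. Whether MIGraphX uses exactly this constant is an assumption, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Reference GELU: x * Phi(x), with Phi the standard normal CDF.
inline double gelu_erf(double x) { return 0.5 * x * (1.0 + std::erf(x / std::sqrt(2.0))); }

// Sigmoid approximation from the paper: x * sigmoid(1.702 * x).
inline double gelu_sigmoid(double x) { return x / (1.0 + std::exp(-1.702 * x)); }

// Largest absolute difference between the two forms on a uniform grid
// over [lo, hi] with `steps` intervals.
inline double max_abs_gap(double lo, double hi, int steps)
{
    assert(steps > 0 && hi > lo);
    double worst = 0.0;
    for(int i = 0; i <= steps; ++i)
    {
        const double x   = lo + (hi - lo) * i / steps;
        const double gap = std::fabs(gelu_erf(x) - gelu_sigmoid(x));
        if(gap > worst)
            worst = gap;
    }
    return worst;
}
```

Calling `max_abs_gap(-4.0, 4.0, 800)` gives a gap on the order of a few hundredths in double precision, which is large compared with the resolution of Fp16 near those values.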
-
- 31 May, 2023 1 commit
-
-
Umang Yadav authored
Partially solves #1656. This PR only handles the compilation part of multitarget.
-
- 24 May, 2023 2 commits
-
-
Paul Fultz II authored
Enable retrieving the code object to do tuning in the future.
-
kahmed10 authored
Refactor supported gfx archs
-
- 23 May, 2023 1 commit
-
-
Umang Yadav authored
back out changes for rocm-5.5
-
- 20 May, 2023 1 commit
-
-
Umang Yadav authored
* use half hip functions to compute max and min
* add verify test for min and max
-
- 19 May, 2023 1 commit
-
-
Zhuoran Yin authored
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
- 17 May, 2023 1 commit
-
-
Chris Austen authored
Move CI to support the rocm5.5 release
-
- 08 May, 2023 1 commit
-
-
Umang Yadav authored
-
- 05 May, 2023 1 commit
-
-
Manupa Karunaratne authored
Adds support for slice, transpose, contiguous and reshape fusions into input tensors for a fused mlir kernel.
-
- 04 May, 2023 1 commit
-
-
Zhuoran Yin authored
Exposed the mlir_enabled() call to decide the lowering pipeline's enablement.
Disabled the rewrite quantization pipeline in mlir compilation.
Added quant convolution as anchor ops.
Fixed the return type expectations.
Added the fallback hip implementation for quantizelinear and dequantizelinear.
Will need advice to improve the implementation for quantizelinear.
-
- 28 Apr, 2023 1 commit
-
-
Charlie Lin authored
-
- 25 Apr, 2023 1 commit
-
-
kahmed10 authored
update rocBLAS version check to support 3.0 and above with simplified logic
-