- 16 Sep, 2023 1 commit
-
-
Paul Fultz II authored
-
- 13 Sep, 2023 1 commit
-
-
Paul Fultz II authored
-
- 12 Sep, 2023 1 commit
-
-
Paul Fultz II authored
-
- 10 Aug, 2023 1 commit
-
-
Krzysztof Drewniak authored
This PR contains the MIGraphX-side changes needed to keep the build from breaking in the presence of ROCmSoftwarePlatform/rocMLIR#1136, and updates what data is sent to MLIR during the kernel generation and tuning process.
-
- 09 Aug, 2023 1 commit
-
-
Paul Fultz II authored
-
- 08 Aug, 2023 1 commit
-
-
Paul Fultz II authored
-
- 30 Jul, 2023 1 commit
-
-
Paul Fultz II authored
* Add initial tuning support
* Format
* Add extra param
* Format
* Use exhaustive flag
* Format
* Set expected shapes
* Format
* Format
* Fix missing symbol
* Format
* Add missing license header
* Format
* Update src/targets/gpu/include/migraphx/gpu/mlir.hpp
-
- 19 Jul, 2023 1 commit
-
-
Umang Yadav authored
-
- 18 Jul, 2023 1 commit
-
-
Umang Yadav authored
Fixes #1946
-
- 13 Jul, 2023 1 commit
-
-
Charlie Lin authored
Renames deconvolution -> convolution_backwards to be more consistent with the literature. Note: this is not the cross-correlation operator (which is the adjoint of convolution). This is technically a standard convolution operator combined with an upsampling operator rather than a downsampling operator. Adds unit tests for the padding, strides, dilations, and other op attributes. Throws on the auto_pad attribute since it has not been implemented; previously it read the attribute and set it but then did nothing with it. Extended for dynamic shapes. Does not support using asymmetric padding (padding_L != padding_R) and output_shape with dynamic shapes.
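The note above (a standard convolution combined with an upsampling operator) can be illustrated with a minimal 1D sketch. This is not the MIGraphX implementation: the function names are hypothetical, and the example assumes no padding and no dilation. It compares the scatter form of a strided transposed convolution against zero-insertion upsampling followed by a full, kernel-flipping convolution.

```python
def transposed_conv1d(x, w, stride):
    """Scatter form: each input element adds a scaled copy of w to the output."""
    out = [0.0] * ((len(x) - 1) * stride + len(w))
    for i, xi in enumerate(x):
        for k, wk in enumerate(w):
            out[i * stride + k] += xi * wk
    return out


def upsample_zeros(x, stride):
    """Insert stride - 1 zeros between neighbouring elements of x."""
    up = [0.0] * ((len(x) - 1) * stride + 1)
    up[::stride] = x
    return up


def full_conv1d(u, w):
    """Full true convolution: y[j] = sum over k of u[j - k] * w[k]."""
    out = [0.0] * (len(u) + len(w) - 1)
    for j in range(len(out)):
        for k, wk in enumerate(w):
            if 0 <= j - k < len(u):
                out[j] += u[j - k] * wk
    return out


x = [1.0, 2.0, 3.0]
w = [1.0, 10.0, 100.0]

direct = transposed_conv1d(x, w, stride=2)
via_upsampling = full_conv1d(upsample_zeros(x, 2), w)
```

Both paths produce the same seven-element output here, which is the sense in which the operator is a convolution paired with upsampling rather than downsampling.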
-
- 08 Jul, 2023 1 commit
-
-
Artur Wojcik authored
Export API symbols for migraphx, migraphx_ref, migraphx_cpu, migraphx_gpu, migraphx_device, migraphx_tf, and migraphx_onnx. There is a separate PR for migraphx_c. API symbol exporting affects only Windows. It is transparent on Linux.
-
- 05 Jul, 2023 1 commit
-
-
Umang Yadav authored
Needed to run multi-targeted programs where "main" isn't the only root module. There could be many root modules other than main.
-
- 02 Jul, 2023 1 commit
-
-
Paul Fultz II authored
* Add a CI job to test CK
* Add MIGRAPHX_TUNE_CK env variable to only do tuning for CK
* Continue tuning even when there are invalid configs
* Fix a bug with parallel compilation not using all available threads
* Add additional test for gemms using half types
* Removed int32 as a supported type since it doesn't pass our test suite
-
- 08 Jun, 2023 1 commit
-
-
Paul Fultz II authored
Enable with MIGRAPHX_ENABLE_CK=1 and the --exhaustive-tune flag
-
- 24 May, 2023 1 commit
-
-
Paul Fultz II authored
Enable retrieving the code object to do tuning in the future.
-
- 17 May, 2023 1 commit
-
-
Chris Austen authored
Move CI to support the rocm5.5 release
-
- 04 May, 2023 1 commit
-
-
Zhuoran Yin authored
* Exposed the mlir_enabled() call to decide whether the lowering pipeline is enabled
* Disabled the rewrite quantization pipeline in mlir compilation
* Added quant convolution as anchor ops
* Fixed the return type expectations
* Added the fallback hip implementation for quantizelinear and dequantizelinear
* Will need advice to improve the implementation for quantizelinear
-
- 24 Apr, 2023 1 commit
-
-
Charlie Lin authored
Updates the hip::copy_to_gpu and hip::copy_from_gpu operators to work with dynamic shapes. Allows offload_copy to be used with dynamic batch. Changed the assert in select_module because the argument might now be smaller, given how offload_copy will work with dynamic batch (the maximum buffer size will be used).
-
- 06 Apr, 2023 1 commit
-
-
Paul Fultz II authored
Automatically fuse multiple reductions and pointwise operations.
-
- 05 Apr, 2023 1 commit
-
-
Paul Fultz II authored
This will replace conv(x+a, w) with conv(x, w) + conv(a, w) where a is a constant so conv(a, w) can be replaced with a constant.
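The rewrite relies on convolution being linear in its input. A minimal sketch in plain Python, assuming a 1D valid-mode sliding dot product (the helper conv1d and the sample values are illustrative and not part of MIGraphX):

```python
def conv1d(x, w):
    """Valid-mode 1D sliding dot product (what ML frameworks call convolution)."""
    n = len(x) - len(w) + 1
    return [sum(x[i + k] * w[k] for k in range(len(w))) for i in range(n)]


x = [1.0, 2.0, 3.0, 4.0, 5.0]   # runtime input
a = [0.5, -1.0, 2.0, 0.0, 1.5]  # constant added to the input
w = [1.0, 0.0, -1.0]            # constant weights

# Original form: add the constant, then convolve.
fused = conv1d([xi + ai for xi, ai in zip(x, a)], w)

# Rewritten form: conv(a, w) depends only on constants, so it can be
# computed once ahead of time and added to conv(x, w).
folded_const = conv1d(a, w)
split = [c + f for c, f in zip(conv1d(x, w), folded_const)]
```

The two results agree element for element, so only conv(x, w) has to be evaluated at run time.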
-
- 30 Mar, 2023 1 commit
-
-
Paul Fultz II authored
* Add hiprtc driver
-
- 28 Mar, 2023 1 commit
-
-
Umang Yadav authored
* Remove version from check_context and bump program version
-
- 21 Mar, 2023 1 commit
-
-
Charlie Lin authored
Refactor to have select_module use output parameters. Disable select_module verify tests on cpu.
-
- 18 Mar, 2023 1 commit
-
-
Umang Yadav authored
Fixes #1595
-
- 01 Mar, 2023 1 commit
-
-
Charlie Lin authored
Add additional documentation to explain the passes.
-
- 16 Feb, 2023 1 commit
-
-
Umang Yadav authored
* Add driver flag "--exhaustive-tune" to enable tuning, and add support for the same in the C/C++ and Python APIs
-
- 14 Feb, 2023 1 commit
-
-
shivadbhavsar authored
Currently, we default to device 0 when loading programs. Updating this to use hipGetDevice to set the device for the loaded program.
-
- 06 Feb, 2023 1 commit
-
-
Paul Fultz II authored
* Fuse layernorm with different patterns
* Only match when using the last axis
Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>
-
- 07 Dec, 2022 1 commit
-
-
Paul Fultz II authored
* Add implicit_conversion
-
- 07 Nov, 2022 1 commit
-
-
arvindcheru authored
-
- 02 Nov, 2022 1 commit
-
-
Paul Fultz II authored
Can be enabled via environment variable MIGRAPHX_ENABLE_NHWC
-
- 19 Oct, 2022 2 commits
-
-
Charlie Lin authored
Refactor dynamic compute
  - add a compute_output_shape object that implicitly converts to a new dyn_output or shape object
  - dyn_output object can handle computing the static output shape of an operator given the input arguments shapes
  - change an operator's compute function to argument compute(const dyn_output& dyn_out, std::vector<argument> args) to use dyn_output object
Dynamic ref unary functions
  - Included these changes to have an example of the refactored dynamic compute being used
  - Changes to unary base class to handle dynamic shapes
  - Changed elu and leaky_relu to use unary base class and pointwise JIT
-
Umang Yadav authored
* Use find2.0 for the convolution
Co-authored-by: Vasilii Filippov <DrizztDoUrden@users.noreply.github.com>
Co-authored-by: Chris Austen <causten@users.noreply.github.com>
-
- 18 Oct, 2022 1 commit
-
-
Paul Fultz II authored
* Enable non-standard shape
* Use perfdb for non xdlops
* Fix transpose+broadcast strides
Co-authored-by: jungpark-mlir <jungwook.park@amd.com>
-
- 13 Oct, 2022 1 commit
-
-
Charlie Lin authored
Rewrites the TF batch-norm-like operators to other MIGX operators. Removes the code related to batch_norm_inference.
-
- 04 Oct, 2022 1 commit
-
-
Ted Themistokleous authored
Stream sync changes and associated API level changes
-
- 29 Sep, 2022 1 commit
-
-
Umang Yadav authored
Improvements/Additions to be made: changes for the quant_convolution, changes for the deconvolution, Macros for MIOpen status checks
-
- 28 Sep, 2022 1 commit
-
-
Umang Yadav authored
test_gpu_pack_int8_args fails on a gfx908 machine because it doesn't set the compute_fp32 flag correctly. This PR fixes the test so that it checks the device name and rocBLAS version and sets this flag accordingly.
-
- 26 Sep, 2022 1 commit
-
-
Paul Fultz II authored
-
- 23 Sep, 2022 1 commit
-
-
Paul Fultz II authored
* Remove device functions
* Update tests
-