- 11 Oct, 2023 2 commits
-
-
Ted Themistokleous authored
* Fix scatter operator for nonstandard shapes: remove the standard() shape check for scatter inputs.
* Add nonstandard input tests for scatter

Co-authored-by: Chris Austen <causten@users.noreply.github.com>
-
Artur Wojcik authored
-
- 28 Sep, 2023 1 commit
-
-
Umang Yadav authored
MIGraphX verification uses normalized RMS error by default. This change adds logic that lets MIGraphX do "np.allclose"-style elementwise verification using atol and rtol. The commit also changes "verify_range()" calls to consistently pass the "gold" or "expected" results as the second argument. The default RMS tolerance inside the driver is set to 0.001, which IMO is high for FP32 compared to what we had earlier. Better defaults are needed.
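To illustrate why the two checks can disagree, here is a minimal pure-Python sketch. The elementwise criterion is the one documented for numpy.allclose. The RMS normalization (dividing by the largest magnitude in the expected data) is an assumption made for illustration and may not match what MIGraphX computes; the function names are hypothetical.

```python
import math

def normalized_rms_error(actual, expected):
    # Assumed normalization: RMS of the difference divided by the largest
    # magnitude in `expected` (1.0 is used when expected is all zeros).
    n = len(expected)
    rms = math.sqrt(sum((a - e) ** 2 for a, e in zip(actual, expected)) / n)
    scale = max(abs(e) for e in expected) or 1.0
    return rms / scale

def allclose(actual, expected, atol=1e-8, rtol=1e-5):
    # Same criterion as numpy.allclose: |a - e| <= atol + rtol * |e| per element.
    return all(abs(a - e) <= atol + rtol * abs(e) for a, e in zip(actual, expected))

# "expected" (gold) is passed as the second argument in both checks.
expected = [1000.0, 1000.0, 1000.0, 0.001]
actual = [1000.0, 1000.0, 1000.0, 0.002]

rms_ok = normalized_rms_error(actual, expected) < 1e-3
elementwise_ok = allclose(actual, expected, atol=1e-6, rtol=1e-3)
```

In this example the small element is off by a factor of two, yet the aggregate RMS check passes because the large elements dominate the scale; the elementwise check rejects it.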
-
- 27 Sep, 2023 1 commit
-
-
Paul Fultz II authored
-
- 21 Sep, 2023 1 commit
-
-
Paul Fultz II authored
-
- 15 Sep, 2023 1 commit
-
-
Umang Yadav authored
-
- 18 Aug, 2023 1 commit
-
-
Paul Fultz II authored
-
- 28 Jul, 2023 1 commit
-
-
Paul Fultz II authored
* Improve performance of pointwise/reduction kernels when using NHWC layouts
* Format
* Add nhwc test
* Format
* Remove inline namespace
* Add reduce test
-
- 22 Jul, 2023 1 commit
-
-
Charlie Lin authored
Throwing on these calls catches dynamic shape errors earlier rather than having to backpedal from a bad call
-
- 13 Jul, 2023 1 commit
-
-
Charlie Lin authored
Renames deconvolution -> convolution_backwards to be more consistent with the literature.
Note: this is not the cross-correlation operator (which is the adjoint of convolution). This is technically a standard convolution operator combined with an upsampling operator rather than a downsampling operator.
Adds unit tests for the padding, strides, dilations, and other op attributes.
Throws on the auto_pad attribute since it has not been implemented. Previously it read the attribute and set it but then did nothing with it.
Extended for dynamic shapes.
Does not support using asymmetric padding (padding_L != padding_R) and output_shape with dynamic shapes.
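The "convolution combined with upsampling" description can be sketched in one dimension. The code below is an illustration under assumptions (no padding, no dilation, a single channel, hypothetical function names), not the operator's actual implementation: it computes the result by scattering each input element, then again by inserting zeros between input elements and running an ordinary sliding-window convolution with the flipped kernel.

```python
def conv_backwards_1d(x, w, stride):
    # Direct form: each input element scatters a scaled copy of the kernel.
    out = [0.0] * ((len(x) - 1) * stride + len(w))
    for i, xi in enumerate(x):
        for k, wk in enumerate(w):
            out[i * stride + k] += xi * wk
    return out

def upsample_then_convolve(x, w, stride):
    # Upsample: insert (stride - 1) zeros between neighbouring input elements.
    up = []
    for i, xi in enumerate(x):
        up.append(xi)
        if i < len(x) - 1:
            up.extend([0.0] * (stride - 1))
    # Full padding, then a stride-1 sliding dot product with the flipped kernel.
    pad = [0.0] * (len(w) - 1)
    padded = pad + up + pad
    flipped = w[::-1]
    return [
        sum(padded[j + k] * flipped[k] for k in range(len(w)))
        for j in range(len(padded) - len(w) + 1)
    ]

x = [1.0, 2.0, 3.0]
w = [1.0, 10.0]
direct = conv_backwards_1d(x, w, stride=2)
via_upsampling = upsample_then_convolve(x, w, stride=2)
```

Both forms produce the same output of length (len(x) - 1) * stride + len(w).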
-
- 10 Jul, 2023 1 commit
-
-
Brian Pickrell authored
Changes the way the Pooling operation calculates results when there is padding. The old code would clip off any padding values before computing; for instance, if an Average pooling window contained 0 1 2 where the 0 is padding, the result was 1.5 instead of 1.0. See Issue 1766.
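The 1.5 versus 1.0 example can be reproduced with a small 1-D sketch. This is an illustration only, assuming stride 1 and zero-valued padding; the function name and the `include_padding` flag are hypothetical and are not taken from the MIGraphX source.

```python
def average_pool_1d(data, window, pad, include_padding):
    # None marks a padding position so it can be told apart from real data.
    padded = [None] * pad + list(data) + [None] * pad
    out = []
    for start in range(len(padded) - window + 1):
        win = padded[start:start + window]
        real = [v for v in win if v is not None]
        # Old behaviour: divide by the number of real elements only.
        # New behaviour: padding counts as zeros, so divide by the window size.
        divisor = window if include_padding else len(real)
        out.append(sum(real) / divisor)
    return out

old = average_pool_1d([1.0, 2.0], window=3, pad=1, include_padding=False)
new = average_pool_1d([1.0, 2.0], window=3, pad=1, include_padding=True)
```

The first window is (padding, 1, 2): clipping the padding averages two values and gives 1.5, while counting the padding as a zero averages three values and gives 1.0.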
-
- 08 Jul, 2023 1 commit
-
-
Artur Wojcik authored
-
- 06 Jul, 2023 1 commit
-
-
Artur Wojcik authored
-
- 02 Jul, 2023 1 commit
-
-
Paul Fultz II authored
Add a CI job to test CK
Add MIGRAPHX_TUNE_CK env variable to only do tuning for CK
Continue tuning even when there are invalid configs
Fix a bug with parallel compilation not using all available threads
Add additional test for gemms using half types
Remove int32 as a supported type since it doesn't pass our test suite
-
- 23 Jun, 2023 1 commit
-
-
Umang Yadav authored
Fixes #1852 Fixes #1847
-
- 01 Jun, 2023 1 commit
-
-
Umang Yadav authored
By converting to fp32: the fp16 3d-unet model accuracy comes out the same as the fp32 accuracy. By using the reduce_sum method on fp16: accuracy comes out ~0.9% lower compared to fp32 while keeping the entire model in fp16.
-
- 25 May, 2023 1 commit
-
-
Ted Themistokleous authored
Use std::numeric_limits::min/max() functions plus the appropriate value to encode -inf/inf
-
- 20 May, 2023 1 commit
-
-
Umang Yadav authored
* use half hip functions to compute max and min
* add verify test for min and max
-
- 04 May, 2023 1 commit
-
-
Paul Fultz II authored
When either the input or the output is multiplied across the K dimension, the multiplier can be applied to the constant instead, which can then be folded by propagate_const.
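A minimal sketch of the idea for a plain matrix product, assuming "K" means the contraction dimension of a GEMM and that the multiplier holds one value per K index. The shapes, names, and pure-Python matmul are illustrative assumptions, not the pass's actual code.

```python
def matmul(a, b):
    # Plain (M x K) @ (K x N) product on nested lists.
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

x = [[1.0, 2.0], [3.0, 4.0]]             # M x K runtime input
s = [2.0, 0.5]                           # one multiplier per K index
w = [[1.0, 0.0, 2.0], [4.0, 1.0, -1.0]]  # K x N constant weights

# Original graph: scale the input along K, then multiply by the constant.
scaled_input = [[x[i][k] * s[k] for k in range(2)] for i in range(2)]
before = matmul(scaled_input, w)

# Rewritten graph: scale the rows of the constant instead. s * w depends only
# on constants, so it can be computed once at compile time.
w_folded = [[s[k] * w[k][j] for j in range(3)] for k in range(2)]
after = matmul(x, w_folded)
```

Both graphs give the same result, but the rewritten one performs no elementwise multiply at run time.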
-
- 28 Apr, 2023 1 commit
-
-
Charlie Lin authored
-
- 24 Apr, 2023 2 commits
-
-
Paul Fultz II authored
This fixes #1700
-
Paul Fultz II authored
-
- 07 Apr, 2023 1 commit
-
-
Paul Fultz II authored
Converts can be inserted when the scales and input differ in the ONNX file (we are already doing this implicit conversion in the ref implementation). This will also improve the compile time of quantizelinear.hpp since we can remove the nested visit method.
-
- 05 Apr, 2023 1 commit
-
-
Paul Fultz II authored
This will replace conv(x+a, w) with conv(x, w) + conv(a, w) where a is a constant so conv(a, w) can be replaced with a constant.
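The rewrite relies on convolution being linear in its input. A small 1-D check, assuming a stride-1 sliding dot product with no padding (the helper name is hypothetical):

```python
def conv1d(x, w):
    # Stride-1, no-padding sliding dot product of x with kernel w.
    n = len(x) - len(w) + 1
    return [sum(x[i + k] * w[k] for k in range(len(w))) for i in range(n)]

x = [1.0, 2.0, 3.0, 4.0]    # runtime input
a = [0.5, -1.0, 2.0, 0.25]  # constant added to the input
w = [2.0, -1.0]             # constant weights

# Original form: conv(x + a, w)
lhs = conv1d([xi + ai for xi, ai in zip(x, a)], w)

# Rewritten form: conv(x, w) + conv(a, w). The second term involves only
# constants, so it can be evaluated once ahead of time.
folded = conv1d(a, w)
rhs = [p + q for p, q in zip(conv1d(x, w), folded)]
```

After the rewrite, the runtime graph contains a single convolution followed by the addition of a precomputed constant.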
-
- 31 Mar, 2023 1 commit
-
-
Charlie Lin authored
Adds a new GPU compiler pass, split_single_dyn_dim, that handles the case where one input parameter has a single non-fixed dynamic_dimension. This commonly occurs for dynamic batch or BERT sequence length.
Splits the dynamic shape into several submodules with static input parameters to handle all of the cases in the dynamic_dimension range. Essentially does what I manually did for the select_module verify tests.
Adds a compile option split_single_dyn_dim that toggles the pass on/off. Defaults to false.
Updates verify_program.hpp and run_verify.cpp to allow the tests to change the compile_options.
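Conceptually, the pass trades one dynamic program for a table of static ones. The sketch below is a rough analogy and does not use the MIGraphX API: `build_static` is a hypothetical stand-in for compiling a static-shape submodule, and the dictionary lookup stands in for the runtime selection.

```python
def split_single_dyn_dim(min_dim, max_dim, fixed_dims, build_static):
    # One static submodule per value in the inclusive dynamic range.
    return {
        (d, *fixed_dims): build_static((d, *fixed_dims))
        for d in range(min_dim, max_dim + 1)
    }

def build_static(shape):
    # Hypothetical "compiled" submodule that only accepts its own static shape.
    def run(batch):
        if len(batch) != shape[0]:
            raise ValueError(f"expected batch size {shape[0]}")
        return [sum(row) for row in batch]
    return run

# Dynamic batch dimension in [1, 4], fixed trailing dimension of 3.
modules = split_single_dyn_dim(1, 4, (3,), build_static)

batch = [[1, 2, 3], [4, 5, 6]]
result = modules[(len(batch), 3)](batch)
```

The number of submodules grows with the size of the dynamic range, which is presumably one reason the option defaults to off.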
-
- 29 Mar, 2023 1 commit
-
-
Paul Fultz II authored
-
- 21 Mar, 2023 1 commit
-
-
Charlie Lin authored
Refactor to have select_module use output parameters.
Disable select_module verify tests on cpu.
-
- 18 Mar, 2023 1 commit
-
-
Umang Yadav authored
Fixes #1595
-
- 17 Mar, 2023 2 commits
-
-
Paul Fultz II authored
-
Paul Fultz II authored
This is the original test case that exposed the error caused by missing const folding. Pushing changes up to this branch and closing out PR #1622.
-
- 10 Mar, 2023 2 commits
-
-
Paul Fultz II authored
-
Paul Fultz II authored
-
- 28 Feb, 2023 1 commit
-
-
Charlie Lin authored
Creates the select_module operator, which selects one of the submodules passed to it to run based on the submodule parameters. A submodule is selected when its parameters have exactly the same static shapes as the arguments to select_module.
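The selection rule can be sketched as an exact shape match. This is an illustrative analogy, not the operator's implementation: submodules are represented as hypothetical (parameter shapes, callable) pairs, and tensors as rectangular nested lists.

```python
def shape_of(tensor):
    # Shape of a rectangular 2-D nested list: (rows, columns).
    return (len(tensor), len(tensor[0]))

def select_module(submodules, args):
    # Run the first submodule whose parameter shapes equal the argument shapes.
    arg_shapes = tuple(shape_of(a) for a in args)
    for param_shapes, run in submodules:
        if param_shapes == arg_shapes:
            return run(*args)
    raise ValueError(f"no submodule accepts shapes {arg_shapes}")

submodules = [
    (((1, 3),), lambda t: "batch-1 module"),
    (((2, 3),), lambda t: "batch-2 module"),
]

chosen = select_module(submodules, [[[1, 2, 3], [4, 5, 6]]])
```

Arguments whose shapes match no submodule raise an error in this sketch rather than falling back to a default.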
-
- 23 Feb, 2023 1 commit
-
-
shivadbhavsar authored
-
- 16 Feb, 2023 1 commit
-
-
Paul Fultz II authored
Avoids double global loads. Strided loops are unrolled, which lets the results be stored in an array that the compiler will keep in registers since the index access is constant. Updated to handle large reductions, which results in a better stable diffusion result.
-
- 17 Jan, 2023 1 commit
-
-
Paul Fultz II authored
-
- 13 Jan, 2023 1 commit
-
-
shivadbhavsar authored
This PR resolves the bug addressed in #1496.
-
- 11 Jan, 2023 1 commit
-
-
Paul Fultz II authored
* Use cosine to compute half sin
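The commit message does not say which identity is used. One plausible reading, shown here purely as an assumption, is the phase shift sin(x) = cos(x - π/2). The sketch emulates half-precision rounding with the `struct` module's `'e'` format; the helper names are hypothetical.

```python
import math
import struct

def to_half(x):
    # Round a Python float to IEEE 754 half precision and back.
    return struct.unpack("e", struct.pack("e", x))[0]

def half_sin_via_cos(x):
    # Assumed identity: sin(x) == cos(x - pi/2). Input and result are rounded
    # to half precision; the intermediate arithmetic here is double precision.
    return to_half(math.cos(to_half(x) - math.pi / 2))

approx = half_sin_via_cos(1.0)
reference = math.sin(1.0)
error = abs(approx - reference)
```

For inputs of moderate size the error is dominated by the final rounding to half precision.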
-
- 09 Jan, 2023 1 commit
-
-
Ted Themistokleous authored
JIT implementation of the gather operator. Added a few more unit tests to this one as well, since I saw some odd behavior during bring-up.
-
- 02 Nov, 2022 1 commit
-
-
Paul Fultz II authored
-