- 26 Jan, 2023 5 commits
- 19 Jan, 2023 1 commit
-
-
Paul Fultz II authored
This prevents multiple adds.
-
- 17 Jan, 2023 1 commit
-
-
Paul Fultz II authored
-
- 11 Jan, 2023 1 commit
-
-
Paul Fultz II authored
* Use cosine to compute half sin
-
- 09 Jan, 2023 1 commit
-
-
Ted Themistokleous authored
JIT implementation of the gather operator Added a few more unit tests to this one as well since I saw some odd behavior during bring up.
-
- 11 Dec, 2022 1 commit
-
-
Umang Yadav authored
HIP had change in previous rocm releases to use --offload-arch instead of --cuda-gpu-arch. This should be backwards compatbile. hipRTC also supports --offload-arch.
-
- 07 Dec, 2022 1 commit
-
-
Paul Fultz II authored
* Add implicit_conversion
-
- 06 Dec, 2022 2 commits
-
-
Ted Themistokleous authored
Need this for when we debug and use MIGRAPHX_TRACE_EVAL() to show tuples Without this we break when reading our buffer due to the use of visit() This came up as part of #1283 debugging.
-
jungpark-mlir authored
Update dialect registration interface Update 2nd build pipeline call and use full arch name
-
- 29 Nov, 2022 1 commit
-
-
kahmed10 authored
Merging #1391 caused an extra adjust allocation pass for GPU targets. This removes that merge error.
-
- 20 Nov, 2022 1 commit
-
-
Paul Fultz II authored
-
- 18 Nov, 2022 1 commit
-
-
Umang Yadav authored
Disabling it untill int8 fix is in mainline from MIOpen and also so that QA tests could run migraphx-driver and unittests from MIGraphX.
-
- 17 Nov, 2022 9 commits
- 07 Nov, 2022 1 commit
-
-
arvindcheru authored
-
- 06 Nov, 2022 1 commit
-
-
Umang Yadav authored
-
- 02 Nov, 2022 2 commits
-
-
Paul Fultz II authored
Can be enabled via environment variable MIGRAPHX_ENABLE_NHWC
-
Paul Fultz II authored
-
- 28 Oct, 2022 1 commit
-
-
Umang Yadav authored
Local Threads of multiples 32 were introduced in #1348 But LocalThreads that are not multiple of 64 are causing correctness issues.
-
- 27 Oct, 2022 2 commits
-
-
Chris Austen authored
Upgraded Dockerfiles and fixed tidy issues to make Ubuntu 20.04 and ROCm 5.3.0 the default
-
kahmed10 authored
updated GPU pad to now use JIT version. added range functions for JIT kernels.
-
- 26 Oct, 2022 1 commit
-
-
Brian Pickrell authored
Fixes an observed regression error on certain Frozen Protobuf models due to PR 1280
-
- 24 Oct, 2022 1 commit
-
-
jungpark-mlir authored
Reiterate the assertion on the standard shape but relax it for the multibroadcast ops deliberately inserted to explicit the broadcast.
-
- 19 Oct, 2022 2 commits
-
-
Charlie Lin authored
Refactor dynamic compute - add a compute_output_shape object that implicitly converts to a new dyn_output or shape object - dyn_output object can handle computing the static output shape of an operator given the input arguments shapes change an operator's compute function to argument compute(const dyn_output& dyn_out, std::vector<argument> args) to use dyn_output object Dynamic ref unary functions - Included these changes to have an example of the refactored dynamic compute being used - Changes to unary base class to handle dynamic shapes - Changed elu and leaky_relu to use unary base class and pointwise JIT
-
Umang Yadav authored
* use find2.0 for the convolution Co-authored-by:
Vasilii Filippov <DrizztDoUrden@users.noreply.github.com> Co-authored-by:
Chris Austen <causten@users.noreply.github.com>
-
- 18 Oct, 2022 1 commit
-
-
Paul Fultz II authored
* Enable non-standard shape * Use perfdb for non xdlops * Fix transpose+broadcast strides Co-authored-by:jungpark-mlir <jungwook.park@amd.com>
-
- 13 Oct, 2022 1 commit
-
-
Charlie Lin authored
Rewrites the TF batch norm like operators to other MIGX operators Removes the code related to batch_norm_inference
-
- 04 Oct, 2022 2 commits
-
-
Ted Themistokleous authored
Stream sync changes and associated API level changes
-
Paul Fultz II authored
optimize the softmax operator
-