- 15 Jun, 2023 2 commits
-
-
Umang Yadav authored
-
Brian Pickrell authored
* fix parse_instancenorm to create broadcast and multibroadcast instructions with two dynamic shape arguments instead of 1. Their make_op() functions don't support dynamic shapes when called with one input. This caused an error when parsing an ONNX 3duunet model
* Use add_common_op() to create multibroadcast op.
* add verification and parsing test for instance_norm with dynamic input. Parse test doesn't pass.
* fix for test; still doesn't pass
* another fix for test; still doesn't pass
* work in progress, instance_norm_dyn_batch_test works but instance_norm_test doesn't
* fix onnx instancenorm tests to match parser changes. Passes all check tests
* Updated comments explaining usage of add_common_op()
* hand-merged conflicts with develop
* fix instance_norm_half_test after merge
* add Onnx test instance_norm_dyn_batch_half_test
* add shape test cases broadcast_1in_dyn_error and multibroadcast_1in_dyn_error_0
-
- 14 Jun, 2023 2 commits
-
-
Umang Yadav authored
* add fix for the trace_eval
* Add throw for the debug builds
* Formatting

Co-authored-by: Chris Austen <causten@users.noreply.github.com>
-
Umang Yadav authored
-
- 13 Jun, 2023 1 commit
-
-
Charlie Lin authored
-
- 12 Jun, 2023 1 commit
-
-
Paul Fultz II authored
-
- 09 Jun, 2023 3 commits
-
-
Chris Austen authored
-
Umang Yadav authored
-
Umang Yadav authored
#1791 added a hash function for the value class. It uses the visit function and has specializations for bool_type and vector types, but was missing a specialization for nullptr. The missing nullptr specialization caused compilation issues on RHEL, SLES and CentOS.
-
- 08 Jun, 2023 2 commits
-
-
Paul Fultz II authored
Enable with MIGRAPHX_ENABLE_CK=1 and the --exhaustive-tune flag
-
Chris Austen authored
-
- 06 Jun, 2023 2 commits
-
-
Umang Yadav authored
-
Umang Yadav authored
The sigmoid approximation for GeLU was introduced in #1299 for FP16. The sigmoid approximation is known to give better performance but lower accuracy. https://arxiv.org/pdf/1606.08415.pdf
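For reference, a minimal sketch of the two GeLU forms, written in plain C++ rather than as the MIGraphX kernel. The function names gelu_erf and gelu_sigmoid are hypothetical, and the constant 1.702 is the one given in the linked paper; whether the change in #1299 uses exactly this form is an assumption.

```cpp
#include <cmath>

// Exact GeLU: x * Phi(x), where Phi is the standard normal CDF.
// Hypothetical helper name, not a MIGraphX function.
inline double gelu_erf(double x)
{
    return 0.5 * x * (1.0 + std::erf(x / std::sqrt(2.0)));
}

// Sigmoid approximation from the GeLU paper: x * sigmoid(1.702 * x).
// Cheaper to evaluate (one exp, no erf) but less accurate.
inline double gelu_sigmoid(double x)
{
    return x / (1.0 + std::exp(-1.702 * x));
}
```

Comparing gelu_erf(x) with gelu_sigmoid(x) over the expected input range gives a rough picture of the accuracy traded for speed; at x = 1.0 the two differ in the third decimal place.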
-
- 05 Jun, 2023 1 commit
-
-
Charlie Lin authored
Changed the doc for find_permutation(shape) to make it clearer that it finds the permutation that would make the shape standard
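To illustrate the idea of "the permutation that would make the shape standard", here is a minimal sketch that orders axes by decreasing stride. The helper sort_axes_by_stride is hypothetical and is not the MIGraphX find_permutation implementation; whether the library returns this permutation or its inverse is not confirmed here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper: returns the axis order that sorts the strides from
// largest to smallest. Transposing the tensor's axes into this order yields
// a layout whose strides are in standard (row-major, descending) order.
std::vector<std::int64_t> sort_axes_by_stride(const std::vector<std::size_t>& strides)
{
    std::vector<std::int64_t> perm(strides.size());
    std::iota(perm.begin(), perm.end(), 0); // start from the identity 0, 1, 2, ...
    // stable_sort keeps the original axis order when two strides are equal
    std::stable_sort(perm.begin(), perm.end(), [&](std::int64_t a, std::int64_t b) {
        return strides[a] > strides[b];
    });
    return perm;
}
```

For example, a tensor with logical NCHW dimensions {2, 3, 4, 5} stored in NHWC order has strides {60, 1, 15, 3}. The sketch returns {0, 2, 3, 1}, and reordering the axes that way gives dimensions {2, 4, 5, 3} with strides {60, 15, 3, 1}, which is standard.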
-
- 04 Jun, 2023 1 commit
-
-
Igor Mirosavljevic authored
-
- 02 Jun, 2023 1 commit
-
-
Chris Austen authored
-
- 01 Jun, 2023 1 commit
-
-
Umang Yadav authored
By converting to fp32: the fp16 3d-unet model accuracy comes out the same as the fp32 accuracy. By using the reduce_sum method on fp16: accuracy comes out ~0.9% lower compared to fp32 while keeping the entire model in fp16.
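The accuracy gap between the two approaches comes down to the precision of the accumulator used in the reduction. The sketch below shows the same effect by analogy, using float and double as stand-ins for fp16 and fp32 because standard C++ has no portable half type. The helper sum_ones is hypothetical and is not MIGraphX code.

```cpp
#include <cstddef>

// Hypothetical helper: adds 1.0 to an accumulator of type Acc, n times.
// With a narrow accumulator the running sum eventually becomes so large
// that adding 1.0 no longer changes it, and the result stalls.
template <class Acc>
Acc sum_ones(std::size_t n)
{
    Acc acc = 0;
    for(std::size_t i = 0; i < n; ++i)
        acc += static_cast<Acc>(1.0f);
    return acc;
}
```

On a typical IEEE 754 platform, sum_ones<float>(20000000) stalls at 16777216 (2^24), while sum_ones<double>(20000000) returns the exact count. An fp16 accumulator hits the same wall far sooner, at 2048, which is why accumulating in a wider type recovers the accuracy.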
-
- 31 May, 2023 2 commits
-
-
Paul Fultz II authored
-
Umang Yadav authored
Partially solves #1656. This PR only handles the compilation part of multitarget.
-
- 30 May, 2023 2 commits
-
-
Paul Fultz II authored
Use generate_argument instead of generate_literal for python output, as generate_literal doesn't exist
Shorten the names for variables from the main module
Use prefix p_ for parameters
Use shorter variable m for main module in python
-
Paul Fultz II authored
-
- 29 May, 2023 2 commits
-
-
Pavle Jacovic authored
-
Chris Austen authored
-
- 28 May, 2023 1 commit
-
-
Paul Fultz II authored
* Allow quantizing for both int8 and fp16
-
- 25 May, 2023 1 commit
-
-
Ted Themistokleous authored
Use std::numeric_limits::min/max() functions plus the appropriate value to encode -inf/inf
-
- 24 May, 2023 2 commits
-
-
Paul Fultz II authored
Enable retrieving the code object to do tuning in the future.
-
kahmed10 authored
Refactor supported gfx archs
-
- 23 May, 2023 2 commits
-
-
Umang Yadav authored
back out changes for rocm-5.5
-
Djordje Petrovic authored
-
- 20 May, 2023 1 commit
-
-
Umang Yadav authored
* use half hip functions to compute max and min
* add verify test for min and max
-
- 19 May, 2023 3 commits
-
-
Chris Austen authored
-
Zhuoran Yin authored
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
Chris Austen authored
Co-authored-by: Sam Wu <sam.wu2@amd.com>
Co-authored-by: Paul <pfultz2@yahoo.com>
-
- 18 May, 2023 1 commit
-
-
Umang Yadav authored
-
- 17 May, 2023 2 commits
-
-
Chris Austen authored
Move CI to support the rocm5.5 release
-
shivadbhavsar authored
Adding support for broadcasted scalars to the unsqueeze op. Specifying steps other than 1 is disallowed in this implementation since we want the output to always be a tensor. We can support varying step sizes if we allow a broadcasted scalar output from this op.
-
- 11 May, 2023 1 commit
-
-
github-actions[bot] authored
Co-authored-by: causten <causten@users.noreply.github.com>
-
- 09 May, 2023 1 commit
-
-
Chris Austen authored
-
- 08 May, 2023 2 commits
-
-
Umang Yadav authored
-
Charlie Lin authored
Example of using the C++ API to run an ONNX model with dynamic batch
-