- 14 Jun, 2023 2 commits
-
-
Khalique Ahmed authored
-
Khalique Ahmed authored
-
- 09 Jun, 2023 3 commits
-
-
Chris Austen authored
-
Umang Yadav authored
-
Umang Yadav authored
#1791 added a hash function for the value class. It uses the Visit function and has specializations for bool_type and <vector> types, but was missing a specialization for nullptr. The nullptr case caused compilation issues on RHEL, SLES and CentOS.
-
- 08 Jun, 2023 2 commits
-
-
Paul Fultz II authored
Enable with MIGRAPHX_ENABLE_CK=1 and the --exhaustive-tune flag
-
Chris Austen authored
-
- 06 Jun, 2023 2 commits
-
-
Umang Yadav authored
-
Umang Yadav authored
The sigmoid approximation for GELU was introduced in #1299 for fp16. It is known to give better performance but lower accuracy. See https://arxiv.org/pdf/1606.08415.pdf
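For reference, the cited paper gives the sigmoid approximation as x * sigmoid(1.702 * x), against the exact form x * Phi(x). The snippet below is a minimal Python sketch of that trade-off, not MIGraphX code; the function names are hypothetical and the sampled range [-6, 6] is an arbitrary choice for illustration.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_sigmoid(x: float) -> float:
    """Sigmoid approximation from the paper: x * sigmoid(1.702 * x)."""
    return x / (1.0 + math.exp(-1.702 * x))


# Largest absolute gap between the two forms on a grid over [-6, 6].
# The grid and range are assumptions made only for this illustration.
max_abs_err = max(
    abs(gelu_exact(i / 100.0) - gelu_sigmoid(i / 100.0))
    for i in range(-600, 601)
)
print(f"max |exact - sigmoid approx| on [-6, 6]: {max_abs_err:.4f}")
```

The approximation replaces erf with a single exponential, which is where the speed-up comes from; the printed gap shows the accuracy given up in exchange.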
-
- 05 Jun, 2023 1 commit
-
-
Charlie Lin authored
Clarified the documentation for find_permutation(shape): it finds the permutation that would make the shape standard
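One way to read "the permutation that would make the shape standard" is as the ordering of axes by decreasing stride. The Python sketch below illustrates that reading only; it assumes a shape is standard when its strides are in descending order, and `find_permutation_sketch` and `apply_permutation` are hypothetical helpers, not the library's implementation.

```python
def find_permutation_sketch(strides):
    """Return axis indices ordered by decreasing stride.

    Python's sort is stable, so axes with equal strides keep their
    original relative order.
    """
    return sorted(range(len(strides)), key=lambda axis: -strides[axis])


def apply_permutation(values, perm):
    """Reorder a per-axis list (lens or strides) according to perm."""
    return [values[axis] for axis in perm]


# A tensor with lens [2, 3, 4] and standard strides [12, 4, 1],
# viewed with its last two axes swapped.
lens = [2, 4, 3]
strides = [12, 1, 4]

perm = find_permutation_sketch(strides)
print(perm)                              # [0, 2, 1]
print(apply_permutation(lens, perm))     # [2, 3, 4]
print(apply_permutation(strides, perm))  # [12, 4, 1]
```

Applying the returned permutation to the transposed view recovers the packed, descending-stride layout.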
-
- 04 Jun, 2023 1 commit
-
-
Igor Mirosavljevic authored
-
- 02 Jun, 2023 1 commit
-
-
Chris Austen authored
-
- 01 Jun, 2023 1 commit
-
-
Umang Yadav authored
Converting to fp32: the fp16 3d-unet model accuracy comes out the same as the fp32 accuracy. Using the reduce_sum method in fp16: accuracy comes out ~0.9% lower than fp32 while keeping the entire model in fp16.
-
- 31 May, 2023 2 commits
-
-
Paul Fultz II authored
-
Umang Yadav authored
Partially solves #1656. This PR only handles the compilation part of multitarget.
-
- 30 May, 2023 2 commits
-
-
Paul Fultz II authored
* Use generate_argument instead of generate_literal for python output, as generate_literal doesn't exist
* Shorten the names for variables from the main module
* Use prefix p_ for parameters
* Use shorter variable m for main module in python
-
Paul Fultz II authored
-
- 29 May, 2023 2 commits
-
-
Pavle Jacovic authored
-
Chris Austen authored
-
- 28 May, 2023 1 commit
-
-
Paul Fultz II authored
* Allow quantizing for both int8 and fp16
-
- 25 May, 2023 1 commit
-
-
Ted Themistokleous authored
Use std::numeric_limits::min/max() functions plus the appropriate value to encode -inf/inf
-
- 24 May, 2023 2 commits
-
-
Paul Fultz II authored
Enable retrieving the code object to do tuning in the future.
-
kahmed10 authored
Refactor supported gfx archs
-
- 23 May, 2023 2 commits
-
-
Umang Yadav authored
back out changes for rocm-5.5
-
Djordje Petrovic authored
-
- 20 May, 2023 1 commit
-
-
Umang Yadav authored
* use half hip functions to compute max and min
* add verify test for min and max
-
- 19 May, 2023 3 commits
-
-
Chris Austen authored
-
Zhuoran Yin authored
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
Chris Austen authored
Co-authored-by: Sam Wu <sam.wu2@amd.com>
Co-authored-by: Paul <pfultz2@yahoo.com>
-
- 18 May, 2023 1 commit
-
-
Umang Yadav authored
-
- 17 May, 2023 2 commits
-
-
Chris Austen authored
Move CI to support the rocm5.5 release
-
shivadbhavsar authored
Adds support for broadcasted scalars to the unsqueeze op. Specifying steps other than 1 is disallowed in this implementation since we want the output to always be a tensor. We can support varying step sizes if we allow a broadcasted scalar output from this op.
-
- 11 May, 2023 1 commit
-
-
github-actions[bot] authored
Co-authored-by: causten <causten@users.noreply.github.com>
-
- 09 May, 2023 1 commit
-
-
Chris Austen authored
-
- 08 May, 2023 2 commits
-
-
Umang Yadav authored
-
Charlie Lin authored
Example of using the C++ API to run an ONNX model with dynamic batch
-
- 06 May, 2023 1 commit
-
-
Chris Austen authored
Remove various files not required for what we use GitHub runners for
-
- 05 May, 2023 3 commits
-
-
Charlie Lin authored
Python API with documentation updates
-
Manupa Karunaratne authored
Adds support for slice, transpose, contiguous and reshape fusions into input tensors for a fused MLIR kernel.
-
kahmed10 authored
Add option to print supported TF ops. Sort both ONNX and TF ops alphabetically.
-