- 09 Jun, 2022 2 commits
- 25 May, 2022 2 commits
- 24 May, 2022 2 commits
- 18 May, 2022 3 commits
- 06 May, 2022 2 commits
- 05 May, 2022 6 commits
- 13 Apr, 2022 2 commits
- 29 Mar, 2022 1 commit
-
-
Paul Fultz II authored
This adds the infrastructure so we can compile everything in parallel, whereas before only pointwise kernels were compiled in parallel. This will also directly integrate with lowering and the gpu-driver. The kernels for pointwise and roialign are using this infrastructure. Scatternd is not since it does require standard shape. This also makes it easier to add new runtime compiled kernels in the future.
-
- 28 Mar, 2022 1 commit
-
-
Paul Fultz II authored
* Use ccache for runtime compilation
-
- 18 Mar, 2022 1 commit
-
-
turneram authored
Add exclusive and reverse modes to gpu implementation of prefix_scan_sum, which completes support for ONNX op CumSum
-
- 15 Mar, 2022 1 commit
-
-
Paul Fultz II authored
This adds iterators to tensor_view, which can allow kernels to work with non-standard shapes like for roialign. To improve the performance of indexing when using the iterators, the shape class was updated to use integral_constants since the compiler doesn't always fold the const values. An integral_constant will at least enforce that in the AST. Finally, since index calculations with single integers are improved, I also updated pointwise to use single index rather than multi index. There is about 4% improvement in some cases.
-
- 14 Mar, 2022 1 commit
-
-
Shucai Xiao authored
change max number of groups in a kernel to 1B for greater performance
-
- 04 Mar, 2022 1 commit
-
-
bpickrel authored
Changed the pooling values for two structures from strings to specialized enum classes. Many test and operator parsing changes to support this. Introduces one new source file, op_enums.cpp.
-
- 03 Mar, 2022 3 commits
-
-
Paul Fultz II authored
Boost the max number of workgroups for pointwise ops by matching what we are doing in launch.hpp
-
kahmed10 authored
better performance doing it this way
-
turneram authored
Add onnx parser and ref and gpu implementations of ONNX op ScatterND
-
- 02 Mar, 2022 2 commits
-
-
Charlie Lin authored
Implements the IsNaN operator, ref, gpu, and onnx parser.
-
bpickrel authored
Update the base version of clang-format from 5.0 to 10.0
-
- 25 Feb, 2022 1 commit
-
-
Paul Fultz II authored
wrapped in a any_ptr class so the type can be checked at runtime for a mismatch.
-
- 24 Feb, 2022 1 commit
-
-
Paul Fultz II authored
Make doc/CMakeLists.txt standalone Switch to use rocm-cmake modules for document generation Add CONFIGURE_DEPENDS to file(GLOB) so it will update without an explicit cmake run Add STRINGS property for build type to make it easier to switch build types with ccmake Various fixes and improvements
-
- 21 Feb, 2022 2 commits
- 09 Feb, 2022 1 commit
-
-
Paul Fultz II authored
There is now a MIGRAPHX_DISABLE_POINTWISE_FUSION to disable it
-
- 08 Feb, 2022 2 commits
-
-
Paul Fultz II authored
This causes incorrect memory coloring, which was causing the accuracy failures in the vision model when enabling the pointwise fusions. Resnet50, inceptionv3 and inceptionv4 do verify now in the driver.
-
Paul Fultz II authored
Enforce types to avoid compilation error in pointwise fusions This fixes compile failure: gpt-2, fp16 on Navi
-
- 07 Feb, 2022 2 commits
- 05 Feb, 2022 1 commit
-
-
Paul authored
-