- 01 Mar, 2022 3 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 28 Feb, 2022 8 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 26 Feb, 2022 2 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 25 Feb, 2022 3 commits
-
-
Khalique Ahmed authored
-
Khalique Ahmed authored
-
Paul Fultz II authored
wrapped in an any_ptr class so the type can be checked at runtime for a mismatch.
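As a rough illustration of the idea in the message above (a pointer wrapper that remembers its type so a mismatch is caught at runtime), here is a minimal sketch. The class name `checked_ptr` and its interface are hypothetical and are not taken from the any_ptr class the commit refers to.

```cpp
#include <stdexcept>
#include <string>
#include <typeindex>
#include <typeinfo>

// Hypothetical sketch: an untyped pointer that records the type it was built from.
class checked_ptr
{
    void* ptr             = nullptr;
    std::type_index type_ = typeid(void);

    public:
    checked_ptr() = default;

    // Remember the static type of the pointee alongside the raw pointer
    template <class T>
    explicit checked_ptr(T* p) : ptr(p), type_(typeid(T))
    {
    }

    // Return the pointer only if the requested type matches the stored one
    template <class T>
    T* get() const
    {
        if(std::type_index(typeid(T)) != type_)
            throw std::runtime_error(std::string("type mismatch: stored ") + type_.name() +
                                     ", requested " + typeid(T).name());
        return static_cast<T*>(ptr);
    }
};
```

With this sketch, `checked_ptr p(&x)` for an `int x` allows `p.get<int>()`, while `p.get<float>()` throws instead of silently reinterpreting the memory.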
-
- 24 Feb, 2022 1 commit
-
-
Paul Fultz II authored
Make doc/CMakeLists.txt standalone. Switch to use rocm-cmake modules for document generation. Add CONFIGURE_DEPENDS to file(GLOB) so it will update without an explicit cmake run. Add STRINGS property for build type to make it easier to switch build types with ccmake. Various fixes and improvements.
-
- 09 Feb, 2022 5 commits
-
-
Paul Fultz II authored
There is now a MIGRAPHX_DISABLE_POINTWISE_FUSION environment variable to disable it.
-
Khalique Ahmed authored
-
Khalique Ahmed authored
-
Khalique Ahmed authored
-
Khalique Ahmed authored
-
- 08 Feb, 2022 5 commits
-
-
Paul Fultz II authored
This causes incorrect memory coloring, which was causing the accuracy failures in the vision models when enabling the pointwise fusions. Resnet50, inceptionv3 and inceptionv4 now verify in the driver.
-
Paul Fultz II authored
Enforce types to avoid compilation error in pointwise fusions. This fixes a compile failure: gpt-2, fp16 on Navi.
-
Khalique Ahmed authored
-
Khalique Ahmed authored
-
Khalique Ahmed authored
-
- 04 Feb, 2022 2 commits
-
-
Khalique Ahmed authored
-
Khalique Ahmed authored
-
- 31 Jan, 2022 1 commit
-
-
Khalique Ahmed authored
-
- 28 Jan, 2022 1 commit
-
-
Paul Fultz II authored
* Enable auto vectorization
* Handle vector types with convert function
* Don't vectorize when it will cause problems with preload
-
- 27 Jan, 2022 1 commit
-
-
Umang Yadav authored
Allow non-standard shapes for the arg ops. Non-standard shapes include broadcast, slice and transpose.
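To show what supporting non-standard shapes means for an arg op, the sketch below computes an argmax through explicit strides instead of assuming a packed layout. The same function then handles a transposed or broadcast view of one buffer. The names `strided_view2d` and `argmax_last_axis` are hypothetical, the example is restricted to two dimensions, and none of it is the project's implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2-D view: element (i, j) lives at data[i * strides[0] + j * strides[1]]
struct strided_view2d
{
    const float* data;
    std::size_t lens[2];
    std::size_t strides[2];

    float at(std::size_t i, std::size_t j) const { return data[i * strides[0] + j * strides[1]]; }
};

// Index of the largest element along the last axis for each row (first maximum wins)
inline std::vector<std::size_t> argmax_last_axis(const strided_view2d& v)
{
    std::vector<std::size_t> result(v.lens[0], 0);
    for(std::size_t i = 0; i < v.lens[0]; ++i)
    {
        for(std::size_t j = 1; j < v.lens[1]; ++j)
        {
            if(v.at(i, j) > v.at(i, result[i]))
                result[i] = j;
        }
    }
    return result;
}
```

For a 2x3 row-major buffer, strides `{3, 1}` give the packed view, `{1, 3}` with lengths `{3, 2}` give its transpose, and a stride of `0` on the first axis broadcasts a single row.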
-
- 21 Jan, 2022 1 commit
-
-
Paul Fultz II authored
* Improve handling of generator expressions when getting the flags for hip
-
- 10 Jan, 2022 1 commit
-
-
Paul Fultz II authored
* Add matcher for conv_bias pointwise
* Add fusion op
-
- 09 Dec, 2021 1 commit
-
-
Shucai Xiao authored
Changed the number of threads in a block from 256 to 128. Increased the max number of blocks in the kernel from 256 to 1M. For the case where the axis is the last dimension, we removed the index computation since it is not required. With these changes, we get about a 2x speedup compared to the develop branch for the softmax op used in the BertSquad model.
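The remark about the last dimension can be illustrated on the CPU: when softmax runs over the last axis of a packed row-major buffer, each reduction is one contiguous row, so no multi-dimensional index has to be computed per element. This is a minimal reference sketch under that assumption. The function name `softmax_last_axis` is hypothetical, and it is not the GPU kernel the commit changes.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax over the last axis of a packed row-major buffer.
// `n` is the length of the last axis; data.size() is assumed to be a multiple of n.
inline void softmax_last_axis(std::vector<float>& data, std::size_t n)
{
    if(n == 0)
        return;
    const std::size_t rows = data.size() / n;
    for(std::size_t r = 0; r < rows; ++r)
    {
        // The row is contiguous, so a base pointer plus offset is all the indexing needed
        float* row = data.data() + r * n;

        // Subtract the row maximum before exponentiating for numerical stability
        const float max_val = *std::max_element(row, row + n);
        float sum           = 0.0f;
        for(std::size_t i = 0; i < n; ++i)
        {
            row[i] = std::exp(row[i] - max_val);
            sum += row[i];
        }
        for(std::size_t i = 0; i < n; ++i)
            row[i] /= sum;
    }
}
```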
-
- 08 Dec, 2021 1 commit
-
-
Paul Fultz II authored
-
- 07 Dec, 2021 1 commit
-
-
Paul Fultz II authored
simple variable rename
-
- 02 Dec, 2021 1 commit
-
-
Paul Fultz II authored
Fix pointwise compile error with half sqrt
-
- 30 Nov, 2021 2 commits
-
-
turneram authored
Fix whitespace bug in fusable_conv matcher and add unit test
-
Paul Fultz II authored
-