- 05 Feb, 2022 1 commit
-
-
Paul authored
-
- 28 Jan, 2022 1 commit
-
-
Paul Fultz II authored
* Enable auto vectorization
* Handle vector types with convert function
* Don't vectorize when it will cause problems with preload
-
- 27 Jan, 2022 1 commit
-
-
Umang Yadav authored
Allow non-standard shapes for the arg ops. Non-standard shapes include broadcast, slice, and transpose.
-
- 26 Jan, 2022 1 commit
-
-
Paul authored
-
- 21 Jan, 2022 1 commit
-
-
Paul Fultz II authored
* Improve handling of generator expressions when getting the flags for HIP
-
- 10 Jan, 2022 3 commits
-
-
Paul Fultz II authored
* Add matcher for conv_bias pointwise
* Add fusion op
-
Paul authored
-
Paul authored
-
- 07 Jan, 2022 2 commits
- 06 Jan, 2022 3 commits
- 11 Dec, 2021 5 commits
- 09 Dec, 2021 1 commit
-
-
Shucai Xiao authored
Changed the number of threads in a block from 256 to 128. Increased the max number of blocks in the kernel from 256 to 1M. When the axis is the last dimension, removed the index computation since it is not required. With these changes, we get about a 2x speedup over the develop branch for the softmax op used in the BertSquad model.
-
- 08 Dec, 2021 1 commit
-
-
Paul Fultz II authored
-
- 07 Dec, 2021 1 commit
-
-
Paul Fultz II authored
Simple variable rename
-
- 02 Dec, 2021 1 commit
-
-
Paul Fultz II authored
Fix pointwise compile error with half sqrt
-
- 01 Dec, 2021 4 commits
- 30 Nov, 2021 2 commits
-
-
turneram authored
Fix whitespace bug in fusable_conv matcher and add unit test
-
Paul Fultz II authored
-
- 24 Nov, 2021 3 commits
-
-
Paul authored
-
Paul authored
-
Paul Fultz II authored
* Check JIT kernel files with clang-tidy
-
- 18 Nov, 2021 1 commit
-
-
Paul Fultz II authored
Do compilation in parallel
-
- 16 Nov, 2021 4 commits
- 11 Nov, 2021 1 commit
-
-
Paul Fultz II authored
This enables the pointwise fusions using the MIGRAPHX_ENABLE_POINTWISE_FUSION env variable. It's disabled by default since MIOpen fusions need to be refactored. This also adds a compile_ops pass to compile the pointwise modules. All tests except test_gpu_fast_math pass with MIGRAPHX_ENABLE_POINTWISE_FUSION=1 set.
-
- 09 Nov, 2021 2 commits
- 05 Nov, 2021 1 commit
-
-
kahmed10 authored
Move our Dockerfile from ROCm 4.3 to 4.5. Add Navi-based GPUs to the CI infrastructure.
-