- 02 Jun, 2022 1 commit
-
-
turneram authored
-
- 31 May, 2022 1 commit
-
-
turneram authored
-
- 26 May, 2022 1 commit
-
-
Paul Fultz II authored
* Upgrade to cppcheck 2.8
-
- 25 May, 2022 4 commits
- 24 May, 2022 3 commits
-
-
Paul Fultz II authored
* Improve applicable batched gemms for bert
-
Paul Fultz II authored
Remove std references in runtime compilation, since these are not available when using hiprtc and the headers may not be available on the system.
-
Paul Fultz II authored
* Fuse gemm add with pointwise fusions
-
- 23 May, 2022 2 commits
- 20 May, 2022 6 commits
- 19 May, 2022 1 commit
-
-
Paul authored
-
- 18 May, 2022 1 commit
-
-
Paul authored
-
- 17 May, 2022 9 commits
- 12 May, 2022 3 commits
- 11 May, 2022 5 commits
-
-
Paul Fultz II authored
Fuse layernorm and add triadd_layernorm fusion. This is a prep performance booster.
-
Paul authored
-
Paul authored
-
Paul authored
-
Paul authored
-
- 10 May, 2022 2 commits
- 09 May, 2022 1 commit
-
-
Paul Fultz II authored
Improves performance for add_gelu. In bert it is 4x faster, and for mul_add it is 50% faster than what we currently have.
-