- 02 Jun, 2022 2 commits
- 25 May, 2022 3 commits
- 24 May, 2022 4 commits
-
-
Paul Fultz II authored
Improve applicable batched gemms for bert
-
Paul Fultz II authored
Remove std references in runtime compilation, since these are not available when using hiprtc and the headers may not be available on the system.
-
Paul Fultz II authored
Fuse gemm add with pointwise fusions
-
shivadbhavsar authored
As described in #1196, the ONNX mean parser does not work correctly for integral types. This update fixes the issue by handling integral types separately, where summation is performed before division. Additional test cases have also been added for handling integral types.
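The effect of summing before dividing can be seen in a minimal sketch. It is not the parser's actual code: the function names are hypothetical, and it assumes the earlier behavior divided each input by the input count before summing. Python's `//` stands in for integer division; for the non-negative values used here it matches C++ truncating division.

```python
def mean_divide_first(inputs):
    """Element-wise mean that divides each input by n before summing.

    Assumed to mirror the earlier behavior; integer division truncates
    every term, so the result can be too small.
    """
    n = len(inputs)
    return [sum(x // n for x in column) for column in zip(*inputs)]


def mean_sum_first(inputs):
    """Element-wise mean that sums all inputs first, then divides once."""
    n = len(inputs)
    return [sum(column) // n for column in zip(*inputs)]


# Three integer input tensors, flattened to lists for illustration.
inputs = [[1, 4], [2, 5], [3, 6]]

divide_first = mean_divide_first(inputs)  # [1, 4]: each term is truncated
sum_first = mean_sum_first(inputs)        # [2, 5]: the exact integer mean
```

Summing first trades truncation error for a larger intermediate value, so a fixed-width integer type could overflow where the divide-first form would not.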
-
- 23 May, 2022 2 commits
- 20 May, 2022 2 commits
-
-
kahmed10 authored
For clarity on kernel names found when profiling, the new names follow the order of the ops being compiled. For example: add + relu = add_relu_kernel.
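The naming scheme can be sketched as follows. The helper `fused_kernel_name` is hypothetical and not part of the codebase; it only reproduces the pattern shown in the add + relu example.

```python
def fused_kernel_name(op_names):
    """Join op names in compile order and append a "kernel" suffix.

    Hypothetical helper, modeled only on the add + relu example.
    """
    return "_".join(list(op_names) + ["kernel"])


name = fused_kernel_name(["add", "relu"])  # "add_relu_kernel"
```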
-
Paul Fultz II authored
-
- 19 May, 2022 1 commit
-
-
Paul authored
-
- 18 May, 2022 1 commit
-
-
Paul authored
-
- 17 May, 2022 3 commits
-
-
Paul authored
-
Paul authored
-
shivadbhavsar authored
Updated variable names according to #1193
-
- 12 May, 2022 3 commits
- 11 May, 2022 5 commits
-
-
Paul Fultz II authored
Fuse layernorm and add triadd_layernorm fusion. This is a prep performance booster.
-
Paul authored
-
Paul authored
-
Paul authored
-
Paul authored
-
- 10 May, 2022 3 commits
-
-
Paul authored
-
Paul authored
-
Umang Yadav authored
Expose add_literal method in C/C++ api
-
- 09 May, 2022 1 commit
-
-
Paul Fultz II authored
Improves performance for add_gelu. In bert it is 4x faster, and for mul_add it is 50% faster than what we currently have.
-
- 06 May, 2022 1 commit
-
-
Chris Austen authored
Move CI containers to ROCm 5.0.2, upgrade to 20.04, and free up more file space in GitHub Actions environments.
-
- 05 May, 2022 1 commit
-
-
Paul Fultz II authored
Fixes the #error when using cppcheck. This no longer suppresses cppcheck errors when including those errors. This also fixes the cppcheck errors that were already there.
-
- 03 May, 2022 7 commits
- 29 Apr, 2022 1 commit
-
-
turneram authored
Add ref and gpu implementations for ONNX op GatherND. Resolves #1032.
-