- 07 Jun, 2023 9 commits
-
-
Alan Turner authored
-
Alan Turner authored
-
Alan Turner authored
-
Alan Turner authored
-
Alan Turner authored
-
Alan Turner authored
-
Alan Turner authored
-
Alan Turner authored
-
Alan Turner authored
-
- 06 Jun, 2023 8 commits
-
-
Alan Turner authored
Merge branch 'ck-integration-tuning' of https://github.com/ROCmSoftwarePlatform/AMDMIGraphX into ck-integration-tuning
-
Alan Turner authored
-
Alan Turner authored
-
Alan Turner authored
-
Chris Austen authored
-
Umang Yadav authored
-
Chris Austen authored
-
Umang Yadav authored
The sigmoid approximation for GeLU was introduced in #1299 for FP16. It is known to give better performance at the cost of lower accuracy. Reference: https://arxiv.org/pdf/1606.08415.pdf
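For context on the accuracy trade-off mentioned in this commit message, here is a minimal sketch comparing the exact GeLU with the sigmoid approximation x * sigmoid(1.702 * x) given in the linked paper. The function names `gelu_exact` and `gelu_sigmoid` are illustrative only and are not taken from the MIGraphX source; the sketch runs in double precision and does not reproduce FP16 behavior.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GeLU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_sigmoid(x: float) -> float:
    """Sigmoid approximation from the paper: x * sigmoid(1.702 * x)."""
    return x / (1.0 + math.exp(-1.702 * x))


if __name__ == "__main__":
    # Print both values and the absolute gap at a few sample points.
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        exact = gelu_exact(x)
        approx = gelu_sigmoid(x)
        print(f"x={x:+.1f} exact={exact:+.5f} sigmoid={approx:+.5f} "
              f"abs_err={abs(exact - approx):.5f}")
```

The gap between the two is small but not zero (on the order of 1e-2 near |x| = 2), which is consistent with the commit's note that the approximation trades accuracy for speed.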
-
- 05 Jun, 2023 1 commit
-
-
Charlie Lin authored
Clarified the documentation for find_permutation(shape): it finds the permutation that would make the shape standard.
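To illustrate what "the permutation that would make the shape standard" can mean, here is a rough sketch. It is an assumption-laden illustration, not the MIGraphX implementation: it assumes "standard" means packed row-major strides, and it assumes the permutation is obtained by ordering axes from largest to smallest stride. The names `find_permutation_sketch`, `row_major_strides`, and `permute` are hypothetical.

```python
from typing import List, Sequence


def row_major_strides(lens: Sequence[int]) -> List[int]:
    """Strides of a packed row-major layout for the given dimensions."""
    strides = [1] * len(lens)
    for i in reversed(range(len(lens) - 1)):
        strides[i] = strides[i + 1] * lens[i + 1]
    return strides


def find_permutation_sketch(lens: Sequence[int],
                            strides: Sequence[int]) -> List[int]:
    """Order axes by descending stride (ties: larger length first).

    Hypothetical helper; assumes this ordering is what makes the shape standard.
    """
    axes = range(len(strides))
    return sorted(axes, key=lambda a: (strides[a], lens[a]), reverse=True)


def permute(values: Sequence[int], perm: Sequence[int]) -> List[int]:
    """Reorder values so that output[i] = values[perm[i]]."""
    return [values[p] for p in perm]


if __name__ == "__main__":
    # An NCHW view (N=1, C=3, H=4, W=5) over memory laid out as NHWC.
    lens, strides = (1, 3, 4, 5), (60, 1, 15, 3)
    perm = find_permutation_sketch(lens, strides)
    new_lens, new_strides = permute(lens, perm), permute(strides, perm)
    print(perm, new_lens, new_strides)
    # After permuting, the strides match the packed row-major layout.
    print(new_strides == row_major_strides(new_lens))
```

In this example the permuted dimensions have strides that equal the packed row-major strides, which is the sense of "standard" assumed here.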
-
- 04 Jun, 2023 1 commit
-
-
Igor Mirosavljevic authored
-
- 02 Jun, 2023 8 commits
-
-
Paul authored
-
Paul authored
-
Paul authored
-
Paul authored
-
Paul authored
-
Paul authored
-
Alan Turner authored
-
Chris Austen authored
-
- 01 Jun, 2023 8 commits
-
-
Paul authored
-
Paul authored
-
Paul authored
-
Paul authored
-
Paul authored
-
Paul authored
-
Paul Fultz II authored
-
Umang Yadav authored
Converting to FP32: the FP16 3d-unet model accuracy matches the FP32 accuracy. Using the reduce_sum method in FP16: accuracy is ~0.9% lower than FP32 while keeping the entire model in FP16.
-
- 31 May, 2023 5 commits
-
-
Paul Fultz II authored
-
Paul authored
-
Paul authored
-
Paul authored
-
Paul authored
-