- 16 Feb, 2023 1 commit
-
-
aska-0096 authored
-
- 14 Feb, 2023 1 commit
-
-
aska-0096 authored
-
- 11 Feb, 2023 1 commit
-
-
aska-0096 authored
-
- 09 Feb, 2023 1 commit
-
-
aska-0096 authored
-
- 03 Feb, 2023 1 commit
-
-
aska-0096 authored
-
- 30 Jan, 2023 1 commit
-
-
aska-0096 authored
-
- 19 Jan, 2023 1 commit
-
-
aska-0096 authored
-
- 18 Jan, 2023 3 commits
- 16 Jan, 2023 2 commits
-
-
aska-0096 authored
Merge branch 'develop' of https://github.com/ROCmSoftwarePlatform/composable_kernel into navi3x_multiD_batchedGEMM
-
aska-0096 authored
-
- 13 Jan, 2023 1 commit
-
-
aska-0096 authored
-
- 12 Jan, 2023 2 commits
-
-
Illia Silin authored
* add DEBUG_LOG macro to enable/disable debug output
* fix syntax
* fix syntax again
* fix syntax one more time
* remove blank spaces
* use ifdefs
* add the Print argument
* move the definition of DEBUG_LOG to ck.hpp
* add the missing argument to Print()
-
Qianfeng authored
* Include cmath when compiling host code in math_v2.hpp
* Remove the cmath include from device_base.hpp and device_permute.hpp
-
- 11 Jan, 2023 1 commit
-
-
aska-0096 authored
-
- 19 Dec, 2022 1 commit
-
-
- 15 Dec, 2022 10 commits
-
-
zjing14 authored
* add mnk padding, support m=0
* clean code
* clean code
Co-authored-by: Rostyslav Geyyer <46627076+geyyer@users.noreply.github.com>
-
Illia Silin authored
-
aska-0096 authored
-
aska-0096 authored
-
aska-0096 authored
-
aska-0096 authored
-
-
aska-0096 authored
-
Qianfeng authored
-
Rostyslav Geyyer authored
Add padding device_gemm_add_add_fastgelu_xdl_c_shuffle instances to enable arbitrary problem size (#535)
* Add padding device_gemm_add_add_fastgelu_xdl_c_shuffle instances
* Add padding device_gemm_add_fastgelu_xdl_c_shuffle instances
* Add gemm_add_fastgelu profiler impl
* Add padding device_gemm_fastgelu_xdl_c_shuffle instances
* Add gemm_fastgelu profiler impl
-
- 14 Dec, 2022 1 commit
-
-
Rostyslav Geyyer authored
-
- 13 Dec, 2022 2 commits
- 12 Dec, 2022 2 commits
-
-
arai713 authored
* added 2d gridwise elementwise
* added 2d version of device elementwise
* added example file with updated device elementwise call
* added Cmake file
* changed NumDim into 2D
* fixed compiler issues
* fixed indexing for loop step
* fixed NumDim dimension error
* changed blockID to 2D
* updated Grid Desc
* updated kernel call
* fixed 2d thread indexing
* added dimensions for example file
* commented out unused code
* changed vector load
* removed extra code
* temporarily removing vector load on 2nd dim
* changed vector load back, still causing errors
* altered indexing
* changed isSupportedArgument for 2D
* changed indexing + do/while
* fixed isSupportedArgument
* changed dimension for debugging
* fixed
* added testing printouts
* testing change
* added variables to distribute threads through both dimensions
* testing changes
* integrated variable for thread distribution into device elementwise and added as parameter for gridwise elementwise
* removed most of the extraneous code, testing with different dimensions
* testing
* removed debugging print statements
* moved 2d elementwise permute into elementwise permute directory
* fixed formatting
* removed debugging comments from threadwise transfer
Co-authored-by: Jing Zhang <jizhan@amd.com>
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
-
aska-0096 authored
-
- 09 Dec, 2022 3 commits
- 08 Dec, 2022 1 commit
-
-
Illia Silin authored
* apply new K-dimension check in gemm_xdl_cshuffle
* add K-dim check to gemm_xdl and batched_gemm_xdl
* fix syntax
* fix syntax
* clean up the debug output
-
- 07 Dec, 2022 3 commits
-
-
Po Yen Chen authored
* Use smaller tensor size in test
* Use an even smaller tensor size
* Touch only failing test case inputs
-
Rostyslav Geyyer authored
Co-authored-by: Rosty Geyyer <rosty.geyyer@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
-
guangzlu authored
Co-authored-by: Chao Liu <chao.liu2@amd.com>
-
- 06 Dec, 2022 1 commit
-
-
Illia Silin authored
* ignore .git folder when doing clang-format
* fix syntax
* add backslashes before quotes
* add path filter for several extensions
-