- 26 Apr, 2023 3 commits
- 25 Apr, 2023 2 commits
- 24 Apr, 2023 5 commits
-
-
Adam Osewski authored
* simplify karg in device/grid split-k op
* fix mk_kn_mn instances
* add more instances
* B2C with 3D grid for KSplit
* Remove unused code.
* Use default B2C (3D grid) in grid gemm v2r4r2.
* Device gemm splitk use B2C map.
* Device GroupedGemmXdlSplitKCShuffle
* Example for GroupedGemm Xdl SplitK
* Introduce Device GroupedGemmSplitK
* Fix updating kbatch size.
* Add instance mk-nk-mn
* Enable set kbatch in profiler.
* Add GGemmSplitK mk-kn-mn instances
* Add more instances & split into multiple files.
* minor fix
* tuning
* clean
* disabled failed instances
* use pipe v2
* Ignore arg on not supported arch.
* fix warning
---------
Co-authored-by: carlushuang <carlus.huang@amd.com>
Co-authored-by: Adam Osewski <aosewski@amd.com>
Co-authored-by: zjing14 <zhangjing14@gmail.com>
Co-authored-by: Jing Zhang <jizhan@amd.com>
Co-authored-by: root <root@ctr-ubbsmc15.amd.com>
-
zjing14 authored
-
rocking authored
* [What] Remove pure conv int8 instance [Why] We will never use pure int8 conv in AI, use int8 quantization instead
* Change layout
* Share the kernel parameter
* Support more type of NHWGC for group conv
* Revise client example of conv 2d, use NHWGC layout
* Add instance to cmake
* Revise layout of group conv quantization instance
* Revise layout of external api of group conv quantization
* Revise layout of group conv quantization client example
* Fix clang format
* Add comment to describe meaning of each parameter
-
ltqin authored
-
ltqin authored
-
- 22 Apr, 2023 1 commit
-
-
Illia Silin authored
* simplify karg in device/grid split-k op
* fix mk_kn_mn instances
* add more instances
* use name from tensor layout
---------
Co-authored-by: carlushuang <carlus.huang@amd.com>
-
- 21 Apr, 2023 3 commits
-
-
Illia Silin authored
* switch to the new rocm5.6 compiler and docker * fix syntax
-
ltqin authored
-
Sam Wu authored
Co-authored-by: samjwu <samjwu@users.noreply.github.com>
-
- 20 Apr, 2023 1 commit
-
-
ltqin authored
-
- 19 Apr, 2023 2 commits
- 18 Apr, 2023 3 commits
-
-
Illia Silin authored
* enable use of rocm5.5 release candidate 4
* upgrade to ROCM5.5 RC5
* try fix the PUB_KEY error, remove the cmake-data package
* upgrade to latest cmake version
* use private dockerhub repo for rocm5.5 rc5
* add missing bracket
-
ltqin authored
-
ltqin authored
-
- 17 Apr, 2023 2 commits
-
-
rocking5566 authored
-
ltqin authored
-
- 16 Apr, 2023 2 commits
-
-
Haocong WANG authored
-
Rostyslav Geyyer authored
Co-authored-by: Rosty Geyyer <rosty.geyyer@amd.com>
-
- 13 Apr, 2023 1 commit
-
-
ltqin authored
-
- 12 Apr, 2023 3 commits
-
-
ltqin authored
-
ltqin authored
Merge branch 'lib_gemm_softmax_gemm_type' of https://github.com/ROCmSoftwarePlatform/composable_kernel into lib_gemm_softmax_gemm_type
-
ltqin authored
-
- 11 Apr, 2023 5 commits
-
-
Haocong WANG authored
-
-
Sam Wu authored
-
zjing14 authored
Co-authored-by: root <root@ctr-ubbsmc15.amd.com>
-
zjing14 authored
* add a macro to turn off denorm fix by default
* expose the macro
---------
Co-authored-by: root <root@ctr-ubbsmc15.amd.com>
-
- 10 Apr, 2023 2 commits
-
-
zjing14 authored
-
rocking5566 authored
* Rename to proper naming
* Add example of groupnorm + swish
* Extract duplicate code in example
* Add groupnorm + swish instances
* Refactor instance generation, split into multiple cpp files
* Add external api and client example
* Refine profiler message
* Use ck math version of exp
* Refine problem size in example
* Add host version of exp
-
- 07 Apr, 2023 2 commits
-
-
ltqin authored
-
- 03 Apr, 2023 1 commit
-
-
ltqin authored
-
- 31 Mar, 2023 2 commits
-
-
ltqin authored
-
ltqin authored
Merge branch 'lib_gemm_softmax_gemm_type' of https://github.com/ROCmSoftwarePlatform/composable_kernel into lib_gemm_softmax_gemm_type
-