- 11 Aug, 2022 19 commits
-
-
Anthony Chang authored
-
Anthony Chang authored
-
Anthony Chang authored
-
Anthony Chang authored
-
Anthony Chang authored
-
Anthony Chang authored
-
Anthony Chang authored
-
Anthony Chang authored
-
Anthony Chang authored
-
Anthony Chang authored
-
Anthony Chang authored
-
Anthony Chang authored
-
Anthony Chang authored
-
Anthony Chang authored
-
Anthony Chang authored
-
Anthony Chang authored
-
Anthony Chang authored
-
Anthony Chang authored
-
Anthony Chang authored
-
- 10 Aug, 2022 1 commit
-
-
zjing14 authored
* convnd_fwd fp16 example * update example * update example * update instance * updating refernce conv * update reference conv * update conv fwd profiler * update conv 1d and 3d instance * update include path * clean * update profiler for conv bwd data and weight * update conv bwd weight * clean * update conv example * update profiler for conv bwd weight * update ckprofiler for conv bwd data * fix reference conv bwd data bug; update conv bwd data test * update examples * fix initialization issue * update test for conv fwd * clean * clean * remove test case too sensitive to error threshhold * fix test * clean * fix build * adding conv multiple d * adding conv multiple D * add matrix padder * add gemm padding to convnd * adding group conv * update gemm multi-d * refactor * refactor * refactor * clean * clean * refactor * refactor * reorg * add ds * add bias * clean * add G * adding group * adding group * adding group * update Tensor * clean * update example * update DeviceGemmMultipleD_Xdl_CShuffle * update conv bwd-data and bwd-weight * upate contraction example * update gemm and batch gemm with e permute * fix example build * instance for grouped conv1d * update example * adding group conv instance * update gemm bilinear instance * update gemm+add+add+fastgelu instance * update profiler * update profiler * update test * update test and client example * clean * add grouped conv into profiler * update profiler * clean * add test grouped conv, update all conv test to gtest * update test * change gemm_c_permute with contraction * add grouped_contraction * add contraction in group_gemm * add example of grouped_gemm with contraction * add example of grouped_contraction_bias_e_permute * clean * fixed ds * add m3n2 m2n3 examples into gemm_bias_e_permute Co-authored-by:Chao Liu <chao.liu2@amd.com>
-
- 08 Aug, 2022 1 commit
-
-
Illia Silin authored
* allow selecting compiler version * fix typo * add Wno-deprecated flag for google tests * change git repo, fix qa log files names * change the git clone syntax * use Omkar's git credentials * try to use jenkins as git user * try using illsilin username for gerrit repo with ssh key * try new gerrit authorization * change ssh key syntax * try another way of passing ssh key to docker * add mount ssh in dockerfile * create .ssh folder * move ssh-keyscan to later * get rid of npm call * build first docker image on master * check the contents of the .ssh folder * try replacing omkars creds with gerrit creds * use open repo, clean up changes * get rid of ssh default argument
-
- 07 Aug, 2022 1 commit
-
-
Chao Liu authored
-
- 03 Aug, 2022 1 commit
-
-
Chao Liu authored
* add conv oddC * update example * update example * fix bug in example * fix bug in group conv example
-
- 02 Aug, 2022 2 commits
-
-
Adam Osewski authored
* Add int8 specialization for elementwise Add and Subtract. * CGEMM examples bf16, fp32, int8 * Add convert reference output to CDataType. * Skip BF16 data type during testing. * Lower K value to get rid of accumulation error. * Fix merge artifact. * Fix changed function name: GetElementSpaceSize() * Fix merge artifact. Co-authored-by:Adam Osewski <aosewski@amd.com>
-
Illia Silin authored
* turn on full qa only on gfx90a, use int initialization * change script syntax * update script parsing clinfo, throw exception if 0 devices * fix syntax * try using toBoolean for the QA conditions * run regular CI on MI100 only, use MI200 only for daily QA * evaluate when conditions before agent * launch QA on develop branch and update profile_reduce script * update test script * update script * remove false dependency from dockerfile * try removing rbuild completely Co-authored-by:
Chao Liu <chao.liu2@amd.com> Co-authored-by:
Chao Liu <lc.roy86@gmail.com>
-
- 29 Jul, 2022 1 commit
-
-
Chao Liu authored
* convnd_fwd fp16 example * update example * update example * update instance * updating refernce conv * update reference conv * update conv fwd profiler * update conv 1d and 3d instance * update include path * clean * update profiler for conv bwd data and weight * update conv bwd weight * clean * update conv example * update profiler for conv bwd weight * update ckprofiler for conv bwd data * fix reference conv bwd data bug; update conv bwd data test * update examples * fix initialization issue * update test for conv fwd * clean * clean * remove test case too sensitive to error threshhold * fix test * clean * fix build * adding conv multiple d * adding conv multiple D * add matrix padder * add gemm padding to convnd * adding group conv * update gemm multi-d * refactor * refactor * refactor * clean * clean * refactor * refactor * reorg * add ds * add bias * clean * add G * adding group * adding group * adding group * update Tensor * clean * update example * update DeviceGemmMultipleD_Xdl_CShuffle * update conv bwd-data and bwd-weight * upate contraction example * update gemm and batch gemm with e permute * fix example build * instance for grouped conv1d * update example * adding group conv instance * update gemm bilinear instance * update gemm+add+add+fastgelu instance * update profiler * update profiler * update test * update test and client example * clean * add grouped conv into profiler * update profiler * clean * add test grouped conv, update all conv test to gtest * update test
-
- 22 Jul, 2022 2 commits
-
-
Illia Silin authored
-
zjing14 authored
* add batched_gemm_multiD * add ds * rename file * add batched_gemm_bias example * add batch_strides into bmm_c_permute * clean * rename example_28 to example_29 Co-authored-by:Chao Liu <chao.liu2@amd.com>
-
- 21 Jul, 2022 2 commits
-
-
Illia Silin authored
* add verify flag and update scripts * replace old check_error function with the new check_err * fix syntax * remove blank spaces * remove empty line * add check_err for tensors * fix syntax * replace tensors with vectors in check_err calls * fix syntax * remove blank spaces * fix syntax * add new line at end of file * disable conv2d_bwd_weight test, add gpu check * set check_gpu using export * check GPU using runShell * add definition of runShell * fix script syntax * reduce the number of threads, add full qa option * run processing scripts in bash * fix the branch and host names in performance scripts, add chronos * replace parameterizedCron with cron * archive the perf log files * try to fix git call * pass branch and host names as arguments into scripts * fix script arguments * fix script arguments * process results on master * fix pipeline * add definition of gpu_arch * run processing scripts in docker * fix the brackets * add agent master for the processing stage * get rid of show_node_info call on master * try using mici label instead of master, disable MI100 tests for now * fix syntax * simplify container for results processing * remove node(master) from the process_results stage * put all stages in original order * change the agent label from master to mici for gfx908
-
zjing14 authored
* replace gridwise_v2r3 with multiD * adjust parameters * add instances * fixed test_grouped_gemm * fix standalone softmax race condition around blockwise reduction * fixed ci * fixed comment: remove redundant workspace * use instanceFactory * add test layout * add empty Ds * add bias example * use array * sperate examples Co-authored-by:Anthony Chang <ac.chang@outlook.com>
-
- 15 Jul, 2022 1 commit
-
-
Anthony Chang authored
-
- 13 Jul, 2022 3 commits
-
-
rocking5566 authored
* Implement layernorm kernel and deviceOp * verify gpu kernel with host code * 1. Separate gamma aand beta from affine 2. Check if argument is valid * clean * Sync the naming * Support sweep once mode if we can put k dimension data inside one block * [What] Get length from upper length. [Why] if we get length directly, we may get length after padding. * We only use one block in K dimension. Hence, we can simplify the indexing of global R/W. * Use 1d descriptor for gamma and beta * Add accElementwiseOp * Extract layernorm host code * Support different YVectorDim in GridwiseLayernorm * Rename XSrcVectorDim to XYSrcVectorDim. Because we use same parameter in deviceOp * Gamma and beta can share the VGPR. * Add test for fp32 and fp16 * Fix bug of concurrency and add test case which may fail orignally * Propagate NaN for layernorm Co-authored-by:Chao Liu <chao.liu2@amd.com>
-
Daming Feng authored
-
Illia Silin authored
* adding scripts for full perf test suite * uncomment the sql queries * fix typo and chmod a+x for scripts * dos2unix for all new scripts * disable verification in full performance test * fix reduction scripts, add gfrouped_gemm hotfix * fix the grouped_gemm hotfix and only run reduction for fp16 * change compiler flag syntax * fix syntax * add predefinition of dockerArgs * avoid redefinitions of dockerArgs * add blank space at the end of dockerArgs * try to build with release compiler * adding spaces inside if condition * limit the number of threads for building 9110 compiler * change the way HIP_CLANG_PATH is set * remove the export command * change the conditional ENV syntax * set HIP_CLANG_PATH at docker run time * update scripts for full qa * enable the sql write query * fix typo * remove a comment from a script
-
- 08 Jul, 2022 2 commits
-
-
Po Yen Chen authored
* format * improving pipeline * fix typo * format * adding thread group * adding thread group * adding thread group * adding gemm pipeline * tweak * refactor * refactor * add missing type convert * refactor * refactor * refactor * clean * fix build * refactor * format * clean up * use remove_cvref_t * clean * use pipeline_v2 for gemm kernel * Remove inconsistent indent * Fix compilation errors due to incomplete merge process * Add missing include directives * Fix compilation errors in currently unused files * Add license in newly added files * Re-format touched files by clang-format-10 * Fix wrong template argument count of DeviceGemm<> * Use language construct to choose between types * Use language construct to choose GEMM example instance * Fix compilation error due to interface change * Re-use type alias to avoid duplication * Unify type alias usage in source file * Only use v2 pipeline in one gridwise GEMM type * Remove no-longer used include directives * Add static_assert() to check pipeline type requirements * Revert "Add static_assert() to check pipeline type requirements" This reverts commit f0985f0a132671a1caaea92810c9f30dcf062bde. * clean * clean * clean * clean Co-authored-by:
Chao Liu <chao.liu2@amd.com> Co-authored-by:
shaojiewang <wsjmessi@163.com>
-
Shaojie WANG authored
* add conv1d/3d bwd weight instances * add profiler code
-
- 07 Jul, 2022 1 commit
-
-
Chao Liu authored
* adding contraction * add contraction example * update examle * update example * format * update readme * clean header * clean header * contraction with multiple D * rename * fix naming issue; add instances for contraction+bilinear * change assumed virtual layout of contraction; add client example * update example * update * contraction+scale * use type_convert * rename
-
- 06 Jul, 2022 1 commit
-
-
zjing14 authored
* init commit * add c_permute * add mnk padding * fixed comments * Fixed comments Co-authored-by:Chao Liu <chao.liu2@amd.com>
-
- 02 Jul, 2022 1 commit
-
-
Chao Liu authored
* refactor * update example * update example * gemm bilinear * clean * update
-
- 01 Jul, 2022 1 commit
-
-
guangzlu authored
* modified grouped gemm addressing method * modified addressing method in device_grouped_gemm_xdl.hpp Co-authored-by:
root <root@dc-smc-13.amd.com> Co-authored-by:
Chao Liu <chao.liu2@amd.com>
-