- 25 Jul, 2023 13 commits
-
-
rocking authored
-
rocking authored
-
rocking authored
-
rocking authored
-
rocking authored
-
rocking authored
-
rocking authored
-
rocking authored
-
rocking authored
-
rocking authored
-
rocking authored
-
rocking authored
-
ltqin authored
* first change bias load
* add bias dim and scalar vector parameter
* make CDE0BlockTransferSrcVectorDim not work
* changes to instance
* add limit for CDE0BlockTransferSrcScalarPerVector
-
- 21 Jul, 2023 4 commits
-
-
Illia Silin authored
-
Illia Silin authored
-
rocking authored
-
Bartłomiej Kocot authored
-
- 20 Jul, 2023 2 commits
- 18 Jul, 2023 3 commits
-
-
Bartłomiej Kocot authored
* Grouped 3d conv backward data support
* Fix comments
-
Rostyslav Geyyer authored
-
Illia Silin authored
* allow building CK for specific data types
* add CI build and test stage on Navi3x without some int8 instances
* add missing gemm fp16 instances
* add the changes to the missed cmake file
* add empty lines at end of source files
* Do not build quantization client example on navi3 in CI
* disable batched_gemm_multi_d_int8 instances with DTYPES
* disable device_conv2d_bwd_data_instance with DTYPES
* fix ckprofiler for conv_bwd_data for int8
* properly isolate the conv_bwd_data int8 instances
* remove empty line
-
- 17 Jul, 2023 2 commits
-
-
Illia Silin authored
* check if gpu_targets are supported by compiler
* set default list of targets and filter for them
-
rocking authored
-
- 15 Jul, 2023 1 commit
-
-
arvindcheru authored
* Disable Werror to ignore xnack+ warnings
-
- 14 Jul, 2023 1 commit
-
-
rocking authored
-
- 12 Jul, 2023 1 commit
-
-
Bartłomiej Kocot authored
* Support NHWGC conv2d_bwd_weight
* Fix client example
* Fix client example
* Fix comments
* Redesign grouped_conv_bwd_weight instances
* Clang format fix
---------
Co-authored-by: zjing14 <zhangjing14@gmail.com>
-
- 10 Jul, 2023 2 commits
- 07 Jul, 2023 3 commits
-
-
rocking authored
-
rocking authored
-
Illia Silin authored
-
- 06 Jul, 2023 8 commits
-
-
rocking authored
-
rocking authored
-
rocking authored
-
rocking authored
-
rocking authored
-
Adam Osewski authored
* Add basic setup for precommit
* Update README.md with instructions on installing precommit hooks
---------
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
Co-authored-by: Bartlomiej Wroblewski <bwroblewski10@gmail.com>
-
Po Yen Chen authored
* Move source file into sub-directories
* Add missing include directive
* Split DeviceGemmXdl<> fp16 instances
* Fix format
* Remove unnecessary CMakeLists.txt
* Add macros to toggle new features
* Remove debug message
* Turn off GEMM v2 pipeline optimization by default
* Fix format
* Extract duplicated string as list
* Enlarge indent in CMakeLists.txt
-
Qianfeng authored
* Use dim 0 as faster dim for writing mean/var/count workspace in batchnorm multiblock method [performance]
* Add CountDataType as template parameter in blockwise_welford
* Add utility/get_shift.hpp
* Add BatchNorm multiblock single-kernel implementation
* Add smem inline assembly based implementation of gms_init/gms_barrier/gms_reset for gfx90a
* Renaming in device_batchnorm_forward_impl.hpp
* Tiny fix in the batchnorm_fwd profiler
* Revert "Add smem inline assembly based implementation of gms_init/gms_barrier/gms_reset for gfx90a" This reverts commit d16d00919c43f10759e7b4e4d112125221ed9064.
* Use the old two-kernel batchnorm multiblock method for gfx1030
* Use the old two-kernel batchnorm multiblock method for gfx908
* Use the single-kernel batchnorm multiblock method only for gfx90a
* Remove get_wave_id() from utility/get_id.hpp since it is not used
* Set true for testing running mean/variance and saving mean/invvariance in the examples
* Fix copyright wording
* Remove unneeded include in utility/get_id.hpp
* Add comments to workgroup_synchronization.hpp
* Remove unused code in gridwise_multiblock_batchnorm_forward.hpp
* Renaming in the kernels
* Remove unused kernel file
-