- 24 Oct, 2023 1 commit
-
-
Astha Rai authored
-
- 23 Sep, 2023 1 commit
-
-
Astha Rai authored
-
- 21 Sep, 2023 1 commit
-
-
Illia Silin authored
* refactor cmake files for the tests
* refactor cmake files for examples
* fix cmake for gemm example
* fix the cmake file for all examples
* add splitting by data types in gemm_splitk instance header
* rename test to reflect only dl instances are used
* clean up CI workspace, update cmake for instances
* change the jenkinsfile syntax
* build all instances except DL on gfx11
* move workspace cleanup after stages
* clean up workspace after every stage
* isolate data types in grouped_conv_fwd header
* isolate dl instances for grouped_conv2d_fwd
* fix syntax
* fix cmake and batchnorm instances
* fix typo
* fix reduction instances
* fix grouped_conv headers
* fix syntax
* replace parsing logic for instances, replace bfp16 with bf16
* fix the client examples build
* clean up DTYPES from instances cmake files
* update the parsing logic in cmake files
* make an exception for reduction kernels
* update few remaining cmake files to handle DTYPES
* fix syntax
* fix cmake conflicts
* replace f8 with fp8 test name
* resolve conflicts for dpp instances
-
- 20 Sep, 2023 1 commit
-
-
Astha Rai authored
-
- 06 Sep, 2023 1 commit
-
-
Astha Rai authored
-
- 29 Aug, 2023 1 commit
-
-
Astha Rai authored
-
- 07 Aug, 2023 1 commit
-
-
Illia Silin authored
* properly split conv_nd_bwd_data instances
* split conv2d_fwd instance data types
* split the gemm, conv2d_fwd and batched_gemm_softmax_gemm
* split the tests by data types where possible
* filter examples by DTYPES
* split few remaining examples by DTYPES
* filter most instances by DTYPES
* add new lines at end of headers, fix grouped_gemm profiler
* fix syntax
* split the ckprofiler instances by DTYPES
* split the conv2d and quantization DL and XDL instances
* fix the splitting of conv2d DL instances
* split softmax and pool_fwd tests for fp16 and fp32 types
* fix syntax
* fix the dl_int8 quantization instances isolation
-
- 12 Dec, 2022 1 commit
-
-
arai713 authored
* added 2d gridwise elementwise
* added 2d version of device elementwise
* added example file with updated device elementwise call
* added Cmake file
* changed NumDim into 2D
* fixed compiler issues
* fixed indexing for loop step
* fixed NumDim dimension error
* changed blockID to 2D
* updated Grid Desc
* updated kernel call
* fixed 2d thread indexing
* added dimensions for example file
* commented out unused code
* changed vector load
* removed extra code
* temporarily removing vector load on 2nd dim
* changed vector load back, still causing errors
* altered indexing
* changed isSupportedArgument for 2D
* changed indexing + do/while
* fixed isSupportedArgument
* changed dimension for debugging
* fixed
* added testing printouts
* testing change
* added variables to distribute threads through both dimensions
* testing changes
* integrated variable for thread distribution into device elementwise and added as parameter for gridwise elementwise
* removed most of the extraneous code, testing with different dimensions
* testing
* removed debugging print statements
* moved 2d elementwise permute into elementwise permute directory
* fixed formatting
* removed debugging comments from threadwise transfer

Co-authored-by: Jing Zhang <jizhan@amd.com>
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
-
- 19 Oct, 2022 1 commit
-
-
arai713 authored
-
- 17 Oct, 2022 1 commit
-
-
arai713 authored
* adding tensor_permutation example folder
* fixed formatting
* adding tensor_permutation example folder
* fixed formatting
* changed deviceelementwise parameters for outscalar
* removed .swo file
* updated folder/file name
* changed function call in verification for better consistency with hostelementwise parameters
* formatted again
* fixed shape in verification function call
* changed verification function call, added definition for nhwc
* added elementwise permute example
* updated CMakeLists file in folder
* Delete CmakeLists.txt
* Delete tensor_permute.cpp
* first version of 2d gridwise_elementwise kernel
* temporary fix for stride problem
* formatting
* format
* changed directory name
* Delete gridwise_elementwise_2d.hpp
* Delete CMakeLists.txt
* Delete extra file
* delete extra file
* got rid of extraneous code
* added 2d device elementwise file
* deleted accidentally added file
* update
* stride values generalized with equations
* updated stride for output matrix
* Update CMakeLists.txt
* removed extraneous commented code
* removed shape_nchw vector, replaced with GetLength for each dimension
* changed vector load in kernel call
* removed extra space in CMake
-