- 18 Jun, 2022 5 commits
- 17 Jun, 2022 5 commits
-
Illia Silin authored
* use pre-built docker instead of building a new one
* try docker.image.pull
* change syntax in docker.image()
* add 30 min timeout
* increase timeout to 3 hours
* move performance tests to first stage for testing
* set image variable to the new container name
* update image name
* check available images
* check available images in both places
* try different image name
* use image ID to refer to image
* run performance on gfx90a
* fix the gpu_arch labeling, add parameter
* move env vars out of stages
* add stand-alone performance script, MI200 tests, CU numbers
* dos2unix for run_perf_tests.sh
* try the new git credentials
* use env var for git credentials
* don't look up /sys/module/amdgpu/version
Co-authored-by: Chao Liu <chao.liu2@amd.com>
-
Qianfeng authored
* Remove template from Reduction operation classes and add template to their operator() and GetIdentityValue() interfaces
* Change to unary elementwise operators and the reduce_unary_operator (class for mapping) and dependent variations in all host layers
* Remove the data type template parameter from reduce_binary_operator (class for mapping) and dependent variations in host layers
* Add InMemoryDataOperatonSupportedOnDataType to check the matching between data type and InMemoryDataOperation
* Use struct-scope operator template instantiation for binary and unary element-wise operations
* Change a few more elementwise operations to use template for operator()
* Tiny correction in Normalize operator
* Add static_assert to check the data type applicability for some reduction accumulator and element-wise operations
* Correction in some examples with regard to using ReduceAccDataType
* Use static_assert for UnaryDivide
* Update merged code to use Element-wise operations and Reduction Accumulator operations correctly
* Tiny fix with regard to SetWorkSpacePointer()
-
Shaojie WANG authored
-
ltqin authored
-
rocking5566 authored
* Copy "gemm reduce" to "gemm bias add reduce"
* Implement gemm bias add reduction
* Fix compiler error due to merge from develop
* Add tensor operation for gemm + bias + add + reduce
* Add gemm_bais_add_reduce to ckProfiler
* Add c1 functor
* Refine type
* Use reduceAccDataType instead of explicitly float
* Change to use check_err()
* Do relu in float32 instead of bhalf_t, because bhalf_t is unsigned
* Refactor relu, using type_trait instead of overloading
* Rename DxsReduceAccElementwiseOperation to DxsReduceAccElementwiseOperation
* Fix denominator
* Refine naming
* Fix denominator in host
* Remove useless include header
* Use AccDataType
* Fix static_cast order
* Refine type
* [What] Remove tuple type in the base class
  [Why] The external API depends on the base class. If the base class is tied to a type, we would need many classes for different types
-
- 16 Jun, 2022 3 commits
-
Shaojie WANG authored
* add GetWorkSpaceSize to base arg and make an example on convnd_bwd_weight
* add bwd weight for bf16: init
* remove redundant compute
* use datatype and split k to check whether a workspace is used
* remove unused computation for work space size
* add some code for bfp16
* add device/grid unary op
* add unary type convert to bwd-weight example
* support bf16 splitk kernel for convnd bwd weight
* 1. remove comments. 2. add checkvalidity. 3. add gridsize computation
* add workspace size check
* fix format
* change function name
-
Chao Liu authored
-
Illia Silin authored
* use pre-built docker instead of building a new one
* try docker.image.pull
* change syntax in docker.image()
* add 30 min timeout
* increase timeout to 3 hours
* move performance tests to first stage for testing
* set image variable to the new container name
* update image name
* check available images
* check available images in both places
* try different image name
* use image ID to refer to image
* run performance on gfx90a
* fix the gpu_arch labeling, add parameter
* move env vars out of stages
* add stand-alone performance script, MI200 tests, CU numbers
* dos2unix for run_perf_tests.sh
* try the new git credentials
* use env var for git credentials
-
- 15 Jun, 2022 7 commits
- 14 Jun, 2022 5 commits
- 13 Jun, 2022 7 commits
- 11 Jun, 2022 2 commits
- 10 Jun, 2022 1 commit
-
Illia Silin authored
* use pre-built docker instead of building a new one
* try docker.image.pull
* change syntax in docker.image()
* add 30 min timeout
* increase timeout to 3 hours
* move performance tests to first stage for testing
* set image variable to the new container name
* update image name
* check available images
* check available images in both places
* try different image name
* use image ID to refer to image
* run performance on gfx90a
* fix the gpu_arch labeling, add parameter
* move env vars out of stages
* add stand-alone performance script, MI200 tests, CU numbers
-
- 09 Jun, 2022 2 commits
- 06 Jun, 2022 1 commit
-
Chao Liu authored
-
- 02 Jun, 2022 2 commits
-
Illia Silin authored
* add resnet50 test to performance tests
* add blanks before gpu_arch in log files
* add resnet50 test with N=4 and process its results
* add ROCM and HIP versions to test tables
* uncomment the sql queries
* fix script syntax in jenkinsfile
-
Shaojie WANG authored
-