- 07 Aug, 2024 2 commits
-
-
Illia Silin authored
-
Illia Silin authored
* run ck_tile benchmarks after the smoke tests and store logs
* change the path of fmha benchmark logs
* change the way of stashing ck_tile fmha logs
* prevent the errors in stages where no logs are generated
* fix the ck_tile fmha log names and headers
* generate the fmha performance logs in the root folder
* change jenkins script arguments format
* use exact file names for stashing
* modify scripts to process FMHA performance results
* unstash FMHA logs before parsing them
-
- 05 Aug, 2024 1 commit
-
-
Illia Silin authored
-
- 01 Aug, 2024 1 commit
-
-
Illia Silin authored
* add compiler flags to fix compiler issues
* fix typo.
* disable test_smfmac_op on all devices except gfx942
* specify full path to compiler in CI
-
- 11 Jul, 2024 2 commits
-
-
Illia Silin authored
* add ck_tile tests to CI
* build and run ck_tile tests on gfx90a and gfx942 in parallel
* fix groovy syntax
* turn ck_tile tests OFF by default
* skip creating the build folder
* build ck_tile examples with 64 threads
* build ck_tile examples with cmake-ck-dev.sh script
* add video group to docker on mi300
* do not retry to rebuild the early CI stages
* help prevent jenkins false failure
* restore cron trigger
-
Illia Silin authored
* test the cron trigger
* fix the cron jobs
* restore the list of cron jobs
-
- 04 Jul, 2024 1 commit
-
-
Jun Liu authored
-
- 27 Jun, 2024 1 commit
-
-
Illia Silin authored
-
- 03 Jun, 2024 1 commit
-
-
Illia Silin authored
-
- 28 May, 2024 1 commit
-
-
Illia Silin authored
* test library build for all supported targets
* increase the number of threads to build lib in CI to 64
-
- 10 May, 2024 1 commit
-
-
Illia Silin authored
* code clean-up
* remove the profiling output samples
-
- 01 May, 2024 1 commit
-
-
Illia Silin authored
-
- 30 Apr, 2024 1 commit
-
-
Illia Silin authored
* add a daily build for instances for gfx9;gfx10;gfx11
* fix jenkins logic for instances only build
* fix the path for instance_only build
* reduce the number of build threads to 32
-
- 18 Apr, 2024 2 commits
-
-
Illia Silin authored
* add rocm6.1 docker and make it default for CI
* fix typo
* move the rocm6.1 image into public dockerhub repo
* upgrade daily cron jobs to use rocm6.1
-
Illia Silin authored
* add rocm6.1 docker and make it default for CI
* fix typo
* move the rocm6.1 image into public dockerhub repo
-
- 14 Apr, 2024 1 commit
-
-
Haocong WANG authored
* Optimize GEMM on MI200/300: 1. Add new blockwise gemm pipeline 2. Add irregular splitk instances
* clang format + typo fix
* Fix a bug
* initial commit
* Add more instances to irregular splitk
* blkgemm pipeline v1~4 prototype
* Sanity Checked. Known issue: 1. Poor performance of splitk 2. Register spill on blkgemmpipeline v3
* Sanity and Performance fix: 1. fix a bug related to sanity in grouped b2c mapping 2. fix a bug related to sanity and performance in splitk offset
* Sanity and API update: 1. Remove prefetch stage 2. Fix valid check bug 3. Add first gemm_universal instance into ckProfiler
* Add NN instances for gemm universal
* 1. Add NT instances for gemm_universal 2. Fix a bug about Kpadding in gemm_universal
* Fix a bug regarding padding Odd K number
* remove kernel print
* Fix KPadding bug...
* Update safety check
* another try to fix kpadding..
* Sanity checked
* new instances..
* clang format+typo fix
* remove clang format script's change
* Add non-hotloop compile option
* 1. Add fp16xfp8 example 2. pull packed convert f8 from pr1150
* Some miscs.. opt and fix
* Add pipeline description docs
* Split universal gemm instance library to cut profiler compiling time
* uncomment cmakefile
* Fix a bug caused by blockwise_gemm_pipe_v2
* reduce default splitk to 1
* Add 224x256x64 tile size
* update, including: 1. Experiment pipeline 5~7 2. Optimization for pipeline 4 3. Organized instance library
* temp save
* temp save
* Permuted lds layout, sanity and function checked
* clang format
* Move OOB check from RunRead to RunWrite, for better software pipeline. TODO: agpr spill when NN layout
* clangformat
* A/B splitpipe scheduler for v3
* Fix two bugs
* bug fix
* fix a bug in oob check
* Example for mixed fp16_fp8 gemm
* Clean experimental code blocks
* Add mixed precision gemm into profiler
* tempsave
* optimize m/n major lds layout
* Add RRR GEMM mixed precision instances
* Optimize f8 matrix transpose
* Add test_gemm_universal
* A/B split schedule for blkpip v5
* Take ds_read2 into iglp scheduling scheme
* format
* fixed cmake
* Add llvm-option into CI cmake flag
---------
Co-authored-by: Jing Zhang <jizhan@amd.com>
-
- 22 Mar, 2024 1 commit
-
-
Illia Silin authored
-
- 19 Mar, 2024 1 commit
-
-
Illia Silin authored
* do not install sccache by default, only install rocm-llvm-dev for rocm6.1
* add sccache flag to docker build options
-
- 18 Mar, 2024 1 commit
-
-
Illia Silin authored
* test CK with rocm6.1 RC2
* add docker credentials for pull
* update the performance db name
* use environment variable for db name
* add rocm-llvm-dev package to ck docker
* turn off verification for daily performance runs
* do not stash ckProfiler on MI300 node
* add processing of mixed gemms to qa, fix parsing of splitk gemm logs
* fix the splitk gemm log file name
* turn the timing on for splitk gemm performance
-
- 06 Mar, 2024 1 commit
-
-
Paul Fultz II authored
* Format
* Format
* Format
* Remove const
* Use the right template
* Format
* Format
* add row/col instances
* Add missing file
* fixed
* Format
* Updates
* Format
* fixed rrr layout
* Format
* Update test and embed modules
* Restore older version
* Update year
* Set -fPIC
* Format
* Use double for isnan
* rename host folder to codegen + minor fix
* add codegen CI test
* add option to build components without building CK
* fix the groovy syntax
* fix typo
* use the correct function for the codegen stage
---------
Co-authored-by: Jing Zhang <jizha@amd.com>
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
Co-authored-by: illsilin <Illia.Silin@amd.com>
-
- 05 Mar, 2024 1 commit
-
-
Illia Silin authored
-
- 13 Feb, 2024 1 commit
-
-
Illia Silin authored
-
- 05 Feb, 2024 1 commit
-
-
Illia Silin authored
* delete dangling docker images
* fix groovy syntax
* fix groovy syntax again
* try a different way to delete dangling images
-
- 30 Jan, 2024 2 commits
-
-
Illia Silin authored
-
Illia Silin authored
-
- 26 Jan, 2024 1 commit
-
-
Illia Silin authored
-
- 24 Jan, 2024 1 commit
-
-
Illia Silin authored
* fix cppcheck errors, first pass
* fix format
* fix returned value in examples
* add macro definitions for cppcheck
* fix the profile_gemm logic
* update the gemm profiler logic
* add more definitions to cppcheck, fix couple more errors
* replace runtime error with message in device function
* fix a couple of int4 issues
* no return for fill function
* fix errors in data_types.hpp
* fix format
* fix few remaining errors
* fix errors in data_types.hpp
* fix last couple of errors in data_types.hpp
-
- 15 Jan, 2024 1 commit
-
-
Illia Silin authored
* add cppcheck to the CK CI
* fix the path to CK source for cppcheck
* fix the path to CK source for cppcheck one more time
* fix the path to CK source for cppcheck third time
* change the path to ck_cppcheck.log
* install latest cppcheck from source
* fix bug in ck.hpp and use 20 threads for cppcheck
* create a switch to turn cppcheck on and off in CI
-
- 05 Jan, 2024 1 commit
-
-
Illia Silin authored
* add docker for rocm6.0.1 rc1
* modify the path to clang for test compilers in CI
* fix the hipcc/clang path for test compilers in CI
* fix the dockerfile for older rocm versions
-
- 16 Dec, 2023 1 commit
-
-
Illia Silin authored
* upgrade to rocm6.0 compiler
* move rocm6.0 from private to public repo
* switch to testing hipTensor mainline in CI
-
- 07 Dec, 2023 1 commit
-
-
Illia Silin authored
* switch from ROCmSoftwarePlatform to ROCm org
* replace ROCmSoftwarePlatform with ROCm in few more places
-
- 06 Dec, 2023 1 commit
-
-
Illia Silin authored
* turn on -O3 compiler flag explicitly
* change cmake syntax for CI
* modify cmake line breaks in jenkinsfile
-
- 05 Dec, 2023 1 commit
-
-
Illia Silin authored
* add daily build with mainline compiler
* fix the compiler paths for ci
* remove the -flto flag
* build with clang by default
-
- 30 Nov, 2023 1 commit
-
-
Jun Liu authored
-
- 09 Nov, 2023 1 commit
-
-
Illia Silin authored
-
- 03 Nov, 2023 1 commit
-
-
Illia Silin authored
-
- 01 Nov, 2023 1 commit
-
-
Illia Silin authored
-
- 30 Oct, 2023 1 commit
-
-
Illia Silin authored
* replace ccache with sccache, pin package versions
* put ccache back temporarily to avoid breaking other CI jobs
* add sccache_wrapper.sh script
* fix the package version syntax
* fix the pymysql package issue
* run sccache_wrapper before build if ccache server found
* set the paths before calling the sccache_wrapper
* use /tmp instead of /usr/local for cache
* try using sccache --start-server instead of wrapper
* try using redis server with sccache
* define SCCACHE_REDIS
* add redis and ping packages, and redis port
* use the new sccache redis server
* do not use sccache with staging compiler
* fix the condition syntax
* add stunnel to redis
* add tunnel verification
* separate caches for different architectures
* fix syntax for the cache tag
* use double brackets for conditions
* add bash line to the script
* add a switch for sccache and only use it in build stage
* run check_host function when enabling sccache
* fix the invocation tags for sccache
* fix groovy syntax
* set the invocation tag in groovy
* disable sccache in clang-format stage
* try another syntax for invocation tags
* use local sccache server if can't connect to redis
* fix script syntax
* update README
* refresh readme
* readme updates
* remove the timing and verification caveat from readme
---------
Co-authored-by: Lisa Delaney <lisa.delaney@amd.com>
-
- 19 Oct, 2023 1 commit
-
-
Illia Silin authored
* apply the patch for dl kernels on gfx11
* build DL kernels on navi32 CI
-
- 16 Oct, 2023 1 commit
-
-
Illia Silin authored
* add a hipTensor test to CI
* use jenkins git plugin
* change hipTensor folder location in CI
* change the git method for hipTensor
* run tests using ctest
* check the hipTensor contents
* only build hipTensor on MI100/200
* pull hipTensor as zip archive
* fix jenkins syntax
* add path to the CK installation
* combine build commands into one shell
* change jenkins syntax for CK installer path
* try different syntax
* allow unzip overwrite
* fix jenkins file syntax
* remove any old versions of hipTensor before building
* add option to select hipTensor branch for testing
-