- 01 Oct, 2024 4 commits
-
-
Illia Silin authored
-
Illia Silin authored
* add missing vector header * Re-format header using remod.py --------- Co-authored-by:Po Yen, Chen <PoYen.Chen@amd.com>
-
Po Yen Chen authored
* Use same layout for o_acc and o tensor * Use better param names in partitioner * Remove redundant kargs 'max_seqlen_q' * Use better param names in splitkv kernel * Add comment for additional kernel arguments * Sync empty loop early return logic between pipelines * Pass more arguments to cmake in scripts * Align backslashes * Fix wrong o_acc tensor view strides * Change o_acc layout if o_perm=0 * Handle whole row masked via attn_bias * Use vector width = 1 for o_acc * Use more even split sizes
-
M.Emin Ozturk authored
* complex type contraction * bug fix * update * Tensor Contraction Complex Data Type is working * 4D Kernel * some change * validation check in progress * validation issue * fp32 verification error is fixed * fp32 and fp64 are done * remove old files * remove cmake files * remove cmake files * Readme * img verification * CMakeList * number changed --------- Co-authored-by:
Illia Silin <98187287+illsilin@users.noreply.github.com> Co-authored-by:
Emin Ozturk <emin.ozturk@utah.edu>
-
- 27 Sep, 2024 1 commit
-
-
Bartłomiej Kocot authored
* [CK_TILE] Image to Column kernel * Fixes * Vector loads and stores * Fixes * Fixes * change test dir name
-
- 26 Sep, 2024 1 commit
-
-
Dan Yao authored
* add barriers * tail bias barriers * adjust bf16/hd256 tol * continue adjust bf16/hd256 tol
-
- 25 Sep, 2024 3 commits
-
-
Illia Silin authored
* fix clang20 compilation errors for gfx90a * fix clang20 compilation errors for gfx11 targets
-
Illia Silin authored
-
dependabot[bot] authored
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.1 to 1.8.2. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.8.2/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.1...v1.8.2) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by:
dependabot[bot] <support@github.com> Co-authored-by:
dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
- 24 Sep, 2024 1 commit
-
-
BrianHarrisonAMD authored
* Add additional instances to device_mha_instance * Add comment to describe what receipt 3 option filters --------- Co-authored-by:Po Yen Chen <PoYen.Chen@amd.com>
-
- 23 Sep, 2024 1 commit
-
-
Illia Silin authored
* add an option to build CK with legacy dockers * change the custom docker settings * add environment variable for custom docker * use a new variable for legacy docker name * new way to pass docker names for legacy OS * add legacy docker check in the Build_CK function * change groovy syntax * add a check for legacy docker in getDockerImage * make sure the legacy docker name is not empty * remove the dumb-init call * disable the tests in legacy OS dockers * disable tests in legacy dockers * use a different way to disable tests in legacy dockers * rearrange the CI stages for legacy OS * use different way to disable tests in legacy dockers * update LD_LIBRARY_PATH for legacy dockers and add cron job * update LD_LIBRARY_PATH at docker launch * change the syntax for setting LD_LIBRARY_PATH
-
- 22 Sep, 2024 1 commit
-
-
Po Yen Chen authored
-
- 20 Sep, 2024 2 commits
-
-
Bartłomiej Kocot authored
* Support NGCHW in grouped conv fwd * Remove not needed variable * Fixes
-
Adam Osewski authored
The dynamic buffer does not support fp8 in the `Update` operation, so fp8 does not support `InMemoryDataOperation::Add`.
-
- 18 Sep, 2024 2 commits
-
-
Thomas Ning authored
* Support the N dimension padding * Finished the padding feature for different dimensions of K
-
dependabot[bot] authored
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.0 to 1.8.1. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.8.1/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.0...v1.8.1) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by:
dependabot[bot] <support@github.com> Co-authored-by:
dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
- 17 Sep, 2024 3 commits
-
-
Illia Silin authored
* add image for rocm6.3_rc1 * fix deb package url
-
aledudek authored
* Extend pool3d fwd avg, max operations by f8_t, int8_t types * Pack MaxPool3dFwd params together * Fix MaxPool3dFwd AVG instances * Decrease verification precision for bf16 * Adjust tests + review changes * Adjust threshold for F8 * Adjusted compute types for MAX op instances * Fix ComputeDataType mismatch in tests and profiler for AVG * Fix naming from max_pool3d_fwd to pool3d_fwd * Adjust CMakeLists --------- Co-authored-by:Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
dependabot[bot] authored
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.7.2 to 1.8.0. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.7.2...v1.8.0) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by:
dependabot[bot] <support@github.com> Co-authored-by:
dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
- 16 Sep, 2024 1 commit
-
-
Mateusz Ozga authored
Co-authored-by:Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
- 14 Sep, 2024 2 commits
-
-
Thomas Ning authored
* Finished the feature of gpu verification * Add the ck_tile_gemm test in the CI/CD * add the include of tensor_layou in reference_gemm * Comment Addressed * split ck_tile fmha and gemm tests into separate stages * restructure the reference gemm * restructure a new reference_gemm api that could read the device mem --------- Co-authored-by:
carlushuang <carlus.huang@amd.com> Co-authored-by:
illsilin <Illia.Silin@amd.com>
-
bibek authored
-
- 13 Sep, 2024 3 commits
-
-
jakpiase authored
* add pool2d fp8 and int8 * minor fixes * add formatting * add reviewer suggestions * add reviewer suggestions
-
dependabot[bot] authored
Bumps [sphinxcontrib-bibtex](https://github.com/mcmtroffaes/sphinxcontrib-bibtex) from 2.6.2 to 2.6.3. - [Changelog](https://github.com/mcmtroffaes/sphinxcontrib-bibtex/blob/develop/CHANGELOG.rst) - [Commits](https://github.com/mcmtroffaes/sphinxcontrib-bibtex/compare/2.6.2...2.6.3) --- updated-dependencies: - dependency-name: sphinxcontrib-bibtex dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by:
dependabot[bot] <support@github.com> Co-authored-by:
dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Jun Liu authored
* Legacy support: customized filesystem * Update cmakefile for python alternative path * fix build issues * CK has no boost dependency * More fixes to issues found on legacy systems * fix clang format issue * Check if blob is correctly generated in cmake * fix the python issues * add a compiler flag for codegen when using alternative python * use target_link_options instead of target_compile_options --------- Co-authored-by:illsilin <Illia.Silin@amd.com>
-
- 12 Sep, 2024 2 commits
-
-
Illia Silin authored
-
Mateusz Ozga authored
* Add pool2d instance BWD AVG * Add pool2d instance BWD MAX * Fix: avg review * Fix review: part2 * Fix - enable test when type is compiled * Fix review part3
-
- 11 Sep, 2024 2 commits
-
-
jakpiase authored
* added pool2d fwd * add tests * add reviewers changes * Revert "Merge remote-tracking branch 'origin/develop' into jakpiase/pool2d_fwd_new" This reverts commit 6b2ba7ff8960b0a6ddbe30d8dac53eeb55a8597e, reversing changes made to 22c82bea0caf3e0f29399100c1bb67b8003fc042. * Revert "add reviewers changes" This reverts commit 22c82bea0caf3e0f29399100c1bb67b8003fc042. * added reviewers comments * revert some old files * add reviewers requests --------- Co-authored-by:Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
jakpiase authored
* Implemented smfmac xdlops * Added smfmac blockwise xdlops * fixes * add reviewers suggestions --------- Co-authored-by:Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
- 10 Sep, 2024 1 commit
-
-
Dan Yao authored
* fix fa bwd * revert kernelBlockSize in gemm_kernel.hpp
-
- 09 Sep, 2024 1 commit
-
-
Thomas Ning authored
-
- 07 Sep, 2024 1 commit
-
-
Thomas Ning authored
* Checkpoint: Finished with the tile example & kernel verification, working on the different matrix layout * Finished the Matrix Layout feature set up. Note: Need to modify the inner block to solve the shuffle problem in the future. * Fix: Clang Format, API fixed from fmha * fix with better naming convention * revert back the pipeline code of fmha * Fixed: Addressed the comments and merge the GEMM shape of GEMM Operator and FMHA Operator to one. * clang format with the reference_gemm file * convert the clang format with the remod.py * Changed the format and variable name of the kernel gemm_shape and partitioner --------- Co-authored-by:thomasning <thomasning@banff-cyxtera-s70-4.ctr.dcgpu>
-
- 05 Sep, 2024 2 commits
-
-
M.Emin Ozturk authored
* issue fix, one line changed for tmp * clang --------- Co-authored-by:
Emin Ozturk <emin.ozturk@utah.edu> Co-authored-by:
Harisankar Sadasivan <135730918+hsadasiv@users.noreply.github.com>
-
Haocong WANG authored
* revert ckprofiler change * temp save * Add test and test pass * test pass * Fix bug inside rotating buffer when tensor is not packed * bug fix * clang format --------- Co-authored-by:Illia Silin <98187287+illsilin@users.noreply.github.com>
-
- 04 Sep, 2024 3 commits
-
-
Rostyslav Geyyer authored
-
Illia Silin authored
* copy all fmha headers when building library * fix the rocm_install call for mha headers
-
Illia Silin authored
* locate a newer version of python when -DRHEL=ON flag is set * allow setting python version on cmake command line
-
- 03 Sep, 2024 1 commit
-
-
Bartłomiej Kocot authored
* Add support for NGCHW in grouped conv bwd wei * Comments fixes * navi fixes * Update function names
-
- 02 Sep, 2024 1 commit
-
-
Bartłomiej Kocot authored
Revert "Revert "Revert Revert Support access per groups and filter2x3 in grouped conv fwd (#1382) (#1406) (#1415)" (#1455)" (#1490) This reverts commit 5ff8eeeb.
-
- 30 Aug, 2024 1 commit
-
-
Dan Yao authored
* asm rtn * add asm rtn macro * reorder macro --------- Co-authored-by:carlushuang <carlus.huang@amd.com>
-