- 20 Jun, 2024 2 commits
-
-
ThruptiRajLakshmanaGowda authored
* Initial Push * First Push * Fixed Clang format * Resolve merge conflict * Addressed review comments * Addressed review comments * Addressed review comments
-
Qianfeng authored
-
- 19 Jun, 2024 2 commits
-
-
zjing14 authored
-
Qianfeng authored
* Add NullBlockDropout to be used when kHasDropout is false * Change to BlockDropout::Run() for forward to reduce conditional checks * Re-format files --------- Co-authored-by: PoYen, Chen <PoYen.Chen@amd.com>
-
- 18 Jun, 2024 3 commits
-
-
Bartłomiej Kocot authored
-
jakpiase authored
* switch to universal gemm in grouped gemm tile loop * minor fixes * add reviewers comments --------- Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
Bartłomiej Kocot authored
* Fix continuous dim selection in contraction * Fixes
-
- 17 Jun, 2024 2 commits
-
-
carlushuang authored
* [CK_TILE][FA] using pk f16_f32 * correct an error
-
zjing14 authored
-
- 14 Jun, 2024 1 commit
-
-
Bartłomiej Kocot authored
* Support large tensors in grouped conv fwd * Multi ABD fixes * Fix calculate element space size
-
- 13 Jun, 2024 1 commit
-
-
Qianfeng authored
* Add insert_dummy_dep_per_dword over-loading for length 64 * Fix insert_dummy_dep_per_dword and remove over-loading for length 64 * Remove blank lines --------- Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
-
- 12 Jun, 2024 1 commit
-
-
Rostyslav Geyyer authored
* Add fp8 bf8 conv example * Add instances * Add client example * Add random scale values * Format
-
- 11 Jun, 2024 1 commit
-
-
Bartłomiej Kocot authored
-
- 10 Jun, 2024 1 commit
-
-
Rostyslav Geyyer authored
* Update the element op * Add an example * Add instances * Add a client example * make sure new instances only build on gfx9 * Update element op and its handling * Format * Update instances to take element op as an argument * Update examples to use random scale values * Format * Update client example with random scales * Format --------- Co-authored-by: illsilin <Illia.Silin@amd.com>
-
- 07 Jun, 2024 1 commit
-
-
dependabot[bot] authored
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.3.0 to 1.4.0. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.3.0...v1.4.0) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
- 05 Jun, 2024 3 commits
-
-
Bartłomiej Kocot authored
* Integrate universal gemm with conv fwd * Fix conv fwd wmma test * Fix instances * Remove direct load check
-
dependabot[bot] authored
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.2.1 to 1.3.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.2.1...v1.3.0) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Rostyslav Geyyer authored
* Add a scale op * Update the element op * Add instances * Add an example * Add a client example * Add a flag check * Revert flag check addition * Fix flag check * Update d strides in example * Update d strides in client example * Apply suggestions from code review Update copyright header Co-authored-by: Bartłomiej Kocot <barkocot@amd.com> * Move the example * Move the client example * Update element op * Update example with the new element op * Add scalar layout * Update example * Update kernel for scalar Ds * Revert kernel changes * Update element op * Update example to use scales' pointers * Format * Update instances * Update client example * Move element op to unary elements * Update element op to work with values instead of pointers * Update instances to take element op as an argument * Update examples to use random scale values --------- Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
-
- 04 Jun, 2024 2 commits
-
-
Dan Yao authored
* FA fwd dropout * FA bwd * epilogue reuse * CMakeLists update * [CK_TILE] support alibi (#1269) * add alibi support * fix code * update code based on comment * Support more hdim * fix fp8 bias * support seqlen_k=0 case * remove unused printf * fix format --------- Co-authored-by: rocking <ChunYu.Lai@amd.com> * now fwd/bwd can build * bwd alibi * add bwd validation stream_config * update generated filenames * update bwd kernel launch * CK_TILE_HOST_DEVICE in philox * Transpose -> transpose * format * format * format * Generate the instance for FA required * format * fix error in WarpGemm --------- Co-authored-by: danyao12 <danyao12> Co-authored-by: carlushuang <carlus.huang@amd.com> Co-authored-by: rocking <ChunYu.Lai@amd.com> Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com> Co-authored-by: Jing Zhang <jizhan@amd.com>
-
dependabot[bot] authored
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.2.0 to 1.2.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.2.0...v1.2.1) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
- 03 Jun, 2024 1 commit
-
-
Illia Silin authored
-
- 01 Jun, 2024 1 commit
-
-
zjing14 authored
* add f8 gemm with multiD for both row/col wise * change compute_type to fp8 * changed tuning parameters in the example * add rcr example * post-merge fix * fix * reduce init range
-
- 28 May, 2024 4 commits
-
-
Illia Silin authored
* test library build for all supported targets * increase the number of threads to build lib in CI to 64
-
dependabot[bot] authored
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.1.3 to 1.2.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.1.3...v1.2.0) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
zjing14 authored
* add f8 gemm with multiD for both row/col wise * change compute_type to fp8 * changed tuning parameters in the example * add rcr example
-
carlushuang authored
* support cmdline seqlen decode * silent print * update readme * update kernel launch 3d * update tile partitioner * fix spill for bf16 * modify based on comment * modify payload_t * fix bug for alibi mode * fix alibi test err * refactor kernel launch, support select timer * add missing file * remove useless code * add some comments
-
- 23 May, 2024 3 commits
-
-
Joseph Macaranas authored
-
Illia Silin authored
* split the gemm_multi_abd instances * update the dates
-
dependabot[bot] authored
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.1.2 to 1.1.3. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.1.2...v1.1.3) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
- 22 May, 2024 3 commits
-
-
Max Podkorytov authored
Also bundle the CK library and include files with the pip package. The package is pip-installable with `pip install git+https://github.com/tenpercent/composable_kernel@enable-pip` (substitute the repo path and branch if necessary). Testing: `myenv/bin/python3 -m ck4inductor.universal_gemm.gen_instances` (prints a list of instances); `tree myenv/lib/python3.12/site-packages/ck4inductor` (lists the sources bundled with the installed package).
-
Bartłomiej Kocot authored
* Optimize grouped conv bwd weight for small M and N * Fixes
-
Illia Silin authored
* set individual gpu targets for instances, examples, tests * fix path to hip compiler * fix path to hip compiler once more * aggregate device macros in ck_tile config header * fix the cmake logic for instances * fix clang format * add gfx900 and gfx906 to default set of targets
-
- 21 May, 2024 1 commit
-
-
Rostyslav Geyyer authored
* Move grouped conv fwd client examples * Update existing examples * Format
-
- 20 May, 2024 1 commit
-
-
Illia Silin authored
-
- 17 May, 2024 3 commits
-
-
Illia Silin authored
-
dependabot[bot] authored
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.1.1 to 1.1.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.1.1...v1.1.2) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
rocking authored
error: no viable conversion from returned value of type '__half' to function return type 'fp16_hip_t' (aka '_Float16') Co-authored-by: carlushuang <carlus.huang@amd.com>
-
- 15 May, 2024 3 commits
-
-
Illia Silin authored
-
carlushuang authored
-
jakpiase authored
* add unit tests for grouped gemm two stage * add reviewers suggestions --------- Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
-