- 17 Dec, 2024 3 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
- 16 Dec, 2024 5 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
zjing14 authored
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
zjing14 authored
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
zjing14 authored
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
- 13 Dec, 2024 2 commits
-
-
zjing14 authored
-
Jing Zhang authored
-
- 11 Dec, 2024 15 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
zjing14 authored
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
zjing14 authored
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
zjing14 authored
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
zjing14 authored
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
zjing14 authored
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
Jing Zhang authored
-
zjing14 authored
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
zjing14 authored
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
zjing14 authored
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
- 06 Dec, 2024 7 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Bartłomiej Kocot authored
* Support large batch tensors in grouped conv bwd data
* Fix multiD
* fixes
* fixes
* fixes
-
Po Yen Chen authored
-
Illia Silin authored
* upgrade to rocm6.3 compiler
* Proposed solution to convnd test failures in ROCm 6.3
---------
Co-authored-by: Andriy Roshchenko <andriy.roshchenko@amd.com>
-
- 05 Dec, 2024 2 commits
-
-
jakpiase authored
* add IsSupportedArgument to gemm_kernel
* add ut and do some refactoring
* switched to ck_tile's integral_constant
-
dependabot[bot] authored
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.10.0 to 1.11.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.10.0...v1.11.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
- 04 Dec, 2024 2 commits
-
-
Mateusz Ozga authored
* Ck-tile, impl. grouped gemm
* Workspace is allocated by user, and is passed to the function
* Prepare test to new api design
* Unify GemTransKernelArgs, removing N0 param
* Add 1 to dim3 in partitioner
* Typo: gem -> gemm
---------
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
Po Yen Chen authored
* Use 'false' for highest dimension padding flags
* Update padding flag of bias
-
- 03 Dec, 2024 2 commits
-
-
Bartłomiej Kocot authored
* Add basic documentation structure
* Add terminology placeholder
* Add codegen placeholder
* Create template for each page
-
Illia Silin authored
* (2/5) bilinear gemm pass, perf bug: skip a lds has lower performance than skip b lds
* (3/5) batched gemm pass, perf bug: skip a lds has lower performance than skip b lds
* (4/5) grouped conv pass
* (5/5) attention pass, todo: debug lds perf bug
* AIT Attention API refactor (#8)
* sanity pass
* sanity pass 2
* confirm significant performance regression.
* turn on all instances
* turn off instance format
* Fix bug & tuning & format
* DML meta, self_attn+cross_attn
* sanity pass
* remove useless flag
* update tile and problem size used in AIT attention
* bug fix in grouped conv supporting check
* deprecate inline asm wmma
* Bug fix: double lds skip
* clang-format
* Fix errors in 1. example, fmha 2. gridwise pipeline 3. deviceop, fmha, change some containers from vector to array
* part2 of previous commit
* clang format
* API fix of gridwisegemmpipeline
* separate array base and vector base attention...
-
- 02 Dec, 2024 2 commits
-
-
dependabot[bot] authored
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.9.2 to 1.10.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.9.2...v1.10.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
rtmadduri authored
* LWPCK-2429: Device grouped GEMM uses Async Memcpy Resolving merge conflicts
* reverting changes to profile_grouped_gemm
* revert date change
---------
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
-