- 19 Nov, 2024 1 commit
-
-
Andriy Roshchenko authored
-
- 18 Nov, 2024 3 commits
-
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
- 17 Nov, 2024 1 commit
-
-
Illia Silin authored
Merge from public
-
- 16 Nov, 2024 1 commit
-
-
illsilin authored
-
- 15 Nov, 2024 3 commits
-
-
dependabot[bot] authored
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.3 to 1.8.4. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.8.4/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.3...v1.8.4 ) --- updated-dependencies: - dependency-name: rocm-docs-core dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by:
dependabot[bot] <support@github.com> Co-authored-by:
dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Illia Silin authored
-
Illia Silin authored
-
- 14 Nov, 2024 7 commits
-
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
* Improve test verbosity. * BUGFIX: Add missing initialization for reduction buffer * Change default initialization method Performance may be affected for fp32 and int8 examples. * Improve test verbosity * Cleanup
-
feli authored
Co-authored-by:dummycoderfe <noplydummmycoder@163.com>
-
- 13 Nov, 2024 3 commits
-
-
Illia Silin authored
-
Taylor Ding authored
-
Bartłomiej Kocot authored
* [CK TILE] Update gemm universal pipeline * Fixes * fix * Rebase
-
- 12 Nov, 2024 4 commits
-
-
Andriy Roshchenko authored
The tests take too long to complete on the emulator. Need to see if it is possible to reduce the scope of the testing to just FP8 data types.
-
Andriy Roshchenko authored
-
Illia Silin authored
-
Thomas Ning authored
* Finished the feature * Modified the test file * Test case update * addresss comment * Addressed the review comment * Fixed the CI error
-
- 11 Nov, 2024 3 commits
-
-
Illia Silin authored
-
valarLip authored
* [CK_TILE] add more stride for layernorm to support un-continuous Tensor * align CK coding style * extend strides to layernrom expample * clang-format...
-
Po Yen Chen authored
-
- 09 Nov, 2024 2 commits
-
-
dummycoderfe authored
* add moe_sorting & check ok * fix comments & typo * Run remod.py under include/ck_tile & example/ck_tile directories * format codes * fix output ci check bug * fix moe sorting readme and error commit file * use magiv div to accelerate compute * add an loop unroll for moe lds ops * add extblocksnel to set zeros for moebufs * [Ck_tile] moe set zero run ok, add size check and fix ref check * [Ck_tile]fix moe_sorting fuse set_zero remod * [Ck_tile] change name style, fix zero buffer size err, change folder * [Ck_tile] moe_sorting: fix name style * [Ck_tile] moe_sorting, remove useless params in traits * [Ck_tile] change outputtile cnt * unit_size; change output buf alloc --------- Co-authored-by:
dummycoderfe <noplydummmycoder@163.com> Co-authored-by:
Po Yen, Chen <PoYen.Chen@amd.com> Co-authored-by:
carlushuang <carlus.huang@amd.com>
-
Po Yen Chen authored
-
- 08 Nov, 2024 4 commits
-
-
Bartłomiej Kocot authored
* Add generic instances for two stage conv bwd wei * Update layout prefix
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
dummycoderfe authored
* optimze small N case using vec io and using rcp div * [Ck_tile] layernorm, add param to control fastdiv; change generate codes and test pass * [Ck_tile] fix blockSize compute in Generic2dBlockShape * [Ck_tile]fix kfastfdiv template style * [Ck_tile] layernorm, fix stype in review --------- Co-authored-by:dummycoderfe <noplydummmycoder@163.com>
-
- 07 Nov, 2024 8 commits
-
-
Andriy Roshchenko authored
-
Illia Silin authored
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
splitk gemm appears to be losing precision VS reference implementation when FP numbers are involved.
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-