Commits · 35e7bb5ba22ffd5ab0d827b08e6dbabf4dd3e055 · gaoqiong / composable_kernel_ROCM

05 Aug, 2024 5 commits
- Merge branch 'develop' into jizhan/enable_bf16_atomic_add · 35e7bb5b
  Illia Silin authored Aug 05, 2024
  
  35e7bb5b
- fix clang format · b3866b02
  illsilin authored Aug 05, 2024
  
  b3866b02
- add --offload-compress compiler flag (#1433) · 7f57b2e0
  Illia Silin authored Aug 05, 2024
```
* add --offload-compress compiler flag

* only apply the --offload-compress flag to the ckProfiler

* move the --offload-compress flag back to main cmake file

* add offload-compress to target compile option of ckProfiler

---------
Co-authored-by: carlushuang <carlus.huang@amd.com>
```
  7f57b2e0
- Merge branch 'develop' into jizhan/enable_bf16_atomic_add · 7e62f118
  Illia Silin authored Aug 05, 2024
  
  7e62f118
- [CI][Jenkins] delete CI docker container upon exit (#1437) · f31ba04a
  Illia Silin authored Aug 05, 2024
  
  f31ba04a
02 Aug, 2024 5 commits
- add guards · 7d69eb3b
  Jing Zhang authored Aug 01, 2024
  
  7d69eb3b
- clean · a1cd282e
  Jing Zhang authored Aug 01, 2024
  
  a1cd282e
- Merge branch 'jizhan/enable_bf16_atomic_add' of... · f6fdb74b
  Jing Zhang authored Aug 01, 2024
```
Merge branch 'jizhan/enable_bf16_atomic_add' of github.com:zjing14/composable_kernel into jizhan/enable_bf16_atomic_add
```
  f6fdb74b
- clean · 79ac8751
  Jing Zhang authored Aug 01, 2024
  
  79ac8751
- Merge branch 'develop' into jizhan/enable_bf16_atomic_add · 73de444f
  Illia Silin authored Aug 01, 2024
  
  73de444f
01 Aug, 2024 2 commits
- Add compiler flags for ROCm versions 6.2+ (#1429) · d311c953
  Illia Silin authored Aug 01, 2024
```
* add compiler flags to fix compiler issues

* fix typo.

* disable test_smfmac_op on all devices except gfx942

* specify full path to compiler in CI
```
  d311c953
- Merge branch 'develop' into jizhan/enable_bf16_atomic_add · 2df33268
  zjing14 authored Jul 31, 2024
  
  2df33268
31 Jul, 2024 20 commits
- format · ff47f28c
  Jing Zhang authored Jul 31, 2024
  
  ff47f28c
- fixed naming · 8d74dcac
  Jing Zhang authored Jul 31, 2024
  
  8d74dcac
- format · bbb29a9d
  Jing Zhang authored Jul 31, 2024
  
  bbb29a9d
- add ckProfiler · 32380a27
  Jing Zhang authored Jul 31, 2024
  
  32380a27
- format · 1675a341
  Jing Zhang authored Jul 31, 2024
  
  1675a341
- Merge branch 'jizhan/enable_bf16_atomic_add' of... · 6f8858bf
  Jing Zhang authored Jul 31, 2024
```
Merge branch 'jizhan/enable_bf16_atomic_add' of github.com:zjing14/composable_kernel into jizhan/enable_bf16_atomic_add
```
  6f8858bf
- enabled splitk_gemm_multi_d · ed2d5e40
  Jing Zhang authored Jul 31, 2024
  
  ed2d5e40
- Update gtest.cmake · 65fb572d
  zjing14 authored Jul 31, 2024
  
  65fb572d
- add guards · 0fff2a66
  Jing Zhang authored Jul 31, 2024
  
  0fff2a66
- clean · 35e61bf6
  Jing Zhang authored Jul 31, 2024
  
  35e61bf6
- clean · f7120342
  Jing Zhang authored Jul 31, 2024
  
  f7120342
- clang-format-12 · f5ea85f3
  Jing Zhang authored Jul 31, 2024
  
  f5ea85f3
- format · c70aacd3
  Jing Zhang authored Jul 31, 2024
  
  c70aacd3
- added bf16 atomic_add · f9b8a5d0
  Jing Zhang authored Jul 31, 2024
  
  f9b8a5d0
- fixed global_atomic_add · b0f295cb
  Jing Zhang authored Jul 31, 2024
  
  b0f295cb
- replace buffer_atomic with global_atomic · 895e8c40
  Jing Zhang authored Jul 31, 2024
  
  895e8c40
- Update doc requirements (#1423) · 6648fd3b
  Sam Wu authored Jul 31, 2024
  
  6648fd3b
- [HotFix] Fixed a typo in profile_gemm_multiply_multiply (#1425) · f31e8dfa
  zjing14 authored Jul 31, 2024
```
* fixed a typo

* clean

---------
Co-authored-by: Jing Zhang <jizhan@fb.com>
```
  f31e8dfa
- Codegen: isSupportedArgument check (#1417) · d32997a7
  arai713 authored Jul 31, 2024
```
* added isSupportedArgument check into codegen device op

* adding function call

* remove commented code
```
  d32997a7
- workaround rocm-6.2 compiler issue (#1421) · b3f86e79
  carlushuang authored Jul 31, 2024
  
  b3f86e79
30 Jul, 2024 2 commits
- add docker for rocm6.2_rc4 compiler (#1424) · b527cad4
  Illia Silin authored Jul 30, 2024
  
  b527cad4
- Revert Revert Support access per groups and filter2x3 in grouped conv fwd (#1382) (#1406) (#1415) · 33b399cc
  Bartłomiej Kocot authored Jul 30, 2024
  
  33b399cc
26 Jul, 2024 2 commits

- Bump rocm-docs-core from 1.6.0 to 1.6.1 in /docs/sphinx (#1420) · b9ba5b26
  dependabot[bot] authored Jul 26, 2024

Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.6.0 to 1.6.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.6.0...v1.6.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

b9ba5b26

- Introduce cmake USE_GLIBCXX_ASSERTIONS option (#1404) · 733f33af
  trixirt authored Jul 25, 2024

A standard option in Fedora packaging that is used to check
the correctness of c++ use of the standard c++ library.
Signed-off-by: Tom Rix <trix@redhat.com>
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>

733f33af

25 Jul, 2024 2 commits

- Add rotating buff for gemm_multi_d (#1411) · 105bd708
  zjing14 authored Jul 25, 2024

* add rotating_buff for gemm_multi_d

* format

* Update flush_cache.hpp

* Update gtest.cmake

---------
Co-authored-by: Jing Zhang <jizhan@fb.com>
Co-authored-by: Haocong WANG <haocwang@amd.com>

105bd708

- Bump rocm-docs-core from 1.5.1 to 1.6.0 in /docs/sphinx (#1416) · 1208082e
  dependabot[bot] authored Jul 24, 2024

Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.5.1 to 1.6.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.5.1...v1.6.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

1208082e

24 Jul, 2024 2 commits

- Adding more instances of grouped convolution 3d forward for FP8 with... · 4a8a1bef
  Andriy Roshchenko authored Jul 24, 2024

Adding more instances of grouped convolution 3d forward for FP8 with ConvScale+Bias element-wise operation. (#1412)

* Add CMakePresets configurations.

* Add binary elementwise ConvScaleAdd and an example.

* Numerical verification of results.

Observed significant irregularities in F8 to F32 type conversions:
```log
ConvScaleAdd: float=145.000000   f8_t=160.000000    e=144.000000
ConvScaleAdd: float=97.000000   f8_t=96.000000    e=104.000000
ConvScaleAdd: float=65.000000   f8_t=64.000000    e=72.000000
```

* Implemented ConvScaleAdd + Example.

* Add ConvScale+Bias Instances

* Add Client Example for ConvScale+Bias

* Fix number of bytes in an example..

* Cleanup.

4a8a1bef

- Add support for half_t and bfloat to reduction operations (#1395) · ffabd70a
  Bartłomiej Kocot authored Jul 24, 2024
```
* Add support for half_t and bfloat to reduction operations

* Fix bhalf convert

* Next fix bf16
```
ffabd70a