The Composable Kernel (CK) library provides a programming model for writing performance-critical
kernels for machine learning workloads across multiple architectures (GPUs, CPUs, etc.). The CK library
uses general-purpose kernel languages, such as HIP C++.
CK uses two concepts to achieve performance portability and code maintainability:
* A tile-based programming model
* Algorithm complexity reduction for complex machine learning (ML) operators. This uses an innovative
  technique called *Tensor Coordinate Transformation*.
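The text above names the technique without defining it. As a rough illustration only, the sketch below assumes that a coordinate transformation means describing one tensor as a differently shaped logical view of another through index arithmetic, so that a complex operator can reuse a simpler algorithm. Here a 1-D convolution is expressed as a matrix-vector product over a 2-D view of the input. `EmbedView` and `conv1d_via_view` are hypothetical names and are not part of CK's API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; none of these names come from CK.
// A 2-D logical view over a 1-D input: coordinate (wo, x) of the view
// maps to index wo + x of the underlying data. No data is copied.
struct EmbedView {
    const std::vector<float>& input; // underlying 1-D tensor of length W
    std::size_t filter_len;          // X, assumed to satisfy 1 <= X <= W

    std::size_t rows() const { return input.size() - filter_len + 1; } // Wo
    std::size_t cols() const { return filter_len; }                    // X

    // The coordinate transformation: (wo, x) -> wo + x.
    float at(std::size_t wo, std::size_t x) const { return input[wo + x]; }
};

// With the view in place, a "valid" 1-D convolution (no padding, stride 1)
// reduces to a plain matrix-vector product over the view.
std::vector<float> conv1d_via_view(const std::vector<float>& input,
                                   const std::vector<float>& filter) {
    EmbedView view{input, filter.size()};
    std::vector<float> out(view.rows(), 0.0f);
    for (std::size_t wo = 0; wo < view.rows(); ++wo) {
        for (std::size_t x = 0; x < view.cols(); ++x) {
            out[wo] += view.at(wo, x) * filter[x];
        }
    }
    return out;
}
```

The point of the sketch is that the loop nest never mentions convolution: all operator-specific logic lives in the index mapping inside `at`.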
You may need to clean up the build folder and repeat the `cmake` and `make` steps to take
advantage of sccache during subsequent builds.
## Using CK as a pre-built kernel library
You can find instructions for using CK as a pre-built kernel library in [client_example](/client_example).
## Contributing to CK
When you contribute to CK, make sure you run `clang-format` on all changed files. We highly
recommend using git hooks that are managed by the `pre-commit` framework. To install hooks, run:
```bash
sudo script/install_precommit.sh
```
With this approach, `pre-commit` adds the appropriate hooks to your local repository and
automatically runs `clang-format` (and possibly additional checks) before any commit is created.
If you need to uninstall hooks from the repository, run the following command:
```bash
script/uninstall_precommit.sh
```
If you need to temporarily disable pre-commit hooks, you can add the `--no-verify` option to the
`git commit` command.
## Caveat
**Kernel Timing and Verification**: CK's own kernel timer warms up the kernel once, and then runs it
multiple times to get the average kernel time. For kernels that use atomic add, this causes the
output buffer to be accumulated multiple times, which makes verification fail. To work around this, don't
use CK's own timer and do verification at the same time.
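The accumulation problem can be reproduced on the host with a minimal sketch. It assumes a timer that performs one warm-up launch followed by `nrepeat` timed launches, as described above. `accumulate_kernel`, `time_kernel`, `run_once`, and `run_timed` are hypothetical stand-ins written for this illustration; they are not CK functions.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical stand-in for a kernel that writes its result with atomic add:
// it adds onto whatever the output already holds instead of overwriting it.
void accumulate_kernel(const std::vector<int>& in, std::atomic<int>& out) {
    for (int v : in) {
        out.fetch_add(v);
    }
}

// Hypothetical timer: one warm-up launch, then nrepeat timed launches.
// Returns the average time per launch in milliseconds.
template <typename F>
double time_kernel(F&& launch, int nrepeat) {
    launch(); // warm-up run, not timed
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < nrepeat; ++i) {
        launch();
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count() / nrepeat;
}

// Single launch, no timing: the output holds the expected sum.
int run_once(const std::vector<int>& in) {
    std::atomic<int> out{0};
    accumulate_kernel(in, out);
    return out.load();
}

// Timed launch: the kernel runs (1 + nrepeat) times against the same output,
// so the result is (1 + nrepeat) times the expected sum.
int run_timed(const std::vector<int>& in, int nrepeat) {
    std::atomic<int> out{0};
    time_kernel([&] { accumulate_kernel(in, out); }, nrepeat);
    return out.load();
}
```

With the input `{1, 2, 3}`, `run_once` yields 6, while `run_timed` with 10 repeats yields 66, so a verification step that compares the output against a reference value of 6 fails after timing.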
CK's own timer and verification in each example and ckProfiler can be enabled or