# CK batched_gemm bf16 tune

1. Install aiter:  
`cd $aiter_path`  
`python3 setup.py develop`

2. Add GEMM shapes in `aiter/configs/bf16_untuned_batched_gemm.csv`
    |**B**|**M**|**N**|**K**|
    |-----|-----|-----|-----|
    |16   |128  |1536 |7168 |
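
If you have many shapes to add, editing the CSV by hand gets tedious. The following is a minimal sketch of a helper that merges new shapes into the untuned file. It assumes the file contains only the four columns shown in the table above, with a `B,M,N,K` header row; `add_shapes` is a hypothetical name and not part of aiter.

```python
import csv
from pathlib import Path

# Assumed column order, taken from the table above.
HEADER = ["B", "M", "N", "K"]


def add_shapes(csv_path, shapes):
    """Merge (B, M, N, K) tuples into the CSV and return how many were new."""
    path = Path(csv_path)
    rows = []
    if path.exists():
        with path.open(newline="") as f:
            reader = csv.reader(f)
            next(reader, None)  # skip the header row, if any
            rows = [tuple(int(v) for v in row) for row in reader if row]

    added = 0
    for shape in shapes:
        shape = tuple(int(v) for v in shape)
        if shape not in rows:  # skip shapes that are already listed
            rows.append(shape)
            added += 1

    # Rewrite the whole file so the header and line endings stay consistent.
    with path.open("w", newline="") as f:
        writer = csv.writer(f, lineterminator="\n")
        writer.writerow(HEADER)
        writer.writerows(rows)
    return added
```

For example, `add_shapes("aiter/configs/bf16_untuned_batched_gemm.csv", [(16, 128, 1536, 7168)])` adds the shape from the table if it is not already present.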


3. Start tuning: 
Run the following command to start tuning. Please wait a few minutes, as it first builds `batched_gemm_bf16_tune` via JIT:  
`python3 csrc/ck_batched_gemm_bf16/batched_gemm_bf16_tune.py -i aiter/configs/bf16_untuned_batched_gemm.csv -o aiter/configs/bf16_tuned_batched_gemm.csv`  
You can find the results of the tuning in `aiter/configs/bf16_tuned_batched_gemm.csv`.
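
To check what the tuner wrote for a given shape, the results file can be read with the standard `csv` module. This is only a sketch: it makes no assumption about the columns beyond a header row, and the `B`, `M`, `N`, `K` keys in the usage note are assumed to match the untuned file. `load_rows` and `filter_rows` are hypothetical helper names.

```python
import csv


def load_rows(csv_path):
    """Read a CSV with a header row into a list of dicts keyed by column name."""
    with open(csv_path, newline="") as f:
        return list(csv.DictReader(f))


def filter_rows(rows, **match):
    """Keep rows whose columns equal the given values (compared as strings)."""
    return [
        row
        for row in rows
        if all(str(row.get(key, "")).strip() == str(value) for key, value in match.items())
    ]
```

For example, `filter_rows(load_rows("aiter/configs/bf16_tuned_batched_gemm.csv"), B=16, M=128, N=1536, K=7168)` returns the tuned entries for the shape from step 2, provided the tuned file uses those column names.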

4. Build tuned kernels and test:
To test the performance, modify the test instance in `op_tests/test_batched_gemm_bf16.py` and run it. Please wait a few minutes, as it builds the batched_gemm_bf16 tuned kernels listed in `aiter/configs/bf16_tuned_batched_gemm.csv` via JIT:  
`python3 op_tests/test_batched_gemm_bf16.py`  
If you built batched_gemm_bf16 kernels before tuning new GEMM shapes, add `AITER_REBUILD=1` before your test command, for example `AITER_REBUILD=1 python3 op_tests/test_batched_gemm_bf16.py`. This rebuilds the kernels from `aiter/configs/bf16_tuned_batched_gemm.csv`.
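
If you drive the test from a script rather than a shell, the same environment variable can be set through `subprocess`. The sketch below assumes it is run from the aiter repository root; `build_test_command` and `run_test` are hypothetical names, and only the `AITER_REBUILD=1` variable and the test path come from this README.

```python
import os
import subprocess
import sys

TEST_SCRIPT = "op_tests/test_batched_gemm_bf16.py"


def build_test_command(rebuild=False):
    """Return (cmd, env) for the test run; set AITER_REBUILD=1 only on request."""
    env = dict(os.environ)
    if rebuild:
        env["AITER_REBUILD"] = "1"
    else:
        # Drop any inherited value so the run does not rebuild by accident.
        env.pop("AITER_REBUILD", None)
    # sys.executable is used here in place of a bare `python3`.
    return [sys.executable, TEST_SCRIPT], env


def run_test(rebuild=False):
    """Run the test script and raise CalledProcessError if it fails."""
    cmd, env = build_test_command(rebuild)
    return subprocess.run(cmd, env=env, check=True)
```

Calling `run_test(rebuild=True)` is then equivalent to the `AITER_REBUILD=1 python3 op_tests/test_batched_gemm_bf16.py` command above.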

## More
If you set the `PREBUILD_KERNELS=1` flag when installing aiter, the batched_gemm_bf16 kernels listed in the tuned GEMM CSV are built at install time by default. To use new results from batched_gemm_bf16_tune in that case, first remove `build` and `*.so` in `aiter/jit`, then re-install aiter after tuning has finished. This can take a long time and is not recommended.
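
The cleanup step can be sketched as below. It assumes `build` is a directory directly under `aiter/jit` and that the `*.so` files sit at the top level of that directory; `clean_jit_cache` is a hypothetical helper, and it defaults to a dry run so that nothing is deleted until you have checked the list.

```python
import shutil
from pathlib import Path


def clean_jit_cache(jit_dir, dry_run=True):
    """Collect `build` and top-level `*.so` under jit_dir; delete them unless dry_run."""
    jit = Path(jit_dir)
    targets = []

    build = jit / "build"
    if build.is_dir():
        targets.append(build)
    targets.extend(sorted(jit.glob("*.so")))  # top level only, not recursive

    if not dry_run:
        for target in targets:
            if target.is_dir():
                shutil.rmtree(target)
            else:
                target.unlink()
    return targets
```

Run `clean_jit_cache("aiter/jit")` first to see what would be removed, then `clean_jit_cache("aiter/jit", dry_run=False)` before re-installing aiter as in step 1.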