Add gridwise GEMM pipeline (#89)
* clean up
* add multiple thread scratch to ThreadwiseTensorSliceTransfer_v3r1
* add 2 stage prefetch
* add more sanity checks into transform_tensor_descriptor
* tweak
* enabling 2 stage prefetch to existing gridwise gemm; tweak
* enabling 2 stage prefetch to existing gridwise gemm
* move gridwise gemm pipeline into class; clean up
* add some irregular tile sizes
* update CalculateHasMainK0BlockLoop for multi-stage-prefetch
* refactor gridwise gemm pipeline class
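
For readers unfamiliar with the "2 stage prefetch" items above, here is a minimal host-side sketch of a multi-stage prefetch loop: a prologue fills the stage buffers, the main loop consumes one tile and refills the slot it just freed, and the tail drains what is left. It is not the code from this commit. The function `pipelined_tile_sum` and its parameters `tile` and `stages` are hypothetical names chosen for illustration, and a plain sum stands in for the GEMM math.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of composable_kernel.
// Sums `data` tile by tile while keeping up to `stages` tiles loaded ahead.
// Assumes data.size() is a multiple of `tile`; trailing elements are ignored.
long pipelined_tile_sum(const std::vector<int>& data, std::size_t tile, std::size_t stages)
{
    if(tile == 0 || stages == 0)
        return 0;

    const std::size_t num_tiles = data.size() / tile;

    // One buffer per prefetch stage; tile t always lives in slot t % stages.
    std::vector<std::vector<int>> buf(stages, std::vector<int>(tile));

    auto load = [&](std::size_t t) {
        for(std::size_t i = 0; i < tile; ++i)
            buf[t % stages][i] = data[t * tile + i];
    };

    // Prologue: prefetch the first `stages` tiles (or fewer if the input is short).
    for(std::size_t t = 0; t < stages && t < num_tiles; ++t)
        load(t);

    long acc = 0;
    for(std::size_t t = 0; t < num_tiles; ++t)
    {
        // Consume the current tile.
        for(int v : buf[t % stages])
            acc += v;

        // Refill the slot just consumed with the tile `stages` steps ahead.
        // Once no tile is left to load, the loop simply drains the buffers.
        if(t + stages < num_tiles)
            load(t + stages);
    }
    return acc;
}
```

Calling `pipelined_tile_sum(v, 2, 2)` on a vector holding 1 through 12 returns 78, the same as a plain sum. On a GPU the loads would be asynchronous copies into registers or LDS, so they could overlap with the math, which this sequential sketch does not model.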