Multiple changes to global kernel function.

* StorePartials works on an offset pointer.
* Read flags as uint32_t values.
* Accumulate partials only if there is more than one cooperating workgroup.
* Wait for the end of the reduction only when there is still work to do.
* Fix creation of the a/b grid descriptors in CheckArgument.
* LaunchKernel uses a preprocess lambda to set the flag values to zero.
* Add a condition in IsSupportedArgument to check whether xdl is supported.
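
A rough host-side sketch of two of the changes above: accumulating partials only when more than one workgroup cooperates, and zeroing the flags through a preprocess callback before launch. It is an illustration, not the actual kernel code. Every name in it (`TileWork`, `FinalizeTile`, `LaunchWithPreprocess`) is hypothetical, and it assumes one partial result per cooperating workgroup.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical host-side model; none of these names come from the library.
struct TileWork
{
    std::uint32_t cooperating_workgroups; // workgroups contributing to one output tile
    std::vector<float> partials;          // assumed: one partial result per workgroup
};

// Accumulate partials only when more than one workgroup contributed.
// A tile computed by a single workgroup already holds its final value.
float FinalizeTile(const TileWork& tile)
{
    assert(tile.cooperating_workgroups >= 1);
    assert(tile.partials.size() == tile.cooperating_workgroups);

    if(tile.cooperating_workgroups == 1)
    {
        return tile.partials[0]; // nothing to reduce
    }
    return std::accumulate(tile.partials.begin(), tile.partials.end(), 0.0f);
}

// Run a preprocess callback before the "kernel", so that every launch
// can start from flags reset to zero.
void LaunchWithPreprocess(std::vector<std::uint32_t>& flags,
                          const std::function<void()>& preprocess,
                          const std::function<void(std::vector<std::uint32_t>&)>& kernel)
{
    preprocess();
    kernel(flags);
}
```

Resetting the flags in a callback that runs before each launch keeps repeated launches (for example, in a timing loop) from observing stale flag values left by a previous run.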