"...include/device_reduce_instance_threadwise.hpp" did not exist on "12dfba3d03f402c051e2129fa21f33264f4d26e5"
Update to gemm_reduce and batched_gemm_reduce (#213)
* [Experimental] Change to gemm+reduce and batched-gemm+reduce
* Use threadwise-reduce function to improve the gridwise_gemm_reduce_xdl_cshuffle kernel
* Tiny fix in device_batched_gemm_xdl.hpp
* clang-format library/src/utility/conv_fwd_util.cpp