    Update to gemm_reduce and batched_gemm_reduce (#213) · c77ae65d
    Qianfeng authored
    * [Experimental] Change to gemm+reduce and batched-gemm+reduce
    
    * Use threadwise-reduce function to improve the gridwise_gemm_reduce_xdl_cshuffle kernel
    
    * Tiny fix in device_batched_gemm_xdl.hpp
    
    * clang-format library/src/utility/conv_fwd_util.cpp
profile_batched_gemm_reduce_impl.hpp 15 KB