Added LDS double buffering (along the C dimension) for implicit GEMM v1r3. As a result, it should achieve 90% of peak for all filter sizes on the CHWN format.
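
For readers unfamiliar with the technique, here is a minimal host-side sketch of the ping-pong pattern that double buffering along a reduction dimension relies on. It is not the kernel from this commit: `kCPerBlock`, `load_chunk`, and `dot_double_buffered` are hypothetical names, two small arrays stand in for the LDS buffers, and a plain dot product stands in for the GEMM. On a GPU the load into the next buffer would overlap with the math on the current one; here the two steps simply run one after the other.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical chunk size along the reduction (C) dimension.
constexpr std::size_t kCPerBlock = 4;

// Stand-in for a global-memory -> LDS copy of one C chunk.
// Elements past the end of src are padded with zeros.
void load_chunk(const std::vector<float>& src, std::size_t c_begin,
                std::array<float, kCPerBlock>& buf) {
    for (std::size_t i = 0; i < kCPerBlock; ++i) {
        const std::size_t c = c_begin + i;
        buf[i] = c < src.size() ? src[c] : 0.0f;
    }
}

// Dot product over C using two staging buffers per operand.
float dot_double_buffered(const std::vector<float>& in,
                          const std::vector<float>& wei) {
    const std::size_t C = in.size();
    std::array<float, kCPerBlock> in_buf[2];
    std::array<float, kCPerBlock> wei_buf[2];
    float acc = 0.0f;
    if (C == 0) return acc;

    // Prologue: preload the first chunk into buffer 0.
    load_chunk(in, 0, in_buf[0]);
    load_chunk(wei, 0, wei_buf[0]);

    int cur = 0;
    for (std::size_t c = 0; c < C; c += kCPerBlock) {
        const int nxt = 1 - cur;
        // Fetch the next chunk into the idle buffer, if there is one.
        if (c + kCPerBlock < C) {
            load_chunk(in, c + kCPerBlock, in_buf[nxt]);
            load_chunk(wei, c + kCPerBlock, wei_buf[nxt]);
        }
        // Compute on the chunk that was loaded in the previous step.
        for (std::size_t i = 0; i < kCPerBlock; ++i) {
            acc += in_buf[cur][i] * wei_buf[cur][i];
        }
        cur = nxt;  // swap the roles of the two buffers
    }
    return acc;
}
```

The point of the structure is that the buffer being filled is never the one being read, so on real hardware only one synchronization per iteration is needed and memory latency can hide behind the arithmetic.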