composable_kernel/include/utility/common_header.hpp · b8b2d0a6d1f6342686ee890eac64a1506d865452 · yangql / composable_kernel-1

Chao Liu authored Jul 04, 2021

* add threadwise copy that copies a tensor in one copy; add kpack to DL GEMM

* add kpack into fwd v4r5 nchw fp32

common_header.hpp 1014 Bytes