    DL GEMM fp32/fp16/int8 (#41) · b8b2d0a6
    Chao Liu authored
    * add threadwise copy that copies a tensor in one copy; add kpack to DL GEMM
    
    * add kpack into fwd v4r5 nchw fp32
common_header.hpp 1014 Bytes