    Add gridwise GEMM pipeline (#89) · 22d438ae
    Chao Liu authored
    * clean up
    
    * add multiple thread scratch to ThreadwiseTensorSliceTransfer_v3r1
    
    * add 2 stage prefetch
    
    * add more sanity checks to transform_tensor_descriptor
    
    * tweak
    
    * enabling 2 stage prefetch to existing gridwise gemm; tweak
    
    * enabling 2 stage prefetch to existing gridwise gemm
    
    * move gridwise gemm pipeline in class; clean up
    
    * add some irregular tile size
    
    * update CalculateHasMainK0BlockLoop for multi-stage-prefetch
    
    * refactor gridwise gemm pipeline class
gemm_xdl.cpp 16.4 KB