"megatron/vscode:/vscode.git/clone" did not exist on "89e8d27e4f749a6fe620798d45709bb050ed7abc"
  • Add batched/grouped_gemm contraction deviceOps (#349) · e08d68d2
    zjing14 authored
    
    
    * convnd_fwd fp16 example
    
    * update example
    
    * update example
    
    * update instance
    
    * updating reference conv
    
    * update reference conv
    
    * update conv fwd profiler
    
    * update conv 1d and 3d instance
    
    * update include path
    
    * clean
    
    * update profiler for conv bwd data and weight
    
    * update conv bwd weight
    
    * clean
    
    * update conv example
    
    * update profiler for conv bwd weight
    
    * update ckprofiler for conv bwd data
    
    * fix reference conv bwd data bug; update conv bwd data test
    
    * update examples
    
    * fix initialization issue
    
    * update test for conv fwd
    
    * clean
    
    * clean
    
    * remove test case too sensitive to error threshold
    
    * fix test
    
    * clean
    
    * fix build
    
    * adding conv multiple d
    
    * adding conv multiple D
    
    * add matrix padder
    
    * add gemm padding to convnd
    
    * adding group conv
    
    * update gemm multi-d
    
    * refactor
    
    * refactor
    
    * refactor
    
    * clean
    
    * clean
    
    * refactor
    
    * refactor
    
    * reorg
    
    * add ds
    
    * add bias
    
    * clean
    
    * add G
    
    * adding group
    
    * adding group
    
    * adding group
    
    * update Tensor
    
    * clean
    
    * update example
    
    * update DeviceGemmMultipleD_Xdl_CShuffle
    
    * update conv bwd-data and bwd-weight
    
    * update contraction example
    
    * update gemm and batch gemm with e permute
    
    * fix example build
    
    * instance for grouped conv1d
    
    * update example
    
    * adding group conv instance
    
    * update gemm bilinear instance
    
    * update gemm+add+add+fastgelu instance
    
    * update profiler
    
    * update profiler
    
    * update test
    
    * update test and client example
    
    * clean
    
    * add grouped conv into profiler
    
    * update profiler
    
    * clean
    
    * add test grouped conv, update all conv test to gtest
    
    * update test
    
    * change gemm_c_permute with contraction
    
    * add grouped_contraction
    
    * add contraction in group_gemm
    
    * add example of grouped_gemm with contraction
    
    * add example of grouped_contraction_bias_e_permute
    
    * clean
    
    * fixed ds
    
    * add m3n2 m2n3 examples into gemm_bias_e_permute
    Co-authored-by: Chao Liu <chao.liu2@amd.com>
CMakeLists.txt 1.75 KB