1. 29 Jul, 2022 1 commit
    • Clean up conv example, Instances, profiler and test (#324) · 500fa995
      Chao Liu authored
      * convnd_fwd fp16 example
      
      * update example
      
      * update example
      
      * update instance
      
      * updating reference conv
      
      * update reference conv
      
      * update conv fwd profiler
      
      * update conv 1d and 3d instance
      
      * update include path
      
      * clean
      
      * update profiler for conv bwd data and weight
      
      * update conv bwd weight
      
      * clean
      
      * update conv example
      
      * update profiler for conv bwd weight
      
      * update ckprofiler for conv bwd data
      
      * fix reference conv bwd data bug; update conv bwd data test
      
      * update examples
      
      * fix initialization issue
      
      * update test for conv fwd
      
      * clean
      
      * clean
      
      * remove test case too sensitive to error threshold
      
      * fix test
      
      * clean
      
      * fix build
      
      * adding conv multiple d
      
      * adding conv multiple D
      
      * add matrix padder
      
      * add gemm padding to convnd
      
      * adding group conv
      
      * update gemm multi-d
      
      * refactor
      
      * refactor
      
      * refactor
      
      * clean
      
      * clean
      
      * refactor
      
      * refactor
      
      * reorg
      
      * add ds
      
      * add bias
      
      * clean
      
      * add G
      
      * adding group
      
      * adding group
      
      * adding group
      
      * update Tensor
      
      * clean
      
      * update example
      
      * update DeviceGemmMultipleD_Xdl_CShuffle
      
      * update conv bwd-data and bwd-weight
      
      * update contraction example
      
      * update gemm and batch gemm with e permute
      
      * fix example build
      
      * instance for grouped conv1d
      
      * update example
      
      * adding group conv instance
      
      * update gemm bilinear instance
      
      * update gemm+add+add+fastgelu instance
      
      * update profiler
      
      * update profiler
      
      * update test
      
      * update test and client example
      
      * clean
      
      * add grouped conv into profiler
      
      * update profiler
      
      * clean
      
      * add test grouped conv, update all conv test to gtest
      
      * update test
2. 22 Mar, 2022 1 commit
    • Grouped GEMM for fp16 (#126) · 716f1c7f
      zjing14 authored
      * init of grouped_gemm
      
      * 2 gemm test
      
      * perf test
      
      * clean
      
      * wrap desc into a struct
      
      * test cast static_arr to pointer
      
      * add ptr to GemmDesc
      
      * add grouped gemm profiler
      
      * fixed mem issue with unique_ptr
      
      * clean
      
      * clean
      
      * finished ckprofiler
      
      * Update README.md
      
      * readme
      
      * fixed readme
      
      * add example
      
      * improve code
      
      * fixed comments: reserve, separate ptr and gemm_shapes
      
      * merge group and non-group
      
      * fixed comments: replace push_back with emplace_back to avoid copy constructor
      
      * fixed comments: unified blk2ctile; add test
      
      * ci fix
      
      * fixed ci
      
      * fixed ci
      
      * fixed ci