    Grouped GEMM for fp16 (#126) · 716f1c7f
    zjing14 authored
    * init of grouped_gemm
    
    * 2 gemm test
    
    * perf test
    
    * clean
    
    * wrap desc into a struct
    
    * test cast static_arr to pointer
    
    * add ptr to GemmDesc
    
    * add grouped gemm profiler
    
    * fixed mem issue with unique_ptr
    
    * clean
    
    * clean
    
    * finished ckprofiler
    
    * Update README.md
    
    * readme
    
    * fixed readme
    
    * add example
    
    * improve code
    
    * fixed comments: reserve, separate ptr and gemm_shapes
    
    * merge group and non-group
    
    * fixed comments: replace push_back with emplace_back to avoid copy constructor
    
    * fixed comments: unified blk2ctile; add test
    
    * ci fix
    
    * fixed ci
    
    * fixed ci
    
    * fixed ci
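
The review fixes listed above mention `reserve` and replacing `push_back` with `emplace_back` to avoid the copy constructor. The following is a minimal sketch of that pattern, not the code from this commit: `GemmShape`, `make_shapes`, and `g_copy_count` are hypothetical names introduced only for illustration, and the real `GemmDesc` layout may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustration only: counts how often the copy constructor runs.
int g_copy_count = 0;

// Hypothetical per-problem shape; not the GemmDesc struct from the commit.
struct GemmShape
{
    int M, N, K;

    GemmShape(int m, int n, int k) : M(m), N(n), K(k) {}

    // User-defined copy constructor so that copies become observable.
    GemmShape(const GemmShape& other) : M(other.M), N(other.N), K(other.K)
    {
        ++g_copy_count;
    }
};

// Build one shape per GEMM in the group.
std::vector<GemmShape> make_shapes(const std::vector<int>& Ms,
                                   const std::vector<int>& Ns,
                                   const std::vector<int>& Ks)
{
    assert(Ms.size() == Ns.size() && Ns.size() == Ks.size());

    std::vector<GemmShape> shapes;

    // reserve() allocates once up front, so the loop never reallocates
    // and never has to copy existing elements into a new buffer.
    shapes.reserve(Ms.size());

    for(std::size_t i = 0; i < Ms.size(); ++i)
    {
        // emplace_back() constructs the element in place from (M, N, K);
        // push_back(GemmShape(...)) would build a temporary and then copy it,
        // because this type has no move constructor.
        shapes.emplace_back(Ms[i], Ns[i], Ks[i]);
    }

    return shapes;
}
```

Calling `make_shapes({256, 512}, {128, 256}, {64, 128})` in this sketch builds two shapes and leaves `g_copy_count` at zero.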
grouped_gemm_fp16.cpp 7.26 KB