Add VectorType support into StaticBuffer (#27)
* init StaticBufferV2
* clean
* adopt old output stage for StaticBufferV2
* clean
* remove hack
* clean
* clean
* clean code
* move c_buffer alloc into blockwise gemm
* add adaptors for m/n_thread_data_on_grid
* adjust blockwise_gemm_xdlops
* reorder ops in GEMM hot loop
Co-authored-by: Chao Liu <chao.liu2@amd.com>