Large kernel for implicit gemm (#547)
* large kernel bwd & bwdI; increment RS not tested
* large kernel fix, no split_mask and increment rs
* large kernel fix2, no split_mask and increment rs
* reset benchmark.py
* fix merge
Co-authored-by: EvernightAurora <2465542858@qq.com>