Improve kernel code generation (#1285)
* Only run __syncthreads when there is data to preload
* Improve loops
* Add const attribute to improve optimizations
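
As a rough illustration of the kind of kernel these changes aim for, here is a hand-written sketch. It is not the generator's actual output: the names `scale`, `coeffs`, `preload_count` and `TILE` are hypothetical, and the grid-stride loop is just one common loop shape chosen for the example. The barrier is skipped when nothing is preloaded, which is only safe because `preload_count` is a kernel argument and therefore uniform across the block.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int TILE = 128;  // hypothetical block size; assumes preload_count <= TILE

// Hypothetical kernel: scales `in` by coefficients that may be preloaded
// into shared memory. All names here are assumptions for illustration.
__global__ void scale(const float* __restrict__ in,
                      const float* __restrict__ coeffs,
                      float* __restrict__ out,
                      const int n, const int preload_count)
{
    __shared__ float cache[TILE];

    // Only pay for the barrier when there is data to preload. The condition
    // is the same for every thread in the block, so no thread can skip a
    // barrier that another thread waits on.
    if (preload_count > 0) {
        for (int i = threadIdx.x; i < preload_count; i += blockDim.x)
            cache[i] = coeffs[i];
        __syncthreads();
    }

    // Grid-stride loop with the stride hoisted into a const local.
    const int stride = blockDim.x * gridDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride) {
        const float c = (preload_count > 0) ? cache[i % preload_count] : 1.0f;
        out[i] = in[i] * c;
    }
}

int main()
{
    const int n = 1024, preload_count = 4;
    float *in, *coeffs, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&coeffs, preload_count * sizeof(float));
    cudaMallocManaged(&out, n * sizeof(float));

    for (int i = 0; i < n; ++i) in[i] = 1.0f;
    for (int i = 0; i < preload_count; ++i) coeffs[i] = float(i + 1);

    scale<<<4, TILE>>>(in, coeffs, out, n, preload_count);
    cudaDeviceSynchronize();
    printf("out[5] = %f\n", out[5]);

    cudaFree(in); cudaFree(coeffs); cudaFree(out);
    return 0;
}
```

Marking the pointers `const ... __restrict__` and the scalars `const` gives the compiler more freedom to keep values in registers and to use read-only loads, which is presumably the motivation behind the third item.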