"...composable_kernel.git" did not exist on "6717168c18428c80fdd257c9ab9e619eeaa4ebbd"
Improve Reduction kernel api (#152)
* Add ThreadwiseReduction functor as the per-thread reduction API
* Use the ThreadwiseReduce API and adjust how the PartitionedBlockwiseReduction API is used, to simplify the kernels
* Add comments and remove unused declarations in the kernels
* Tiny updates