use __launch_bounds__ for multi_tensor_apply (#44)
use __launch_bounds__(1024) for multi_tensor_apply, re-enable skipped tests
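
For context, a minimal sketch of what a `__launch_bounds__(1024)` annotation looks like on a kernel. The kernel `scale_kernel` and the host code are hypothetical illustrations, not the actual multi_tensor_apply kernel touched by this commit. The qualifier promises the compiler that the kernel is never launched with more than 1024 threads per block, so it can cap per-thread register use accordingly; a common reason to add it is to avoid launch failures from excessive register use at large block sizes, though the commit message does not state its motivation.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration only (not the real multi_tensor_apply kernel).
// __launch_bounds__(1024) declares that this kernel is launched with at most
// 1024 threads per block, letting the compiler limit registers per thread
// so that a full 1024-thread block can be scheduled.
__global__ void __launch_bounds__(1024) scale_kernel(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

int main() {
    const int n = 1 << 20;
    float* d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemset(d, 0, n * sizeof(float));

    // The block size must not exceed the bound declared on the kernel.
    const int threads = 1024;
    const int blocks = (n + threads - 1) / threads;
    scale_kernel<<<blocks, threads>>>(d, 2.0f, n);

    // Check both the launch itself and the kernel's execution.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }
    printf("%s\n", cudaGetErrorString(err));

    cudaFree(d);
    return err == cudaSuccess ? 0 : 1;
}
```

Launching such a kernel with a block size above the declared bound is an error, so the host-side block size and the annotation need to be kept in agreement.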