Reduce CUDA driver API calls when choosing transpose kernels

Signed-off-by: Tim Moon <tmoon@nvidia.com>