Pass the CUDA stream into the CUTLASS GEMMs, to avoid future issues with CUDA graphs