Merge pull request #77 from rietmann-nv/mr/bwd-channel-permute-experiments
Optimized CUDA kernels for S2 Attention (forward and backward)