Merge pull request #77 from rietmann-nv/mr/bwd-channel-permute-experiments
Optimized CUDA kernels for S2 Attention (forward and backward)