[Perf][Kernel] Add faster topKperRow decode kernel for DeepSeek-V3.2 sparse attention (#33680)
Signed-off-by: LopezCastroRoberto <rocastro@redhat.com>
Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
csrc/topk.cu
0 → 100644