[Perf] Use FlashInfer RoPE for RotaryEmbedding.forward_cuda when available (#21126)
Signed-off-by:mgoin <mgoin64@gmail.com> Signed-off-by:
Michael Goin <mgoin64@gmail.com> Co-authored-by:
Luka Govedič <ProExpertProg@users.noreply.github.com>
Showing
Please register or sign in to comment