[Perf] Optimize rotary_emb implementation to use Triton operator for improved inference performance (#16457)

Signed-off-by: cynthieye <yexin93@qq.com>
Co-authored-by: MagnetoWang <magnetowang@outlook.com>