[2/2] Use moe_sum_reduce cuda kernel (#10654)
Co-authored-by: luoyuan.luo <luoyuan.luo@antgroup.com>
Co-authored-by: huangtingwei <141888744+huangtingwei9988@users.noreply.github.com>