[Perf] Optimize cutlass moe problem size calculation, 5.3% E2E Throughput...
[Perf] Optimize cutlass moe problem size calculation, 5.3% E2E Throughput improvement, 2.2% TTFT improvement (#31830) Signed-off-by:yewentao256 <zhyanwentao@126.com> Signed-off-by:
Wentao Ye <44945378+yewentao256@users.noreply.github.com> Co-authored-by:
Luka Govedič <ProExpertProg@users.noreply.github.com>
Showing
Please register or sign in to comment