[Update] Use FlashInfer fast_decode_plan directly instead of replication (#34687)
Signed-off-by: Andrii <askliar@nvidia.com>
Co-authored-by: Andrii <askliar@nvidia.com>