FA3 Spec Decoding to support top k = 1 and add cuda graph support (#5050)
Co-authored-by: Qingquan Song <ustcsqq@gmail.com>
Co-authored-by: Chunan Zeng <zcnrex@gmail.com>