[Spec Decode] Reduce TP communication for speculative decoding draft token generation (#34049)
Signed-off-by: qizixi <qizixi@meta.com>
Co-authored-by: Lu Fang <30275821+houseroad@users.noreply.github.com>