[Bug] Refactor max_num_batched_tokens to account for drafting (#34898)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>