-
Benjamin Chislett authored
[Perf] Refactor cudagraph_support to enable full CUDA graphs for spec decoding with FlashInfer (#28479) Signed-off-by:Benjamin Chislett <bchislett@nvidia.com>
30441957
[Perf] Refactor cudagraph_support to enable full CUDA graphs for spec decoding with FlashInfer (#28479)
Signed-off-by:
Benjamin Chislett <bchislett@nvidia.com>