[Logging] Improve log for when DeepEP HT disables CUDA Graphs (#25531)

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

Unverified commit fea80060, authored Sep 24, 2025 by Tyler Michael Smith, committed by GitHub Sep 24, 2025

Showing 1 changed file with 6 additions and 5 deletions

vllm/platforms/cuda.py +6 -5

--- a/vllm/platforms/cuda.py
+++ b/vllm/platforms/cuda.py
@@ -186,11 +186,12 @@ class CudaPlatformBase(Platform):
            # if torch compile cache key issue fixed
            # See https://github.com/vllm-project/vllm/pull/25093
            logger.info(
-                "Data Parallel: disabling cudagraphs since DP "
-                "with DeepEP high-throughput kernels are not CUDA Graph "
-                "compatible. The DeepEP low-latency kernels are CUDA Graph "
-                "compatible. Set the all_to_all backend to deepep_low_latency "
-                "to use those kernels instead.")
+                "WideEP: Disabling CUDA Graphs since DeepEP high-throughput "
+                "kernels are optimized for prefill and are incompatible with "
+                "CUDA Graphs. "
+                "In order to use CUDA Graphs for decode-optimized workloads, "
+                "set VLLM_ALL2ALL_BACKEND to another option, such as "
+                "deepep_low_latency, pplx, or allgather_reducescatter.")
            compilation_config.cudagraph_mode = CUDAGraphMode.NONE
    @classmethod
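
The new log message tells users which `VLLM_ALL2ALL_BACKEND` values keep CUDA Graphs available. As a rough illustration only, the standalone sketch below checks an environment mapping and returns a hint in the same spirit. The helper `cudagraph_hint` and the constant names are hypothetical and not part of vLLM. The identifier `deepep_high_throughput` is an assumption, since the diff names only the alternatives. The check in `cuda.py` may apply further conditions that this sketch ignores.

```python
from typing import Mapping, Optional

# Assumed identifier for the DeepEP high-throughput backend; the diff
# itself only names the alternatives listed below.
DEEPEP_HT = "deepep_high_throughput"

# Alternatives named in the new log message.
GRAPH_COMPATIBLE = ("deepep_low_latency", "pplx", "allgather_reducescatter")


def cudagraph_hint(env: Mapping[str, str]) -> Optional[str]:
    """Return a hint if the configured backend would disable CUDA Graphs.

    Hypothetical helper for illustration; it only inspects the
    VLLM_ALL2ALL_BACKEND entry of the given mapping.
    """
    backend = env.get("VLLM_ALL2ALL_BACKEND")
    if backend != DEEPEP_HT:
        return None
    options = ", ".join(GRAPH_COMPATIBLE)
    return f"{backend} disables CUDA Graphs; consider one of: {options}"


if __name__ == "__main__":
    import os

    # Inspect the current process environment.
    print(cudagraph_hint(os.environ))
```

Passing a mapping rather than reading `os.environ` inside the function keeps the sketch easy to exercise with different settings.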