Unverified Commit a9213c0f authored by Cyrus Leung's avatar Cyrus Leung Committed by GitHub
Browse files

[Doc] Fix outdated reference to CUDAGraphManager (#38209)


Signed-off-by: default avatarDarkLight1337 <tlleungac@connect.ust.hk>
parent 502c41a8
...@@ -13,11 +13,11 @@ Encoder CUDA Graphs eliminate this overhead by pre-capturing the full encoder fo ...@@ -13,11 +13,11 @@ Encoder CUDA Graphs eliminate this overhead by pre-capturing the full encoder fo
## Design ## Design
The encoder CUDA Graph system uses a **budget-based capture/replay** strategy, managed by [EncoderCudaGraphManager][vllm.v1.worker.gpu.mm.encoder_cudagraph.EncoderCudaGraphManager]. The system contains the following core components: The encoder CUDA Graph system uses a **budget-based capture/replay** strategy, managed by [EncoderCudaGraphManager][vllm.v1.worker.encoder_cudagraph.EncoderCudaGraphManager]. The system contains the following core components:
* [EncoderCudaGraphManager][vllm.v1.worker.gpu.mm.encoder_cudagraph.EncoderCudaGraphManager]: orchestrates capture, replay, greedy packing, and data-parallel execution for encoder CUDA Graphs. * [EncoderCudaGraphManager][vllm.v1.worker.encoder_cudagraph.EncoderCudaGraphManager]: orchestrates capture, replay, greedy packing, and data-parallel execution for encoder CUDA Graphs.
* [SupportsEncoderCudaGraph][vllm.model_executor.models.interfaces.SupportsEncoderCudaGraph]: a runtime-checkable protocol that models implement to opt-in to encoder CUDA Graphs. * [SupportsEncoderCudaGraph][vllm.model_executor.models.interfaces.SupportsEncoderCudaGraph]: a runtime-checkable protocol that models implement to opt-in to encoder CUDA Graphs.
* [BudgetGraphMetadata][vllm.v1.worker.gpu.mm.encoder_cudagraph.BudgetGraphMetadata]: holds the captured CUDA Graph and its associated I/O buffers for a single token budget level. * [BudgetGraphMetadata][vllm.v1.worker.encoder_cudagraph.BudgetGraphMetadata]: holds the captured CUDA Graph and its associated I/O buffers for a single token budget level.
### Budget-based graph capture ### Budget-based graph capture
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment