Commit e86221de authored by simone-dotolo, committed by GitHub

[Doc] Fix GPU Worker count in Process Count Summary (#36000)


Signed-off-by: simone-dotolo <simonedotolo@libero.it>
Signed-off-by: simone-dotolo <84937474+simone-dotolo@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
parent 289fc48a
@@ -122,7 +122,7 @@ For a deployment with `N` GPUs, `TP` tensor parallel size, `DP` data parallel si
|---|---|---|
| API Server | `A` (default `DP`) | Handles HTTP requests and input processing |
| Engine Core | `DP` (default 1) | Scheduler and KV cache management |
-| GPU Worker | `N` (= `DP x TP`) | One per GPU, executes model forward passes |
+| GPU Worker | `N` (= `DP x PP x TP`) | One per GPU, executes model forward passes |
| DP Coordinator | 1 if `DP > 1`, else 0 | Load balancing across DP ranks |
| **Total** | **`A + DP + N` (+ 1 if DP > 1)** | |