Commit 2fc57eb1 (unverified), authored by Yan Ru Pei, committed by GitHub

docs: update vllm mermaid diagram (#4320)


Signed-off-by: PeaBrane <yanrpei@gmail.com>
parent ce833983
@@ -86,18 +86,32 @@ This includes the specific commit [vllm-project/vllm#19790](https://github.com/v
 This figure shows an overview of the major components to deploy:
-```
-+------+      +-----------+      +------------------+             +---------------+
-| HTTP |----->| dynamo    |----->|   vLLM Worker    |------------>|  vLLM Prefill |
-|      |<-----| ingress   |<-----|                  |<------------|    Worker     |
-+------+      +-----------+      +------------------+             +---------------+
-                  |    ^                  |
-       query best |    | return           | publish kv events
-           worker |    | worker_id        v
-                  |    |         +------------------+
-                  |    +---------|     kv-router    |
-                  +------------->|                  |
-                                 +------------------+
-```
+```mermaid
+%%{init: {'theme':'base', 'themeVariables': { 'fontSize':'10px', 'primaryColor':'#2e8b57', 'primaryTextColor':'#fff', 'primaryBorderColor':'#333', 'lineColor':'#81b1db', 'secondaryColor':'#b35900', 'tertiaryColor':'#808080', 'edgeLabelBackground':'transparent'}}}%%
+graph TD
+    %% Node Definitions with custom shapes
+    HTTP[HTTP]
+    ROUTER[Router]
+    PREFILL[vLLM Prefill Worker]
+    DECODE[vLLM Decode Worker]
+
+    %% Class Definitions for color
+    classDef worker_style fill:#2e8b57,stroke:#333,stroke-width:2px,color:#fff;
+    classDef router_style fill:#b35900,stroke:#333,stroke-width:2px,color:#fff;
+    %% Applying classes to nodes
+    class PREFILL,DECODE worker_style
+    class ROUTER router_style
+    %% Request/Response flow
+    HTTP <--> |"request/response"| ROUTER
+    ROUTER --> |"1. send to prefill"| PREFILL
+    PREFILL --> |"2. return NIXL metadata"| ROUTER
+    ROUTER --> |"3. send with metadata"| DECODE
+    DECODE --> |"4. stream response"| ROUTER
+    %% KV Events publishing
+    PREFILL -.-> |"publish kv events"| ROUTER
+```
 Note: The above architecture illustrates all the components. The final components that get spawned depend upon the chosen deployment pattern.
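The numbered edges in the new diagram (send to prefill, return NIXL metadata, send with metadata, stream response) can be read as a simple call sequence. The sketch below is only an illustration of that ordering under stated assumptions: `Router`, `PrefillWorker`, `DecodeWorker`, and the metadata dictionary are hypothetical stand-ins invented for this example, not Dynamo or vLLM APIs, and no real KV transfer takes place.

```python
from typing import Dict, Iterator, List


class PrefillWorker:
    """Hypothetical stand-in for the 'vLLM Prefill Worker' node."""

    def __init__(self) -> None:
        self.published_events: List[str] = []

    def prefill(self, prompt: str) -> Dict[str, int]:
        num_tokens = len(prompt.split())
        # Dotted edge in the diagram: publish kv events toward the router.
        self.published_events.append(f"kv-stored:{num_tokens}-tokens")
        # Edge 2: return transfer metadata (a placeholder dict, not real NIXL data).
        return {"num_prefilled_tokens": num_tokens}


class DecodeWorker:
    """Hypothetical stand-in for the 'vLLM Decode Worker' node."""

    def decode(self, prompt: str, metadata: Dict[str, int]) -> Iterator[str]:
        # Edge 4: stream a response; this toy version echoes the prompt words.
        for word in prompt.split()[: metadata["num_prefilled_tokens"]]:
            yield word


class Router:
    """Hypothetical router that records the edge order shown in the diagram."""

    def __init__(self, prefill_worker: PrefillWorker, decode_worker: DecodeWorker) -> None:
        self.prefill_worker = prefill_worker
        self.decode_worker = decode_worker
        self.trace: List[str] = []

    def handle(self, prompt: str) -> Iterator[str]:
        self.trace.append("1. send to prefill")
        metadata = self.prefill_worker.prefill(prompt)
        self.trace.append("2. return NIXL metadata")
        self.trace.append("3. send with metadata")
        stream = self.decode_worker.decode(prompt, metadata)
        self.trace.append("4. stream response")
        yield from stream


if __name__ == "__main__":
    router = Router(PrefillWorker(), DecodeWorker())
    for token in router.handle("hello disaggregated world"):
        print(token)
    print(router.trace)
```

The router sits on every edge in the diagram, so the sketch keeps all sequencing inside `Router.handle` and leaves the workers unaware of each other.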