Commit e5fa8b62, authored by ishandhanani, committed by GitHub

docs(sglang): remove note about placeholder metrics implementation (#2256)

parent c86052c9
@@ -94,12 +94,6 @@ cd $DYNAMO_ROOT/components/backends/sglang
 ### Aggregated Serving with KV Routing
-> [!NOTE]
-> The current implementation of `components/backends/sglang/src/dynamo/sglang/worker/main.py` publishes _placeholder_ engine metrics to keep the Dynamo KV-router happy. Real-time metrics will be surfaced directly from the SGLang engine once the following pull requests are merged:
-> • Dynamo: [ai-dynamo/dynamo #1465](https://github.com/ai-dynamo/dynamo/pull/1465) – _feat: receive kvmetrics from sglang scheduler_.
->
-> After these are in, the TODOs in `main.py` will be resolved and the placeholder logic removed.
 ```bash
 cd $DYNAMO_ROOT/components/backends/sglang
 ./launch/agg_router.sh
......
@@ -86,7 +86,7 @@ if [ "$mode" = "prefill" ]; then
 SGLANG_USE_MESSAGE_QUEUE_BROADCASTER=0 \
 SGL_DISABLE_TP_MEMORY_INBALANCE_CHECK=1 \
 PYTHONUNBUFFERED=1 \
-python3 components/worker.py \
+python3 -m dynamo.sglang.worker \
 --served-model-name deepseek-ai/DeepSeek-R1 \
 --model-path /model/ \
 --skip-tokenizer-init \
@@ -188,7 +188,7 @@ elif [ "$mode" = "decode" ]; then
 SGLANG_USE_MESSAGE_QUEUE_BROADCASTER=0 \
 SGL_DISABLE_TP_MEMORY_INBALANCE_CHECK=1 \
 PYTHONUNBUFFERED=1 \
-python3 components/decode_worker.py \
+python3 -m dynamo.sglang.decode_worker \
 --served-model-name deepseek-ai/DeepSeek-R1 \
 --model-path /model/ \
 --skip-tokenizer-init \
......
@@ -70,7 +70,7 @@ fi
 if [ "$mode" = "prefill" ]; then
 if [ "$cmd" = "dynamo" ]; then
 # H100 dynamo prefill command
-python3 components/worker.py \
+python3 -m dynamo.sglang.worker \
 --model-path /model/ \
 --served-model-name deepseek-ai/DeepSeek-R1 \
 --skip-tokenizer-init \
@@ -131,7 +131,7 @@ if [ "$mode" = "prefill" ]; then
 elif [ "$mode" = "decode" ]; then
 if [ "$cmd" = "dynamo" ]; then
 # H100 dynamo decode command
-python3 components/decode_worker.py \
+python3 -m dynamo.sglang.decode_worker \
 --model-path /model/ \
 --served-model-name deepseek-ai/DeepSeek-R1 \
 --skip-tokenizer-init \
......