docs: removed all TODOs from public facing docs within repo (#3540)

Signed-off-by: Maksim Khadkevich <mkhadkevich@nvidia.com>

Unverified commit 60ba7b25, authored Oct 13, 2025 by Maksim Khadkevich; committed by GitHub Oct 13, 2025

Showing 2 changed files with 0 additions and 5 deletions

deploy/metrics/README.md +0 -2

docs/backends/vllm/multi-node.md +0 -3

--- a/deploy/metrics/README.md
+++ b/deploy/metrics/README.md
@@ -163,8 +163,6 @@ $ python -m dynamo.vllm --model Qwen/Qwen3-0.6B  \
 - **HTTP Queue**: Measures queuing time before processing begins (including prefill time)
 - **HTTP Queue ≤ Inflight** (HTTP queue is a subset of inflight time)

-¹ **TODO**: Implement the "actual" HTTP queue metric that tracks from request start until first token generation begins, rather than the current implementation that tracks until first token is received by the frontend
-
 ### Required Files

 The following configuration files should be present in this directory:

--- a/docs/backends/vllm/multi-node.md
+++ b/docs/backends/vllm/multi-node.md
@@ -95,9 +95,6 @@ python -m dynamo.vllm \
  --is-prefill-worker
 ```

-
-## TODO
-
 ## Large Model Deployment

 For models requiring more GPUs than available on a single node such as tensor-parallel-size 16: