[Misc] Remove dangling references to `--use-v2-block-manager` (#13492)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

Commit 00b69c2d, authored Feb 19, 2025 by Harry Mellor; committed by GitHub Feb 19, 2025
Showing 2 changed files with 2 additions and 3 deletions

.buildkite/nightly-benchmarks/tests/serving-tests.json +1 -2

docs/source/features/spec_decode.md +1 -1

--- a/.buildkite/nightly-benchmarks/tests/serving-tests.json
+++ b/.buildkite/nightly-benchmarks/tests/serving-tests.json
@@ -66,8 +66,7 @@
            "swap_space": 16, 
            "speculative_model": "turboderp/Qwama-0.5B-Instruct",
            "num_speculative_tokens": 4,
-            "speculative_draft_tensor_parallel_size": 1,
+            "speculative_draft_tensor_parallel_size": 1
-            "use_v2_block_manager": ""
        },
        "client_parameters": {
            "model": "meta-llama/Meta-Llama-3.1-70B-Instruct",

--- a/docs/source/features/spec_decode.md
+++ b/docs/source/features/spec_decode.md
@@ -45,7 +45,7 @@ To perform the same with an online mode launch the server:
 ```bash
 python -m vllm.entrypoints.openai.api_server --host 0.0.0.0 --port 8000 --model facebook/opt-6.7b \
-    --seed 42 -tp 1 --speculative_model facebook/opt-125m --use-v2-block-manager \
+    --seed 42 -tp 1 --speculative_model facebook/opt-125m \
    --num_speculative_tokens 5 --gpu_memory_utilization 0.8
 ```
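
The JSON hunk above has to drop both the `use_v2_block_manager` entry and the trailing comma on the line before it, since JSON does not allow a comma after the last key. A minimal sketch of doing the same cleanup programmatically, so the comma is handled by the serializer rather than by hand, might look like the following. The helper `strip_key` and the `server_parameters` wrapper are illustrative assumptions, not part of this commit, and the sample data only mirrors the keys visible in the diff.

```python
import json


def strip_key(node, key):
    """Recursively drop `key` from every dict inside a parsed JSON value.

    Hypothetical helper for illustration; not part of vLLM or this commit.
    """
    if isinstance(node, dict):
        return {k: strip_key(v, key) for k, v in node.items() if k != key}
    if isinstance(node, list):
        return [strip_key(item, key) for item in node]
    return node


# Assumed excerpt shaped like the pre-change test entry; the
# "server_parameters" name is a guess at the enclosing object.
before = json.loads("""
[{"server_parameters": {
    "num_speculative_tokens": 4,
    "speculative_draft_tensor_parallel_size": 1,
    "use_v2_block_manager": ""
}}]
""")

after = strip_key(before, "use_v2_block_manager")

# Re-serializing emits valid JSON with no trailing comma.
print(json.dumps(after, indent=4))
```

Round-tripping through `json.loads` and `json.dumps` normalizes whitespace and key layout, so a hand edit like the one in this commit keeps the diff smaller when only one key needs to go.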