Unverified commit bbeb2808, authored by Alec, committed by GitHub

fix(vllm): warn that stream interval is not respected for now (#4650)


Signed-off-by: alec-flowers <aflowers@nvidia.com>
Signed-off-by: Alec <35311602+alec-flowers@users.noreply.github.com>
parent f26dbd09
@@ -225,6 +225,12 @@ def parse_args() -> Config:
     args.enable_local_indexer = str(args.enable_local_indexer).lower() == "true"
     engine_args = AsyncEngineArgs.from_cli_args(args)
+    if hasattr(engine_args, "stream_interval") and engine_args.stream_interval != 1:
+        logger.warning(
+            "--stream-interval is currently not respected in Dynamo. "
+            "Dynamo uses its own post-processing implementation on the frontend, "
+            "bypassing vLLM's OutputProcessor buffering. "
+        )
     # Workaround for vLLM GIL contention bug with NIXL connector when using UniProcExecutor.
     # With TP=1, vLLM defaults to UniProcExecutor which runs scheduler and worker in the same
     # process. This causes a hot loop in _process_engine_step that doesn't release the GIL,
......
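
The added check can be exercised in isolation. The following is a minimal sketch, not the code in the repository: the helper name `warn_if_stream_interval_ignored` and the logger name are hypothetical, and `SimpleNamespace` stands in for vLLM's `AsyncEngineArgs` so the snippet runs without vLLM installed. The `hasattr` guard presumably covers vLLM versions whose engine args lack a `stream_interval` field; that reading is an assumption.

```python
import logging
from types import SimpleNamespace

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("stream_interval_sketch")


def warn_if_stream_interval_ignored(engine_args) -> bool:
    """Log a warning and return True when a non-default stream_interval is set.

    `engine_args` is any object that may carry a `stream_interval` attribute;
    the value 1 is treated as the default, mirroring the check in the diff.
    """
    if hasattr(engine_args, "stream_interval") and engine_args.stream_interval != 1:
        logger.warning(
            "--stream-interval is currently not respected in Dynamo. "
            "Dynamo uses its own post-processing implementation on the frontend, "
            "bypassing vLLM's OutputProcessor buffering."
        )
        return True
    return False


# Stand-ins for parsed engine args (not real AsyncEngineArgs instances).
non_default = SimpleNamespace(stream_interval=4)  # triggers the warning
default = SimpleNamespace(stream_interval=1)      # silent: default value
missing = SimpleNamespace()                       # silent: attribute absent

results = [
    warn_if_stream_interval_ignored(non_default),
    warn_if_stream_interval_ignored(default),
    warn_if_stream_interval_ignored(missing),
]
```

Returning a boolean is a choice made for this sketch so the behavior is easy to check; the committed code only logs.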