Commit 24b65eff (unverified), authored by Chen Zhang, committed by GitHub

[BugFix] Spec decode with VLLM_ENABLE_V1_MULTIPROCESSING=0 (#30319)


Signed-off-by: Chen Zhang <zhangch99@outlook.com>
parent 41b6f920
@@ -268,7 +268,8 @@ class InprocClient(EngineCoreClient):
         self.engine_core = EngineCore(*args, **kwargs)
 
     def get_output(self) -> EngineCoreOutputs:
-        outputs, _ = self.engine_core.step_fn()
+        outputs, model_executed = self.engine_core.step_fn()
+        self.engine_core.post_step(model_executed=model_executed)
         return outputs and outputs.get(0) or EngineCoreOutputs()
 
     def get_supported_tasks(self) -> tuple[SupportedTask, ...]:
......
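The hunk above stops discarding the second value returned by step_fn() and passes it to post_step(). The following is a minimal sketch of why that matters, assuming post_step() is where per-step follow-up work happens (for speculative decoding, presumably handing freshly drafted tokens to the scheduler). ToyEngineCore, its fields, and both helper functions are hypothetical stand-ins for illustration, not vLLM code.

```python
class ToyEngineCore:
    """Hypothetical stand-in for an engine core; not vLLM's EngineCore."""

    def __init__(self) -> None:
        self.steps = 0
        # Draft tokens produced during the most recent step.
        self.pending_drafts: list[int] = []
        # Draft tokens the (imaginary) scheduler will see on the next step.
        self.scheduled_drafts: list[int] = []

    def step_fn(self) -> tuple[dict[int, str], bool]:
        """Run one step; return (outputs, model_executed)."""
        self.steps += 1
        self.pending_drafts = [self.steps * 10, self.steps * 10 + 1]
        return {0: f"output-{self.steps}"}, True

    def post_step(self, model_executed: bool) -> None:
        """Publish draft tokens, but only if the model actually ran."""
        if model_executed:
            self.scheduled_drafts = self.pending_drafts
            self.pending_drafts = []


def get_output_before(core: ToyEngineCore) -> str:
    # Mirrors the removed line: the flag is dropped, post_step never runs.
    outputs, _ = core.step_fn()
    return outputs and outputs.get(0) or "empty"


def get_output_after(core: ToyEngineCore) -> str:
    # Mirrors the added lines: the flag is forwarded to post_step.
    outputs, model_executed = core.step_fn()
    core.post_step(model_executed=model_executed)
    return outputs and outputs.get(0) or "empty"


old_core, new_core = ToyEngineCore(), ToyEngineCore()
old_result = get_output_before(old_core)
new_result = get_output_after(new_core)
```

In this toy, both helpers return the same output for the step, but only the second leaves draft tokens where the next step can find them. That matches the shape of the bug named in the commit title: the in-process path (VLLM_ENABLE_V1_MULTIPROCESSING=0) returned outputs normally while skipping the post-step hook.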