"vllm/executor/ray_xpu_executor.py" did not exist on "973617ae02a4e8e6190674cf1cdb0c0803b65ae6"
Unverified commit 4a5299c9, authored by Tomas Ruiz and committed by GitHub

feat: spec decode with draft models (#24322)


Signed-off-by: Tomas Ruiz <tomas.ruiz.te@gmail.com>
parent 73f2a81c
@@ -352,7 +352,7 @@ def bind_kv_cache(
                 pass
             else:
                 raise NotImplementedError
-        layer_name = layer_names[0]
-        runner_kv_caches.append(kv_caches[layer_name])
+        for layer_name in layer_names:
+            runner_kv_caches.append(kv_caches[layer_name])
 
     # Bind kv_caches to forward context
......
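
Reviewer note: the hunk above changes which KV caches end up in `runner_kv_caches` when several layer names map to one layer index. The sketch below is a simplified stand-in, not the actual `bind_kv_cache` implementation. The helper names, the `index2name` mapping, and the layer-name strings are hypothetical, and the idea that a draft model and a target model can share a layer index is an assumption drawn from the commit title.

```python
def collect_first_only(index2name, kv_caches):
    """Old behavior (sketch): keep only the first layer name per layer index."""
    runner_kv_caches = []
    for layer_index in sorted(index2name):
        layer_name = index2name[layer_index][0]
        runner_kv_caches.append(kv_caches[layer_name])
    return runner_kv_caches


def collect_all(index2name, kv_caches):
    """New behavior (sketch): keep every layer name that shares a layer index."""
    runner_kv_caches = []
    for layer_index in sorted(index2name):
        for layer_name in index2name[layer_index]:
            runner_kv_caches.append(kv_caches[layer_name])
    return runner_kv_caches


# Hypothetical layout: two layers share index 0 (assumed target + draft model).
index2name = {
    0: ["target.layers.0.attn", "draft.layers.0.attn"],
    1: ["target.layers.1.attn"],
}
# Placeholder strings stand in for the real cache tensors.
kv_caches = {
    name: f"cache<{name}>"
    for names in index2name.values()
    for name in names
}

old_result = collect_first_only(index2name, kv_caches)
new_result = collect_all(index2name, kv_caches)
```

Under these assumptions the old loop silently drops the second cache at index 0, while the new loop appends it. As a side effect, `runner_kv_caches` is no longer guaranteed to hold exactly one entry per layer index, which callers that index it by layer may need to account for.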