Unverified commit 2094062b, authored by youkaichao, committed by GitHub

[4.5/N] bugfix for quant config in speculative decode (#10007)


Signed-off-by: youkaichao <youkaichao@gmail.com>
parent d93478b3
...
@@ -61,6 +61,10 @@ def create_spec_worker(*args, **kwargs) -> "SpecDecodeWorker":
     draft_worker_config = copy.deepcopy(vllm_config)
     draft_worker_config.model_config = speculative_config.draft_model_config
+    draft_worker_config.quant_config = VllmConfig._get_quantization_config(
+        draft_worker_config.model_config,
+        vllm_config.load_config,
+    )
     draft_worker_config.parallel_config = speculative_config.draft_parallel_config  # noqa
     # TODO allow draft-model specific load config.
......
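
Note on the change: the hunk suggests that a deep-copied config keeps the target model's quant config unless it is recomputed after the draft model config is swapped in. The following is a minimal sketch of that failure mode, not vLLM code. `ModelConfig`, `Config`, and `derive_quant_config` are hypothetical stand-ins for the real classes and for `VllmConfig._get_quantization_config`, and the string values are invented for illustration.

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelConfig:
    # Hypothetical stand-in, not vLLM's ModelConfig.
    name: str
    quantization: Optional[str] = None


@dataclass
class Config:
    # Hypothetical stand-in for VllmConfig.
    model_config: ModelConfig
    quant_config: Optional[str] = None


def derive_quant_config(model_config: ModelConfig) -> Optional[str]:
    # Hypothetical helper playing the role of _get_quantization_config:
    # the quant config is a function of the model config it is built for.
    if model_config.quantization is None:
        return None
    return f"{model_config.quantization}-config"


target = ModelConfig("target", quantization="fp8")
draft = ModelConfig("draft", quantization=None)
vllm_config = Config(target, derive_quant_config(target))

# Without the fix: the copy swaps the model config but keeps the
# quant config that was derived from the target model.
stale = copy.deepcopy(vllm_config)
stale.model_config = draft

# With the fix: recompute the quant config from the draft model config.
fixed = copy.deepcopy(vllm_config)
fixed.model_config = draft
fixed.quant_config = derive_quant_config(fixed.model_config)
```

In this sketch `stale` still carries the target's quant config even though its model is the unquantized draft, while `fixed` matches the draft model. The original config is untouched in both cases because the edits are made on a deep copy.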