Unverified commit 638a872d, authored by Yuxiang Liang, committed by GitHub

fix(xpu): Re-compute compile ranges after platform-specific config updates (#37523)


Signed-off-by: Yuxiang Liang <yuxiang.liang@intel.com>
Signed-off-by: Yuxiang Liang <yuliang@habana.ai>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
parent 9040151f
@@ -985,8 +985,6 @@ class VllmConfig:
"--kv-sharing-fast-prefill requires changes on model side for "
"correctness and to realize prefill savings."
)
-# TODO: Move after https://github.com/vllm-project/vllm/pull/26847 lands
-self._set_compile_ranges()
if (
self.model_config
@@ -1022,6 +1020,10 @@ class VllmConfig:
)
current_platform.check_and_update_config(self)
+# Re-compute compile ranges after platform-specific config updates
+# (e.g., XPU may lower max_num_batched_tokens when MLA is enabled)
+self._set_compile_ranges()
# Do this after all the updates to compilation_config.mode
effective_dp_size = (
self.parallel_config.data_parallel_size
......
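
The ordering issue this commit addresses can be illustrated with a minimal, self-contained sketch. Everything here is hypothetical: `ToyConfig`, `set_compile_ranges`, `platform_update`, and the numeric values are stand-ins for illustration and do not reproduce vLLM's `VllmConfig`, `_set_compile_ranges`, or `check_and_update_config`. The sketch only assumes that the ranges are derived from `max_num_batched_tokens`, as the comment in the diff suggests.

```python
from dataclasses import dataclass, field


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a config object; not vLLM's VllmConfig."""

    max_num_batched_tokens: int = 8192
    compile_ranges: list[tuple[int, int]] = field(default_factory=list)

    def set_compile_ranges(self, split_points: list[int]) -> None:
        # Derive contiguous ranges capped at the current token budget.
        bounds = sorted(p for p in split_points if p < self.max_num_batched_tokens)
        bounds.append(self.max_num_batched_tokens)
        start = 1
        self.compile_ranges = []
        for end in bounds:
            self.compile_ranges.append((start, end))
            start = end + 1


def platform_update(cfg: ToyConfig) -> None:
    # Hypothetical platform hook that lowers the token budget.
    cfg.max_num_batched_tokens = 4096


def build(order: str) -> ToyConfig:
    cfg = ToyConfig()
    if order == "before":
        # Old ordering: ranges are derived, then the budget changes under them.
        cfg.set_compile_ranges([1024, 6000])
        platform_update(cfg)
    else:
        # New ordering: the platform hook runs first, then ranges are derived.
        platform_update(cfg)
        cfg.set_compile_ranges([1024, 6000])
    return cfg
```

With the "before" ordering, the last range still ends at the old budget even though `max_num_batched_tokens` has since been lowered. With the "after" ordering, the ranges agree with the final value.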