OpenDAS / vllm_cscc
Commit a4e2b268, authored Jan 08, 2025 by Jie Fu (傅杰); committed by GitHub Jan 07, 2025
[Bugfix] Significant performance drop on CPUs with --num-scheduler-steps > 1 (#11794)
Parent: 973f5dc5

Showing 1 changed file with 6 additions and 0 deletions.

vllm/engine/arg_utils.py (+6, -0)
...
@@ -1157,6 +1157,12 @@ class EngineArgs:
             if self.enable_chunked_prefill and self.pipeline_parallel_size > 1:
                 raise ValueError("Multi-Step Chunked-Prefill is not supported "
                                  "for pipeline-parallel-size > 1")
+            from vllm.platforms import current_platform
+            if current_platform.is_cpu():
+                logger.warning("Multi-Step (--num-scheduler-steps > 1) is "
+                               "currently not supported for CPUs and has been "
+                               "disabled.")
+                self.num_scheduler_steps = 1
 
         # make sure num_lookahead_slots is set the higher value depending on
         # if we are using speculative decoding or multi-step
...
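
The effect of the added guard can be sketched in isolation. This is a minimal, hedged illustration rather than the vLLM implementation: `SketchEngineArgs` and `resolve_scheduler_steps` are hypothetical names, the `on_cpu` flag stands in for `current_platform.is_cpu()`, and the enclosing `num_scheduler_steps > 1` check is an assumption inferred from the warning text, since the diff does not show it.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("multi_step_sketch")


@dataclass
class SketchEngineArgs:
    """Hypothetical stand-in for EngineArgs; holds only the fields the guard reads."""
    num_scheduler_steps: int = 1
    enable_chunked_prefill: bool = False
    pipeline_parallel_size: int = 1

    def resolve_scheduler_steps(self, on_cpu: bool) -> int:
        """Apply the multi-step checks and return the effective step count.

        `on_cpu` replaces the real current_platform.is_cpu() call so the
        sketch runs without vLLM installed.
        """
        # Assumption: the checks only apply when multi-step is requested.
        if self.num_scheduler_steps > 1:
            if self.enable_chunked_prefill and self.pipeline_parallel_size > 1:
                raise ValueError("Multi-Step Chunked-Prefill is not supported "
                                 "for pipeline-parallel-size > 1")
            if on_cpu:
                # Mirror the commit: warn, then fall back to single-step.
                logger.warning("Multi-Step (--num-scheduler-steps > 1) is "
                               "currently not supported for CPUs and has been "
                               "disabled.")
                self.num_scheduler_steps = 1
        return self.num_scheduler_steps


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    cpu_args = SketchEngineArgs(num_scheduler_steps=8)
    print(cpu_args.resolve_scheduler_steps(on_cpu=True))
    other_args = SketchEngineArgs(num_scheduler_steps=8)
    print(other_args.resolve_scheduler_steps(on_cpu=False))
```

Overriding the value with a warning, instead of raising as the pipeline-parallel check does, means an existing command line that passes `--num-scheduler-steps` keeps working on CPU hosts and simply runs single-step.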