OpenDAS / vllm_cscc
Commit 6dd94dbe (Unverified)
Authored Jan 24, 2025 by youkaichao; committed by GitHub on Jan 24, 2025
[perf] fix perf regression from #12253 (#12380)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Parent: 0e74d797
Showing 1 changed file with 4 additions and 1 deletion

vllm/worker/model_runner.py (+4, -1)
@@ -455,7 +455,6 @@ class ModelInputForGPUBuilder(ModelRunnerInputBuilderBase[ModelInputForGPU]):
         self.enable_prompt_adapter = (self.runner.prompt_adapter_config
                                       is not None)
         self.multi_modal_input_mapper = self.runner.multi_modal_input_mapper
-        self.decode_only = True
 
         # Attention metadata inputs.
         if self.attn_backend is not None:
@@ -477,6 +476,10 @@ class ModelInputForGPUBuilder(ModelRunnerInputBuilderBase[ModelInputForGPU]):
                 finished_requests_ids: Optional[List[str]] = None) -> None:
         self.finished_requests_ids = finished_requests_ids
 
+        # if the current batch is decode-only.
+        # will be set to False if there is any non-decode request.
+        self.decode_only = True
+
         # Intermediate data (data in CPU before going to GPU) for
         # the current sequence group.
         self.inter_data_list: List[
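
The change above moves the `self.decode_only = True` assignment out of the earlier hunk and into the per-batch setup code. A minimal sketch of why that placement can matter, assuming the builder object is reused across batches (an assumption; the diff itself does not say so). The class names `ResettingBuilder` and `StaleBuilder` and the helper `run_batches` are hypothetical and are not part of vLLM:

```python
from typing import List


class ResettingBuilder:
    """Hypothetical builder that resets its flag at the start of every batch."""

    def __init__(self) -> None:
        self.decode_only = True

    def prepare(self) -> None:
        # Per-batch reset: assume decode-only until a prompt request arrives.
        self.decode_only = True

    def add_request(self, is_prompt: bool) -> None:
        # Any non-decode (prompt) request clears the flag for this batch.
        if is_prompt:
            self.decode_only = False


class StaleBuilder(ResettingBuilder):
    """Hypothetical variant that sets the flag only once, in __init__."""

    def prepare(self) -> None:
        # No reset here, so a False value leaks into later batches.
        pass


def run_batches(builder: ResettingBuilder,
                batches: List[List[bool]]) -> List[bool]:
    """Reuse one builder for all batches; record decode_only after each."""
    flags = []
    for batch in batches:
        builder.prepare()
        for is_prompt in batch:
            builder.add_request(is_prompt)
        flags.append(builder.decode_only)
    return flags


# First batch mixes a prompt with a decode; second batch is decode-only.
batches = [[True, False], [False, False]]
resetting_flags = run_batches(ResettingBuilder(), batches)
stale_flags = run_batches(StaleBuilder(), batches)
print(resetting_flags)  # [False, True]
print(stale_flags)      # [False, False]
```

In this sketch the stale variant keeps reporting a non-decode-only batch after the first prompt request, so any fast path gated on the flag would be skipped for later decode-only batches.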