OpenDAS / vllm_cscc
Commit e18464a5
Authored Jan 09, 2026 by Wentao Ye
Committed by GitHub, Jan 10, 2026
[Perf] Optimize async scheduling placeholder using empty (#32056)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
parent 1963245e
Showing 1 changed file with 4 additions and 1 deletion
vllm/v1/engine/output_processor.py
...
@@ -31,6 +31,9 @@ from vllm.v1.metrics.stats import (
     SchedulerStats,
 )
 
+# shared empty CPU tensor used as a placeholder pooling output
+EMPTY_CPU_TENSOR = torch.empty(0, device="cpu")
+
 
 class RequestOutputCollector:
     """
...
@@ -426,7 +429,7 @@ class OutputProcessor:
     new_token_ids=[],
     # Set pooling_output is not None to
     # correctly enter the abort pooling branch
-    pooling_output=torch.randn(0, device="cpu")
+    pooling_output=EMPTY_CPU_TENSOR
     if req_state.detokenizer is None
     else None,
     finish_reason=FinishReason.ABORT,
...
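The change above can be illustrated in isolation. The following is a minimal sketch, not the vLLM code itself: `make_abort_placeholder` and its `is_pooling_request` parameter are hypothetical names standing in for the `req_state.detokenizer is None` check in the diff. It assumes only that the placeholder needs to be a non-None, zero-element CPU tensor, which is what the diff's comment suggests.

```python
import torch

# Allocated once at import time, mirroring the module-level constant in the diff.
EMPTY_CPU_TENSOR = torch.empty(0, device="cpu")


def make_abort_placeholder(is_pooling_request: bool):
    """Hypothetical helper: return a non-None placeholder for pooling
    requests and None otherwise, as the conditional in the diff does."""
    return EMPTY_CPU_TENSOR if is_pooling_request else None


# Every pooling abort receives the same tensor object, so no tensor is
# allocated per call.
first = make_abort_placeholder(True)
second = make_abort_placeholder(True)
print(first is second)

# The placeholder has the same shape as the previous torch.randn(0, device="cpu").
print(first.shape == torch.randn(0, device="cpu").shape)
```

Because the tensor has zero elements, `torch.empty` leaves nothing uninitialized to read, and the random number generator is never invoked. Sharing one instance is only safe if consumers treat the placeholder as read-only, which is an assumption of this sketch rather than something the commit states.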