OpenDAS / vllm_cscc
Commit 085b7b2d
Authored Feb 14, 2025 by Nick Hill
Committed by GitHub, Feb 14, 2025

[V1] Simplify GPUModelRunner._update_states check (#13265)

Parent: 4da1f667

Showing 1 changed file with 4 additions and 2 deletions

vllm/v1/worker/gpu_model_runner.py (+4 -2)

@@ -347,6 +347,8 @@ class GPUModelRunner(LoRAModelRunnerMixin):
             self.input_batch.block_table.append_row(req_index, start_index,
                                                     req_data.new_block_ids)
 
+        batch_changed = len(removed_req_indices) > 0 or len(req_ids_to_add) > 0
+
         # Add the new or resumed requests to the persistent batch.
         # The smaller empty indices are filled first.
         removed_req_indices = sorted(removed_req_indices, reverse=True)

@@ -363,8 +365,8 @@ class GPUModelRunner(LoRAModelRunnerMixin):
         # Condense the batched states if there are empty indices.
         if removed_req_indices:
             self.input_batch.condense(removed_req_indices)
-        return (len(unscheduled_req_ids) > 0 or len(req_ids_to_add) > 0
-                or len(scheduler_output.finished_req_ids) > 0)
+
+        return batch_changed
 
     def _prepare_inputs(self, scheduler_output: "SchedulerOutput"):
         total_num_scheduled_tokens = scheduler_output.total_num_scheduled_tokens
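
The control flow touched by this change can be sketched in isolation: the "batch changed" flag is computed once from the removed indices and the requests to add, freed slots are refilled smallest index first, and the flag is returned at the end. This is a minimal sketch, not the vLLM implementation. `ToyBatch`, `update_states`, and the list-of-slots layout are hypothetical names introduced for illustration, and the sketch assumes that finished requests are the only source of removals and that free indices are taken with `pop()` from the reverse-sorted list.

```python
from typing import List, Optional


class ToyBatch:
    """Hypothetical stand-in for a persistent batch (not vLLM's InputBatch)."""

    def __init__(self, req_ids: List[str]) -> None:
        self.slots: List[Optional[str]] = list(req_ids)

    def remove(self, req_id: str) -> int:
        # Free the slot and report which index became empty.
        idx = self.slots.index(req_id)
        self.slots[idx] = None
        return idx

    def add(self, req_id: str, index: Optional[int] = None) -> None:
        # Reuse a freed slot if one is given, otherwise append.
        if index is None:
            self.slots.append(req_id)
        else:
            self.slots[index] = req_id

    def condense(self) -> None:
        # Close any remaining gaps so the batch is contiguous again.
        self.slots = [r for r in self.slots if r is not None]


def update_states(batch: ToyBatch, finished: List[str],
                  req_ids_to_add: List[str]) -> bool:
    # Assumption: removals come only from finished requests in this sketch.
    removed_req_indices = [batch.remove(r) for r in finished]

    # Computed once, before the index list is consumed below.
    batch_changed = len(removed_req_indices) > 0 or len(req_ids_to_add) > 0

    # Reverse sort so that pop() yields the smallest empty index first.
    removed_req_indices = sorted(removed_req_indices, reverse=True)
    for req_id in req_ids_to_add:
        if removed_req_indices:
            batch.add(req_id, removed_req_indices.pop())
        else:
            batch.add(req_id)

    # Condense only if some freed slots were not refilled.
    if removed_req_indices:
        batch.condense()

    return batch_changed
```

In this sketch the flag has to be computed before the loop, because `pop()` empties `removed_req_indices` as slots are refilled. For example, removing `"b"` and `"d"` from `["a", "b", "c", "d"]` and adding `"e"` places `"e"` in slot 1, condenses the leftover gap at slot 3, and reports a change. A call with nothing to remove or add reports no change.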