OpenDAS / vllm_cscc
Commit 9528e3a0 (Unverified)
Authored Jul 06, 2025 by Woosuk Kwon; committed by GitHub, Jul 06, 2025

[BugFix][Spec Decode] Fix spec token ids in model runner (#20530)

Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>

Parent: 9fb52e52

Showing 1 changed file with 12 additions and 11 deletions

vllm/v1/worker/gpu_model_runner.py (+12 -11)

@@ -528,18 +528,19 @@ class GPUModelRunner(LoRAModelRunnerMixin):
                     start_token_index:end_token_index] = new_token_ids
                 self.input_batch.num_tokens_no_spec[
                     req_index] = end_token_index
-                # Add spec_token_ids to token_ids_cpu.
-                spec_token_ids = (
-                    scheduler_output.scheduled_spec_decode_tokens.get(
-                        req_id, ()))
-                if spec_token_ids:
-                    start_index = end_token_index
-                    end_token_index += len(spec_token_ids)
-                    self.input_batch.token_ids_cpu[
-                        req_index,
-                        start_index:end_token_index] = spec_token_ids
+                self.input_batch.num_tokens[req_index] = end_token_index
+
+            # Add spec_token_ids to token_ids_cpu.
+            spec_token_ids = (
+                scheduler_output.scheduled_spec_decode_tokens.get(req_id, ()))
+            if spec_token_ids:
+                num_spec_tokens = len(spec_token_ids)
+                start_index = self.input_batch.num_tokens_no_spec[req_index]
+                end_token_index = start_index + num_spec_tokens
+                self.input_batch.token_ids_cpu[
+                    req_index, start_index:end_token_index] = spec_token_ids
                 # NOTE(woosuk): `num_tokens` here may include spec tokens.
-                self.input_batch.num_tokens[req_index] = end_token_index
+                self.input_batch.num_tokens[req_index] += num_spec_tokens
 
         # Add the new or resumed requests to the persistent batch.
         # The smaller empty indices are filled first.
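
The bookkeeping in the added lines can be tried out in isolation. The following is a minimal sketch, not the vLLM implementation: `ToyInputBatch`, `add_spec_tokens`, and `MAX_LEN` are hypothetical names, and plain Python lists stand in for the real `token_ids_cpu` buffer. It assumes only what the diff shows: the write offset for speculative tokens is read from `num_tokens_no_spec[req_index]` rather than from a local `end_token_index`, and `num_tokens[req_index]` is incremented by the number of speculative tokens.

```python
from dataclasses import dataclass, field

MAX_LEN = 16  # hypothetical buffer width, chosen only for this example


@dataclass
class ToyInputBatch:
    """Hypothetical stand-in for the per-request counters seen in the diff."""
    token_ids_cpu: list = field(default_factory=lambda: [[0] * MAX_LEN])
    num_tokens_no_spec: list = field(default_factory=lambda: [0])
    num_tokens: list = field(default_factory=lambda: [0])


def add_spec_tokens(batch, req_index, spec_token_ids):
    """Write speculative tokens right after the verified tokens."""
    if not spec_token_ids:
        return
    num_spec_tokens = len(spec_token_ids)
    # The offset comes from the stored count of verified tokens, so it does
    # not depend on any local variable set earlier in the caller.
    start_index = batch.num_tokens_no_spec[req_index]
    end_token_index = start_index + num_spec_tokens
    batch.token_ids_cpu[req_index][start_index:end_token_index] = spec_token_ids
    # num_tokens includes speculative tokens; num_tokens_no_spec does not.
    batch.num_tokens[req_index] += num_spec_tokens


# One request holding three verified tokens, then two speculative tokens.
batch = ToyInputBatch()
batch.token_ids_cpu[0][:3] = [11, 12, 13]
batch.num_tokens_no_spec[0] = 3
batch.num_tokens[0] = 3
add_spec_tokens(batch, 0, [21, 22])
print(batch.token_ids_cpu[0][:5], batch.num_tokens[0], batch.num_tokens_no_spec[0])
```

In this toy run the speculative tokens occupy positions 3 and 4, `num_tokens` becomes 5, and `num_tokens_no_spec` stays at 3.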