OpenDAS / vllm_cscc
Commit 2ec88272 (Unverified)
Authored Nov 15, 2024 by Sky Lee
Committed by GitHub, Nov 15, 2024
[Bugfix] Qwen-vl output is inconsistent in speculative decoding (#10350)
Parent: b40cf640
Showing 1 changed file with 2 additions and 0 deletions
vllm/spec_decode/batch_expansion.py (+2, -0)
...
@@ -353,6 +353,7 @@ class BatchExpansionTop1Scorer(SpeculativeScorer):
     seq_data = seq_group_metadata.seq_data[seq_id]
     prompt_token_ids = seq_data.prompt_token_ids_array
     new_output_token_ids = [*seq_data.get_output_token_ids(), *token_ids]
+    mrope_position_delta = seq_data.mrope_position_delta
     new_seq_data_dict = {
         target_seq_id:
...
...
@@ -368,6 +369,7 @@ class BatchExpansionTop1Scorer(SpeculativeScorer):
     # the kv cache is filled by a previous batch in the batch expansion.
     for data in new_seq_data_dict.values():
         data.update_num_computed_tokens(data.get_len() - 1)
+        data.mrope_position_delta = mrope_position_delta
     return SequenceGroupMetadata(
         request_id=seq_group_metadata.request_id,
...
...
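Note on the change: the two added lines read `mrope_position_delta` from the original sequence data and write it onto the sequence data rebuilt for the expanded batch. A plausible reading is that a freshly constructed object would otherwise fall back to its default value for this field, so the scorer and the normal decode path could see different position information. The sketch below illustrates that pattern only. `ToySequenceData` and `expand` are hypothetical stand-ins invented for this note; they are not vLLM's `SequenceData` or the scorer's real method, and the field defaults are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToySequenceData:
    """Hypothetical stand-in for a per-sequence state object (not vLLM's class)."""
    prompt_token_ids: List[int]
    output_token_ids: List[int] = field(default_factory=list)
    # Assumed default: a rebuilt object starts without any delta.
    mrope_position_delta: Optional[int] = None
    num_computed_tokens: int = 0

    def get_len(self) -> int:
        return len(self.prompt_token_ids) + len(self.output_token_ids)


def expand(seq_data: ToySequenceData, token_ids: List[int],
           carry_delta: bool) -> ToySequenceData:
    """Rebuild sequence state with proposed tokens appended."""
    new_output = [*seq_data.output_token_ids, *token_ids]
    new_data = ToySequenceData(list(seq_data.prompt_token_ids), new_output)
    # Mirrors the diff: every token except the last counts as already computed.
    new_data.num_computed_tokens = new_data.get_len() - 1
    if carry_delta:
        # The step the commit adds: copy the field onto the rebuilt object.
        new_data.mrope_position_delta = seq_data.mrope_position_delta
    return new_data


original = ToySequenceData([1, 2, 3], [4], mrope_position_delta=-5)
without_fix = expand(original, [7, 8], carry_delta=False)
with_fix = expand(original, [7, 8], carry_delta=True)
print(without_fix.mrope_position_delta, with_fix.mrope_position_delta)
```

With `carry_delta=False` the rebuilt object keeps the assumed default, while `carry_delta=True` preserves the value from the source object, which is the behavior the diff introduces.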