Commit 1311913f authored by Woosuk Kwon, committed by GitHub

[BugFix][Spec Decode] No in-place update to draft probs (#16952)


Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
parent 29f395c9
@@ -264,7 +264,9 @@ def compute_probs_and_sample_next_token(
     # TODO(woosuk): Consider seeds.
     q = torch.empty_like(probs)
     q.exponential_()
-    next_token_ids = probs.div_(q).argmax(dim=-1).view(-1)
+    # NOTE(woosuk): We shouldn't use `probs.div_(q)` because the draft_probs
+    # will be used later for rejection sampling.
+    next_token_ids = probs.div(q).argmax(dim=-1).view(-1)
     if not sampling_metadata.all_random:
         greedy_token_ids = probs.argmax(dim=-1)
         next_token_ids = torch.where(
......
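The effect of the one-character change can be reproduced in isolation. The snippet below is a minimal sketch, not the project's code: `sample_in_place` and `sample_out_of_place` are hypothetical helper names, and the 2x5 tensor stands in for real draft probabilities. It shows that `div_` overwrites the tensor the caller still holds, while `div` leaves it intact for later use such as rejection sampling.

```python
import torch


def sample_in_place(probs: torch.Tensor) -> torch.Tensor:
    """Old behaviour (hypothetical helper): divides `probs` in place."""
    q = torch.empty_like(probs)
    q.exponential_()  # q ~ Exp(1), one draw per element
    # div_ writes probs / q back into `probs`, clobbering the caller's tensor.
    return probs.div_(q).argmax(dim=-1).view(-1)


def sample_out_of_place(probs: torch.Tensor) -> torch.Tensor:
    """Fixed behaviour (hypothetical helper): leaves `probs` untouched."""
    q = torch.empty_like(probs)
    q.exponential_()
    # div allocates a new tensor for probs / q.
    return probs.div(q).argmax(dim=-1).view(-1)


torch.manual_seed(0)
# Stand-in for draft probabilities: 2 requests, vocabulary of 5 tokens.
draft_probs = torch.softmax(torch.randn(2, 5), dim=-1)
before = draft_probs.clone()

tokens = sample_out_of_place(draft_probs)
unchanged = torch.equal(draft_probs, before)    # probabilities survive

sample_in_place(draft_probs)
clobbered = not torch.equal(draft_probs, before)  # probabilities overwritten

print(tokens.shape, unchanged, clobbered)
```

The out-of-place version costs one extra temporary tensor of the same shape as `probs`; in exchange, the probabilities remain a valid distribution for whatever consumes them afterwards.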