OpenDAS / vllm_cscc
Commit 97588c4d
Authored Nov 24, 2025 by Woosuk Kwon; committed by GitHub on Nov 24, 2025
[Model Runner V2] Add minor clarification comments for Eagle (#29332)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
parent 839c6b7b
Showing 1 changed file with 11 additions and 0 deletions
vllm/v1/worker/gpu/spec_decode/eagle.py (+11 -0)
@@ -65,6 +65,12 @@ class EagleSpeculator:
         # [num_reqs]
         next_prefill_tokens: torch.Tensor,
     ) -> torch.Tensor:
+        # NOTE(woosuk): To avoid CPU-GPU synchronization without CPU knowing the
+        # number of rejected tokens, we maintain the size of eagle's input_ids and
+        # hidden_states the same as the target model's. This means, we pad each
+        # request's query length to include any rejected positions. By doing so,
+        # we can also reuse the attention metadata (e.g., query_start_loc,
+        # seq_lens) of the target model.
         if aux_hidden_states:
             assert self.method == "eagle3"
             hidden_states = self.model.combine_hidden_states(
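The padding scheme described in the first added comment can be illustrated with a small sketch. This is not code from the commit: the helper name `compute_last_token_indices` and the `num_rejected` input are hypothetical, and plain Python lists stand in for GPU tensors. It assumes `query_start_loc` holds prefix sums of the padded per-request query lengths (the target model's layout), so the last accepted token of each request can be located by index arithmetic alone, without resizing the batch.

```python
from typing import List


def compute_last_token_indices(
    query_start_loc: List[int], num_rejected: List[int]
) -> List[int]:
    """Index of each request's last accepted token in the flattened batch.

    query_start_loc has num_reqs + 1 entries (prefix sums of the padded query
    lengths). num_rejected is a hypothetical per-request count of rejected
    draft tokens; the rejected slots stay in the batch as padding.
    """
    indices = []
    for i, rejected in enumerate(num_rejected):
        padded_end = query_start_loc[i + 1]  # one past the request's last slot
        indices.append(padded_end - 1 - rejected)
    return indices


# Two requests with padded query lengths 4 and 3 (7 flattened slots in total).
query_start_loc = [0, 4, 7]
# Request 0 had 2 tokens rejected; request 1 had none rejected.
num_rejected = [2, 0]
print(compute_last_token_indices(query_start_loc, num_rejected))  # [1, 6]
```

Because the batch shape never depends on `num_rejected`, the same arithmetic could presumably run on the GPU, so the CPU would not need to read the rejection counts back.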
@@ -110,6 +116,11 @@ class EagleSpeculator:
         # NOTE(woosuk): We must add 1 to the positions to match the Gumbel noise
         # used for draft and target sampling.
         pos = input_batch.positions[last_token_indices] + 1
+        # NOTE(woosuk): For draft sampling, we only consider the temperature
+        # and ignore the other sampling parameters such as top_k and top_p,
+        # for simplicity and performance.
+        # While this may slightly degrade the acceptance rate, it does not
+        # affect the output distribution after rejection sampling.
         draft_tokens = gumbel_sample(
             logits, temperature, seed, pos, apply_temperature=True
         )
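The claim in the second added comment, that a mismatched draft distribution lowers the acceptance rate but leaves the final output distribution unchanged, can be checked exactly on a toy example. The sketch below assumes the standard speculative-sampling rule (accept draft token x with probability min(1, p[x] / q[x]), otherwise resample from the normalized residual max(p - q, 0)); it is not taken from vLLM's Gumbel-based implementation, and the function names and the three-token distributions are hypothetical.

```python
from fractions import Fraction
from typing import List


def output_distribution(p: List[Fraction], q: List[Fraction]) -> List[Fraction]:
    """Exact output distribution of one rejection-sampling step.

    p is the target distribution, q is the draft distribution.
    """
    # Probability of drafting x and accepting it: q[x] * min(1, p[x] / q[x]).
    accept = [min(pi, qi) for pi, qi in zip(p, q)]
    reject_prob = 1 - sum(accept)
    residual = [max(pi - qi, Fraction(0)) for pi, qi in zip(p, q)]
    total = sum(residual)
    if total == 0:  # p == q, so drafts are never rejected
        return accept
    return [a + reject_prob * r / total for a, r in zip(accept, residual)]


def acceptance_rate(p: List[Fraction], q: List[Fraction]) -> Fraction:
    """Probability that a token drafted from q is accepted under target p."""
    return sum(min(pi, qi) for pi, qi in zip(p, q))


# Target with the third token filtered out (as top_k or top_p might do).
p = [Fraction(3, 4), Fraction(1, 4), Fraction(0)]
# Temperature-only draft that still puts mass on the filtered token.
q = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]

print(output_distribution(p, q) == p)  # True
print(acceptance_rate(p, q))           # 3/4
print(acceptance_rate(p, p))           # 1
```

In this toy case the draft's extra mass on the filtered token costs a quarter of the acceptances, yet the output distribution still equals the target exactly.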