Commit 110a6598 (Unverified)
Authored Aug 22, 2025 by datdo-msft
Committed by GitHub, Aug 22, 2025

[MTP] Force greedy sampling on AMD (#9127)

Parent: 49f9d025

Showing 1 changed file with 11 additions and 2 deletions

python/sglang/srt/speculative/eagle_utils.py (+11 -2)

@@ -49,6 +49,8 @@ SIMULATE_ACC_METHOD = os.environ.get("SIMULATE_ACC_METHOD", "multinomial")
 TREE_TRAVERSE_TIME_THRESHOLD = 1  # TODO: set this properly
+TREE_SPEC_KERNEL_AVAILABLE = "tree_speculative_sampling_target_only" in globals()

 @dataclass
 class EagleDraftInput:
...
...
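
The `in globals()` check in the hunk above only tells us whether the name `tree_speculative_sampling_target_only` was bound at module level by the time the line runs. The diff does not show where that name comes from; the sketch below assumes it is brought in by a conditional, platform-specific import earlier in the file. The module name `hypothetical_cuda_kernels` and the helper `has_symbol` are invented for illustration and are not part of sglang.

```python
def has_symbol(name: str, namespace: dict) -> bool:
    """Return True if `name` is bound in the given namespace dict."""
    return name in namespace


# Hypothetical optional import: on a build without the kernel package,
# the import fails and the name is simply never bound.
try:
    from hypothetical_cuda_kernels import (  # assumed module, does not exist
        tree_speculative_sampling_target_only,
    )
except ImportError:
    pass

# Same idea as the flag in the diff: a module-level boolean computed once.
TREE_SPEC_KERNEL_AVAILABLE = has_symbol(
    "tree_speculative_sampling_target_only", globals()
)

print(TREE_SPEC_KERNEL_AVAILABLE)
```

Because the flag is evaluated once at import time, it has to be placed after any import that might bind the name; a check placed earlier would always report the kernel as missing.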
@@ -423,8 +425,15 @@ class EagleVerifyInput:
                 logits=logits_output.next_token_logits, vocab_mask=vocab_mask
             )
-        # Sample tokens
-        if batch.sampling_info.is_all_greedy:
+        # Sample tokens. Force greedy sampling on AMD
+        is_all_greedy = sampling_info.is_all_greedy
+        if (not is_all_greedy) and (not TREE_SPEC_KERNEL_AVAILABLE):
+            logger.warning(
+                "Tree speculative sampling kernel unavailable (likely AMD/HIP build). "
+                "Falling back to greedy verification."
+            )
+
+        if is_all_greedy or not TREE_SPEC_KERNEL_AVAILABLE:
             target_predict = torch.argmax(logits_output.next_token_logits, dim=-1)
             target_predict = target_predict.reshape(bs, self.draft_token_num)
...
...
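
The branch logic in the second hunk can be summarised as: take the greedy path when every request is greedy, or when the sampling kernel is missing, and warn only in the second case. The following is a framework-free sketch of that decision and of the argmax-then-reshape step, using plain Python lists in place of tensors. The function names `use_greedy_verification` and `greedy_target_predict` are hypothetical and do not appear in the commit.

```python
import logging

logger = logging.getLogger(__name__)


def use_greedy_verification(is_all_greedy: bool, kernel_available: bool) -> bool:
    """Mirror the diff's condition: greedy if requested, or forced by a missing kernel."""
    if (not is_all_greedy) and (not kernel_available):
        # Only the forced fallback is worth a warning.
        logger.warning(
            "Sampling kernel unavailable. Falling back to greedy verification."
        )
    return is_all_greedy or not kernel_available


def greedy_target_predict(logits, bs, draft_token_num):
    """Argmax over the vocabulary axis, then group into bs rows of draft_token_num."""
    flat = [max(range(len(row)), key=row.__getitem__) for row in logits]
    assert len(flat) == bs * draft_token_num
    return [
        flat[i * draft_token_num : (i + 1) * draft_token_num] for i in range(bs)
    ]


# Toy input: 2 requests x 2 draft tokens, vocabulary of size 2.
logits = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
if use_greedy_verification(is_all_greedy=False, kernel_available=False):
    print(greedy_target_predict(logits, bs=2, draft_token_num=2))
```

One consequence visible in the diff itself: when the kernel is unavailable, requests that asked for non-greedy sampling are still verified with argmax, so their outputs follow the greedy path.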