OpenDAS / vllm_cscc
Commit 30132cd1 (Unverified)
Authored Feb 21, 2026 by Xiao Li
Committed by GitHub, Feb 21, 2026

Fix apply_top_k_top_p_triton called by non-cuda logits Tensor (#35030)

Signed-off-by: Xiao Li <ilx@meta.com>

parent cbd95a2d

Showing 1 changed file with 1 addition and 1 deletion (+1 -1)

vllm/v1/sample/ops/topk_topp_sampler.py @ 30132cd1

...
@@ -248,7 +248,7 @@ def apply_top_k_top_p(
     if p is None and k is None:
         return logits
 
-    if HAS_TRITON and logits.shape[0] >= 8:
+    if HAS_TRITON and logits.shape[0] >= 8 and logits.is_cuda:
         return apply_top_k_top_p_triton(logits, k, p)
 
     # Use pytorch sort implementation for small batch sizes.
...
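
The effect of the added `logits.is_cuda` check can be illustrated with a small dispatch sketch. This is not the vLLM code: `FakeLogits`, `pick_backend`, and `MIN_TRITON_BATCH` are hypothetical names introduced here, and `FakeLogits` only mimics the two `torch.Tensor` attributes the guard reads (`shape` and `is_cuda`). Under those assumptions, a batch of 8 or more rows that lives off the GPU falls through to the PyTorch path after this change, where the old condition would have handed it to the Triton kernel.

```python
from dataclasses import dataclass

HAS_TRITON = True      # assumption: stands in for the module-level flag in the diff
MIN_TRITON_BATCH = 8   # hypothetical name for the literal 8 in the diff


@dataclass
class FakeLogits:
    """Hypothetical stand-in exposing only the attributes the guard inspects."""
    shape: tuple
    is_cuda: bool


def pick_backend(logits, has_triton=HAS_TRITON):
    """Return the name of the path the patched condition would select."""
    if has_triton and logits.shape[0] >= MIN_TRITON_BATCH and logits.is_cuda:
        return "triton"
    # Small batches and, after this commit, non-CUDA tensors end up here.
    return "pytorch"


if __name__ == "__main__":
    print(pick_backend(FakeLogits((16, 32000), is_cuda=True)))
    print(pick_backend(FakeLogits((16, 32000), is_cuda=False)))
    print(pick_backend(FakeLogits((4, 32000), is_cuda=True)))
```

Placing the device check last keeps the cheaper `HAS_TRITON` and batch-size tests first, matching the order in the diff; the outcome does not depend on that order.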