OpenDAS / vllm_cscc / Commits / 9e6562a3

Commit 9e6562a3 (Unverified)
Authored Dec 09, 2025 by Woosuk Kwon
Committed by GitHub, Dec 09, 2025

[Model Runner V2] Fix Triton warning on tl.where (#30355)

Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>

Parent: 0b6a8a30

Changes: 1 changed file with 1 addition and 0 deletions (+1 -0)

vllm/v1/worker/gpu/sample/penalties.py

@@ -62,6 +62,7 @@ def _penalties_and_temperature_kernel(
         mask=packed_block < tl.cdiv(vocab_size, 32),
     )
     prompt_bin_mask = (packed_mask[:, None] >> (tl.arange(0, 32)[None, :])) & 1
+    prompt_bin_mask = prompt_bin_mask.to(tl.int1)
     prompt_bin_mask = prompt_bin_mask.reshape(BLOCK_SIZE)
     # If token appears in prompt or output, apply, otherwise use 1.0 for no-op.
...
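
As a rough illustration of what the added line does, the pure-Python sketch below mimics the shift-and-mask unpacking from the hunk and converts each extracted bit to a boolean before it is used as a selection condition. The cast in the commit presumably serves the same purpose for `tl.where`, whose condition would otherwise be an integer tensor. The helper names `unpack_bitmask` and `penalty_scale` are hypothetical and belong to neither vLLM nor Triton; the real kernel runs in Triton on the GPU and is not reproduced here.

```python
def unpack_bitmask(packed_words, bits_per_word=32):
    """Expand packed integers into one boolean per bit, least significant bit first."""
    flags = []
    for word in packed_words:
        for bit in range(bits_per_word):
            # The shift-and-mask yields an integer 0 or 1; bool() plays the
            # role of the .to(tl.int1) cast added in this commit.
            flags.append(bool((word >> bit) & 1))
    return flags


def penalty_scale(flags, penalty):
    """Use `penalty` where the flag is set, otherwise 1.0 (a no-op scale)."""
    return [penalty if flag else 1.0 for flag in flags]


# Hypothetical 4-bit word: tokens 0 and 2 have been seen, tokens 1 and 3 have not.
flags = unpack_bitmask([0b0101], bits_per_word=4)
scales = penalty_scale(flags, penalty=1.2)
```

Here `flags` holds booleans rather than the integers 0 and 1, so the selection in `penalty_scale` has an explicit boolean condition, analogous to passing an `int1` mask to `tl.where`.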