OpenDAS / vllm_cscc

Commit dd70437a (unverified)
Authored Sep 26, 2025 by Icey
Committed by GitHub, Sep 26, 2025

Remove cuda hard-code in compute_causal_conv1d_metadata (#25555)

Signed-off-by: Icey <1790571317@qq.com>

Parent: 99b3a504

Showing 1 changed file with 3 additions and 2 deletions

vllm/v1/attention/backends/utils.py (+3 -2)

@@ -947,6 +947,7 @@ def compute_causal_conv1d_metadata(query_start_loc_p: torch.Tensor):
     nums_dict = {}  # type: ignore
     batch_ptr = None
     token_chunk_offset_ptr = None
+    device = query_start_loc_p.device
     for BLOCK_M in [8]:  # cover all BLOCK_M values
         nums = -(-seqlens // BLOCK_M)
         nums_dict[BLOCK_M] = {}
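
The `nums = -(-seqlens // BLOCK_M)` context line is a ceiling division written with floor division only: it counts how many blocks of `BLOCK_M` tokens each sequence needs. A minimal pure-Python sketch of the same idiom follows; the helper name `ceil_div` and the sample lengths are illustrative assumptions, not part of the commit.

```python
def ceil_div(n: int, block: int) -> int:
    """Ceiling division using only floor division: -(-n // block)."""
    return -(-n // block)


BLOCK_M = 8
seqlens = [1, 8, 9, 17]  # hypothetical per-sequence token counts

# Number of BLOCK_M-sized chunks needed to cover each sequence.
nums = [ceil_div(s, BLOCK_M) for s in seqlens]
print(nums)  # [1, 1, 2, 3]
```

The same expression works elementwise on an integer tensor, which is how the function in the diff applies it to `seqlens`.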
@@ -968,11 +969,11 @@ def compute_causal_conv1d_metadata(query_start_loc_p: torch.Tensor):
             batch_ptr = torch.full((MAX_NUM_PROGRAMS, ),
                                    PAD_SLOT_ID,
                                    dtype=torch.int32,
-                                   device='cuda')
+                                   device=device)
             token_chunk_offset_ptr = torch.full((MAX_NUM_PROGRAMS, ),
                                                 PAD_SLOT_ID,
                                                 dtype=torch.int32,
-                                                device='cuda')
+                                                device=device)
         else:
             if batch_ptr.nelement() < MAX_NUM_PROGRAMS:
                 batch_ptr.resize_(MAX_NUM_PROGRAMS).fill_(PAD_SLOT_ID)
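
The change in this hunk can be illustrated in isolation: instead of allocating on a hard-coded `'cuda'` device, the buffer is created on whatever device the input tensor already lives on, so the same code runs on CPU or other accelerator backends. The sketch below is an assumption-laden reduction, not the vLLM function itself: the helper name `alloc_or_grow` is hypothetical, and `PAD_SLOT_ID = -1` is an assumed value chosen for the example.

```python
from typing import Optional

import torch

PAD_SLOT_ID = -1  # assumed padding value for this sketch


def alloc_or_grow(buf: Optional[torch.Tensor], size: int,
                  like: torch.Tensor) -> torch.Tensor:
    """Allocate a padded int32 buffer on `like`'s device, or grow an existing one."""
    device = like.device  # follow the input tensor instead of hard-coding 'cuda'
    if buf is None:
        return torch.full((size, ), PAD_SLOT_ID,
                          dtype=torch.int32, device=device)
    if buf.nelement() < size:
        # Grow in place and reset every entry to the padding value.
        buf.resize_(size).fill_(PAD_SLOT_ID)
    return buf


query_start_loc = torch.tensor([0, 3, 7])  # a CPU tensor in this example
buf = alloc_or_grow(None, 4, query_start_loc)
buf = alloc_or_grow(buf, 8, query_start_loc)
```

Passing a tensor that lives on a GPU would place the buffer on that GPU with no change to the helper, which is the property the commit restores.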