OpenDAS / vllm_cscc
Commit 770e5dcd (Unverified)
Authored Jun 09, 2025 by Yinghai Lu
Committed by GitHub, Jun 09, 2025

[full_graph] Fix query_start_loc padding (#19321)

Signed-off-by: Yinghai Lu <yinghai@thinkingmachines.ai>

parent c57c9415

Showing 1 changed file with 4 additions and 1 deletion

vllm/v1/worker/gpu_model_runner.py (+4 -1)
...
@@ -655,7 +655,10 @@ class GPUModelRunner(LoRAModelRunnerMixin):
         # Fill unused with -1. Needed for reshape_and_cache
         self.seq_lens[num_reqs:].fill_(0)
-        self.query_start_loc[num_reqs + 1:].fill_(-1)
+        # Note: pad query_start_loc to be non-decreasing, as kernels
+        # like FlashAttention requires that
+        self.query_start_loc[num_reqs + 1:].fill_(
+            self.query_start_loc_cpu[num_reqs].item())
         query_start_loc = self.query_start_loc[:num_reqs + 1]
         seq_lens = self.seq_lens[:num_reqs]
...
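
The effect of the change can be illustrated with a small, self-contained sketch. It is not the vLLM implementation: it uses plain Python lists instead of GPU tensors, and the names `build_query_start_loc`, `padded_num_reqs`, `pad_with_last`, and `is_non_decreasing` are hypothetical, introduced only for this example. It assumes `query_start_loc` holds cumulative query offsets with one more entry than there are requests.

```python
from itertools import accumulate


def build_query_start_loc(query_lens, padded_num_reqs, pad_with_last=True):
    """Hypothetical helper: cumulative query offsets padded to a fixed size."""
    num_reqs = len(query_lens)
    # Entries [0, num_reqs] hold the real cumulative offsets.
    real = [0] + list(accumulate(query_lens))
    # The old code padded with -1; the fix pads with the last real offset.
    pad_value = real[num_reqs] if pad_with_last else -1
    return real + [pad_value] * (padded_num_reqs - num_reqs)


def is_non_decreasing(values):
    """True if every entry is >= the entry before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Three real requests, buffer padded to five request slots.
old = build_query_start_loc([3, 1, 2], padded_num_reqs=5, pad_with_last=False)
new = build_query_start_loc([3, 1, 2], padded_num_reqs=5, pad_with_last=True)

print(old, is_non_decreasing(old))  # [0, 3, 4, 6, -1, -1] False
print(new, is_non_decreasing(new))  # [0, 3, 4, 6, 6, 6] True
```

With the last real offset as the pad value, each padded slot looks like a zero-length query (the difference between neighbouring offsets is 0), whereas the -1 padding makes the sequence drop after the last real request.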