OpenDAS / vllm_cscc
Commit 4634cbcf, authored Jan 23, 2026 by laibao
fix: resolve KV compression crash caused by the MTP runner missing _extract_layer_index
parent 863f93e6
Showing 1 changed file with 21 additions and 0 deletions
vllm/v1/worker/gpu_model_runner.py (+21 -0)
@@ -427,6 +427,27 @@ class GPUModelRunnerBase(LoRAModelRunnerMixin):
        if self.enable_expert_parallel and self.dp_size > 1 and self.tp_size > 1:
            self.ep_sp = True

    @staticmethod
    def _extract_layer_index(layer_name: str) -> int:
        """Extract attention layer index from a module name.

        KV compression prompt compaction (scheme 3) needs to map
        `kv_cache_group_spec.layer_names` entries to indices in `self.kv_caches`.
        """
        from vllm.model_executor.models.utils import extract_layer_index

        try:
            return extract_layer_index(layer_name)
        except Exception as e:
            # Be conservative: skip layers whose names don't follow the
            # expected pattern instead of crashing the whole engine.
            logger.warning_once(
                "Failed to parse layer index from layer name '%s': %s. "
                "Skipping KV compaction for this layer.",
                layer_name,
                e,
            )
            return 1 << 30

    def _may_reorder_batch(self, scheduler_output: "SchedulerOutput") -> None:
        """
        Update the order of requests in the batch based on the attention
        ...
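The diff does not show how callers treat the `1 << 30` sentinel. The sketch below is one plausible reading, not the repository's code: it assumes the caller bounds-checks the returned index against the number of KV caches, so an unparseable layer name is skipped instead of raising. `parse_layer_index` is a hypothetical stand-in for vLLM's `extract_layer_index`, and `safe_layer_index`, `compactable_layers`, and `INVALID_LAYER_INDEX` are illustrative names that do not appear in the commit.

```python
import logging

logger = logging.getLogger(__name__)

# Mirrors the commit's `return 1 << 30`: an index far beyond any realistic
# layer count, so a simple bounds check rejects it.
INVALID_LAYER_INDEX = 1 << 30


def parse_layer_index(layer_name: str) -> int:
    """Hypothetical stand-in for vLLM's extract_layer_index.

    Assumes exactly one dot-separated component of the name is an integer.
    """
    int_parts = [int(p) for p in layer_name.split(".") if p.isdecimal()]
    if len(int_parts) != 1:
        raise ValueError(f"expected one integer component in {layer_name!r}")
    return int_parts[0]


def safe_layer_index(layer_name: str) -> int:
    """Return the layer index, or the sentinel if the name cannot be parsed."""
    try:
        return parse_layer_index(layer_name)
    except ValueError as e:
        logger.warning("Failed to parse layer index from %r: %s", layer_name, e)
        return INVALID_LAYER_INDEX


def compactable_layers(layer_names: list[str], num_kv_caches: int) -> list[int]:
    """Assumed caller-side check: keep only indices that address a KV cache."""
    indices = []
    for name in layer_names:
        idx = safe_layer_index(name)
        if 0 <= idx < num_kv_caches:
            indices.append(idx)
    return indices
```

Under this assumption, a name with no numeric component (for example an MTP head module) maps to the sentinel and is dropped by the bounds check, while ordinary `model.layers.N...` names pass through unchanged.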