OpenDAS / vllm_cscc
Commit f7cb8c7b, authored Jan 04, 2026 by jujl1
fix: cache a KV block only when it contains no fake (placeholder) MTP data, to fix the bug where cache_full_blocks saved the same KV block twice
parent 2c1de3fa
Showing 1 changed file with 4 additions and 2 deletions
vllm/v1/core/single_type_kv_cache_manager.py (+4 -2)
...
...
@@ -10,7 +10,7 @@ from vllm.v1.core.kv_cache_utils import BlockHash, KVCacheBlock
from vllm.v1.kv_cache_interface import (FullAttentionSpec, KVCacheSpec,
                                        MambaSpec, SlidingWindowSpec)
from vllm.v1.request import Request
from vllm import envs

class SingleTypeKVCacheManager(ABC):
    """
...
...
@@ -141,7 +141,9 @@ class SingleTypeKVCacheManager(ABC):
        """
        num_cached_blocks = self.num_cached_block[request.request_id]
        num_full_blocks = num_tokens // self.block_size
        if envs.VLLM_ZERO_OVERHEAD_ENHANCE:
            if num_full_blocks > num_cached_blocks and num_tokens % self.block_size < len(request.spec_token_ids):
                num_full_blocks -= 1
        self.block_pool.cache_full_blocks(
            request=request,
            blocks=self.req_to_blocks[request.request_id],
...
...
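The block-count adjustment in the hunk above can be tried in isolation. The following is a minimal sketch, not code from this commit: the function name `num_cacheable_full_blocks` and its parameters are hypothetical stand-ins for `self.block_size`, `self.num_cached_block[request.request_id]`, `len(request.spec_token_ids)` and `envs.VLLM_ZERO_OVERHEAD_ENHANCE`. It also assumes that the speculative (MTP) tokens occupy the tail of the `num_tokens` positions, which is how the condition reads but is not stated in the diff.

```python
def num_cacheable_full_blocks(
    num_tokens: int,
    block_size: int,
    num_cached_blocks: int,
    num_spec_tokens: int,
    zero_overhead_enhance: bool = True,
) -> int:
    """Hypothetical helper mirroring the arithmetic shown in the diff."""
    num_full_blocks = num_tokens // block_size
    if zero_overhead_enhance:
        # If the trailing partial block holds fewer tokens than there are
        # speculative tokens, some of them spill back into the last full
        # block (under the tail-placement assumption), so hold that block
        # back from caching.
        if (num_full_blocks > num_cached_blocks
                and num_tokens % block_size < num_spec_tokens):
            num_full_blocks -= 1
    return num_full_blocks


# 34 tokens, block size 16, 3 speculative tokens: the remainder is 2,
# so one speculative token lands in the second full block.
print(num_cacheable_full_blocks(34, 16, 0, 3))
# 35 tokens: the remainder is 3, so all speculative tokens sit in the
# partial block and both full blocks can be cached.
print(num_cacheable_full_blocks(35, 16, 0, 3))
```

As in the diff, at most one block is held back per call, so this sketch does not cover a speculative span longer than one block plus the remainder.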