Commit 79a5b632 (Unverified)
Authored Apr 17, 2026 by Or Ozeri
Committed by GitHub, Apr 17, 2026

[kv_offload]: Fix num CPU blocks for UniformTypeKVCacheSpecs (#39617)

Signed-off-by: Or Ozeri <oro@il.ibm.com>

parent c0c98b8b

Showing 1 changed file with 7 additions and 11 deletions

vllm/v1/kv_offload/cpu/spec.py (+7 -11)

@@ -26,17 +26,13 @@ class CPUOffloadingSpec(OffloadingSpec):
         # calculate kv_bytes_per_offloaded_block
         assert kv_cache_config is not None
-        page_sizes = {
-            kv_cache_group.kv_cache_spec.page_size_bytes
-            for kv_cache_group in kv_cache_config.kv_cache_groups
-        }
-        assert len(page_sizes) == 1
-        page_size_bytes = page_sizes.pop()
-        kv_bytes_per_block = (
-            page_size_bytes
-            * len(kv_cache_config.kv_cache_tensors)
-            * vllm_config.parallel_config.world_size
-        )
+        if kv_cache_config.num_blocks > 0:
+            total_gpu_kv_bytes = sum(t.size for t in kv_cache_config.kv_cache_tensors)
+            kv_bytes_per_block = (
+                total_gpu_kv_bytes // kv_cache_config.num_blocks
+            ) * vllm_config.parallel_config.world_size
+        else:
+            kv_bytes_per_block = 0
         kv_bytes_per_offloaded_block = kv_bytes_per_block * self.block_size_factor
         self.num_blocks = (
...
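
To make the arithmetic in this hunk easier to follow, here is a minimal standalone sketch of the two calculations side by side. It is not the vLLM code itself: `FakeTensor`, `old_kv_bytes_per_block`, and `new_kv_bytes_per_block` are hypothetical names introduced for illustration, and the byte counts are made-up values. The only things taken from the diff are the formulas and the field names `size`, `num_blocks`, and `world_size`.

```python
from dataclasses import dataclass


@dataclass
class FakeTensor:
    """Hypothetical stand-in for one entry of kv_cache_config.kv_cache_tensors."""
    size: int  # total bytes allocated for this tensor


def old_kv_bytes_per_block(page_sizes, num_tensors, world_size):
    # Removed logic: all KV cache groups had to report one common page size,
    # which was then multiplied by the tensor count and the world size.
    unique_sizes = set(page_sizes)
    assert len(unique_sizes) == 1
    return unique_sizes.pop() * num_tensors * world_size


def new_kv_bytes_per_block(tensors, num_blocks, world_size):
    # Added logic: derive bytes per block from the total tensor bytes,
    # and fall back to 0 when no blocks are allocated.
    if num_blocks > 0:
        total_gpu_kv_bytes = sum(t.size for t in tensors)
        return (total_gpu_kv_bytes // num_blocks) * world_size
    return 0


# Assumed numbers: 4 blocks, world size 2.
uniform = [FakeTensor(4 * 1024), FakeTensor(4 * 1024)]
mixed = [FakeTensor(4 * 1024), FakeTensor(4 * 2048)]

print(old_kv_bytes_per_block([1024, 1024], num_tensors=2, world_size=2))
print(new_kv_bytes_per_block(uniform, num_blocks=4, world_size=2))
print(new_kv_bytes_per_block(mixed, num_blocks=4, world_size=2))
print(new_kv_bytes_per_block(mixed, num_blocks=0, world_size=2))
```

In this toy setup the two formulas agree when every tensor holds the same number of bytes per block. The new formula also yields a value when the tensors differ in size, and returns 0 instead of dividing by zero when `num_blocks` is 0.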