Commit 1cfbbc42
authored Nov 04, 2025 by Johnsonms
Committed by GitHub, Nov 04, 2025

[Bug] Fix NSA Backend KV-Buffer Shape Mismatch in DeepSeek-V3.2 (#12645)

parent 55dfb539
Showing 1 changed file with 3 additions and 1 deletion
python/sglang/srt/mem_cache/memory_pool.py @ 1cfbbc42

@@ -1568,6 +1568,7 @@ class MLATokenToKVPool(KVCache):
 class NSATokenToKVPool(MLATokenToKVPool):
     quant_block_size = 128
     index_k_with_scale_buffer_dtype = torch.uint8
+    rope_storage_dtype = torch.bfloat16  # rope is always stored in bf16

     def __init__(
         self,
@@ -1589,10 +1590,11 @@ class NSATokenToKVPool(MLATokenToKVPool):
         # Calculate override_kv_cache_dim for FP8 storage:
         # kv_lora_rank + scale storage (kv_lora_rank // quant_block_size * 4 bytes) + rope dimension storage
+        # Note: rope dimension is stored in original dtype (bf16), not quantized to fp8
         override_dim = (
             kv_lora_rank
             + kv_lora_rank // self.quant_block_size * 4
-            + qk_rope_head_dim * dtype.itemsize
+            + qk_rope_head_dim * self.rope_storage_dtype.itemsize
         )

         super().__init__(
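The arithmetic behind the changed line can be checked with a small standalone sketch. It is not the sglang implementation: the helper name `override_kv_cache_dim` is hypothetical, and the values `kv_lora_rank = 512` and `qk_rope_head_dim = 64` are assumed here for illustration (they do not appear in the diff). Item sizes are passed as plain integers (1 byte for an FP8 element, 2 bytes for bf16) so the example runs without torch.

```python
def override_kv_cache_dim(kv_lora_rank, qk_rope_head_dim, rope_itemsize,
                          quant_block_size=128):
    """Per-token byte count, following the formula in the diff (hypothetical helper)."""
    latent_bytes = kv_lora_rank                         # quantized part, 1 byte per element
    scale_bytes = kv_lora_rank // quant_block_size * 4  # 4 bytes of scale per quant block
    rope_bytes = qk_rope_head_dim * rope_itemsize       # rope part, sized by its storage dtype
    return latent_bytes + scale_bytes + rope_bytes


# Assumed model dimensions, for illustration only.
KV_LORA_RANK = 512
QK_ROPE_HEAD_DIM = 64

# Before the fix: rope sized with the pool dtype; assume an FP8 pool (itemsize 1).
old_dim = override_kv_cache_dim(KV_LORA_RANK, QK_ROPE_HEAD_DIM, rope_itemsize=1)

# After the fix: rope always sized as bf16 (itemsize 2).
new_dim = override_kv_cache_dim(KV_LORA_RANK, QK_ROPE_HEAD_DIM, rope_itemsize=2)

print(old_dim, new_dim, new_dim - old_dim)
```

Under these assumptions the old expression comes out `qk_rope_head_dim` bytes short per token whenever the pool dtype is one byte wide, which is consistent with the shape mismatch named in the commit title. When the pool dtype is already bf16, both expressions give the same value.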