OpenDAS / vllm_cscc
Commit 76e6a951 (Unverified)
Authored Dec 23, 2025 by Wentao Ye. Committed by GitHub, Dec 24, 2025.

[Bug] Fix `Number of dimensions of tensors must match.` for Deepseek V3.2 (#31160)

Signed-off-by: yewentao256 <zhyanwentao@126.com>

Parent: 8b59753c
Showing 1 changed file with 6 additions and 3 deletions
vllm/model_executor/models/deepseek_v2.py
...
@@ -878,11 +878,14 @@ class Indexer(nn.Module):
         )
         q_pe, k_pe = rotary_emb(positions, q_pe, k_pe.unsqueeze(1))
-        # `rotary_emb` is shape-preserving; `q_pe` is already
-        # [num_tokens, n_head, rope_dim].
+        # Note: RoPE (NeoX) can introduce extra leading dimensions during compilation
+        # so we need to reshape back to token-flattened shapes
+        q_pe = q_pe.reshape(-1, self.n_head, self.rope_dim)
+        k_pe = k_pe.reshape(-1, 1, self.rope_dim)
         q = torch.cat([q_pe, q_nope], dim=-1)
         # `k_pe` is [num_tokens, 1, rope_dim] (MQA).
-        k = torch.cat([k_pe.squeeze(1), k_nope], dim=-1)
+        k = torch.cat([k_pe.squeeze(-2), k_nope], dim=-1)
         # we only quant q here since k quant is fused with cache insertion
         q = q.view(-1, self.head_dim)
...
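
The shape issue this hunk addresses can be reproduced in isolation. The following is a minimal sketch, not the vLLM code path: the sizes, the `nope_dim` name, and the specific assumption that the extra leading dimension has size 1 are all hypothetical, chosen only to illustrate why `reshape(-1, ...)` followed by `squeeze(-2)` tolerates an extra leading dimension while the old `squeeze(1)` does not.

```python
import torch

# Hypothetical sizes, for illustration only.
num_tokens, n_head, rope_dim, nope_dim = 4, 2, 8, 16

# Assumption: the rotary embedding returned tensors with one extra
# leading dimension of size 1 instead of token-flattened shapes.
q_pe = torch.randn(1, num_tokens, n_head, rope_dim)
k_pe = torch.randn(1, num_tokens, 1, rope_dim)
q_nope = torch.randn(num_tokens, n_head, nope_dim)
k_nope = torch.randn(num_tokens, nope_dim)

# Old path: on a 4-D tensor, dim 1 is the token dimension (size 4),
# so squeeze(1) is a no-op and the concatenation sees 4-D vs 2-D inputs.
try:
    torch.cat([k_pe.squeeze(1), k_nope], dim=-1)
    old_path_failed = False
except RuntimeError:
    old_path_failed = True

# New path: flatten back to token-major shapes first, then squeeze the
# singleton head dimension by its position relative to the last dim.
q_pe_flat = q_pe.reshape(-1, n_head, rope_dim)   # [num_tokens, n_head, rope_dim]
k_pe_flat = k_pe.reshape(-1, 1, rope_dim)        # [num_tokens, 1, rope_dim]
q = torch.cat([q_pe_flat, q_nope], dim=-1)
k = torch.cat([k_pe_flat.squeeze(-2), k_nope], dim=-1)

print(old_path_failed, tuple(q.shape), tuple(k.shape))
```

When the inputs are already token-flattened, the two `reshape` calls leave the shapes unchanged, so the new path handles both cases.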