OpenDAS / vllm_cscc
Commit a3a5a47e (Unverified)
Authored Jul 12, 2025 by Richard Zou; committed by GitHub, Jul 11, 2025
[Bugfix] Fix torch.compile x LoRA for PyTorch 2.8 (#20823)
Signed-off-by: rzou <zou3519@gmail.com>
Parent: fb25e956
Showing 1 changed file with 8 additions and 6 deletions

vllm/lora/layers.py (+8, -6)
@@ -240,17 +240,19 @@ class VocabParallelEmbeddingWithLoRA(BaseLayerWithLoRA):
     def forward(self, x: torch.Tensor) -> torch.Tensor:
         added_tokens_mask = torch.where(x > self.base_layer.org_vocab_size - 1,
                                         1, 0)
-        embeddings_indices = torch.narrow(
-            self.punica_wrapper._embeddings_indices, 1, 0, x.size(0))
 
-        indices = embeddings_indices[1]
+        # NB: Don't use torch.narrow here. torch.narrow triggers some
+        # Dynamic Shape specialization in torch.compile
+        num_tokens = x.shape[0]
+        indices_1 = self.punica_wrapper._embeddings_indices[1][:num_tokens]
+        indices_0 = self.punica_wrapper._embeddings_indices[0][:num_tokens]
+
         full_lora_a_embeddings = F.embedding(
-            x + indices,
+            x + indices_1,
             self.lora_a_stacked_2d,
         )
-        indices = embeddings_indices[0]
         full_output = self.base_layer.forward(x +
-                                              (indices * added_tokens_mask))
+                                              (indices_0 * added_tokens_mask))
 
         full_output_org = full_output
         if full_output.ndim == 3:
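The change swaps `torch.narrow` for plain row indexing followed by a slice. The sketch below is a minimal eager-mode check that the two formulations select the same values. It assumes a small stand-in tensor in place of the real `punica_wrapper._embeddings_indices` buffer; the 2 x 8 shape, the contents, and the token ids are hypothetical. It does not run `torch.compile`, so it says nothing about the dynamic-shape specialization that the code comment describes.

```python
import torch

# Hypothetical stand-in for punica_wrapper._embeddings_indices:
# a 2 x max_tokens integer buffer (shape and contents assumed for illustration).
embeddings_indices = torch.arange(2 * 8).reshape(2, 8)

# Five made-up token ids; only the length of x matters here.
x = torch.tensor([3, 1, 4, 1, 5])
num_tokens = x.shape[0]

# Old formulation: narrow dim 1 to the first num_tokens columns, then pick a row.
narrowed = torch.narrow(embeddings_indices, 1, 0, x.size(0))
old_indices_1 = narrowed[1]
old_indices_0 = narrowed[0]

# New formulation from the diff: pick the row first, then slice it.
indices_1 = embeddings_indices[1][:num_tokens]
indices_0 = embeddings_indices[0][:num_tokens]

# Both formulations should yield identical tensors in eager mode.
same_1 = torch.equal(old_indices_1, indices_1)
same_0 = torch.equal(old_indices_0, indices_0)
print(same_1, same_0)
```

Because both versions return views of the first `num_tokens` entries of each row, the rest of `forward` (the `F.embedding` lookup and the masked add) receives the same index values either way; only the way the slice is expressed to the compiler differs.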