OpenDAS / vllm_cscc
Commit 467886a0 (Unverified)
Authored Mar 03, 2026 by Woosuk Kwon
Committed by GitHub, Mar 03, 2026

[Model Runner V2] Fix inputs_embeds=None bug for MM models (#35917)

Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
Parent: a9b8b13e

Changes: 1 changed file with 3 additions and 1 deletion (+3 -1)

vllm/v1/worker/gpu/model_runner.py @ 467886a0
@@ -907,9 +907,11 @@ class GPUModelRunner(LoRAModelRunnerMixin):
         )
         inputs_embeds = None
-        if self.supports_mm_inputs and self.is_first_pp_rank and not dummy_run:
+        if self.supports_mm_inputs and self.is_first_pp_rank:
             # Run MM encoder (if needed) and get multimodal embeddings.
             # Only first PP rank prepares multimodal embeddings.
+            # NOTE(woosuk): We must call get_mm_embeddings even during dummy runs
+            # to obtain inputs_embeds, because the compiled model expects this input.
             inputs_embeds = self.model_state.get_mm_embeddings(
                 scheduler_output.scheduled_encoder_inputs,
                 input_batch,
...
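
The effect of dropping `not dummy_run` from the guard can be sketched in isolation. This is a minimal illustration, not vLLM code: it assumes the compiled model was captured with `inputs_embeds` as a real input and therefore rejects `None`. `ToyCompiledModel`, `fake_get_mm_embeddings`, `prepare_inputs_embeds`, and the `skip_on_dummy_run` flag are all hypothetical names introduced here.

```python
from typing import List, Optional

Embeds = List[List[float]]


class ToyCompiledModel:
    """Hypothetical stand-in for a model compiled with inputs_embeds as an input."""

    def __call__(self, inputs_embeds: Optional[Embeds]) -> int:
        if inputs_embeds is None:
            # Assumption: the compiled graph expects this input to be present,
            # so a missing value is treated as an error.
            raise TypeError("compiled model expects inputs_embeds, got None")
        return len(inputs_embeds)


def fake_get_mm_embeddings(num_tokens: int, hidden_size: int = 4) -> Embeds:
    """Hypothetical placeholder for model_state.get_mm_embeddings(...)."""
    return [[0.0] * hidden_size for _ in range(num_tokens)]


def prepare_inputs_embeds(
    supports_mm_inputs: bool,
    is_first_pp_rank: bool,
    dummy_run: bool,
    num_tokens: int,
    skip_on_dummy_run: bool,
) -> Optional[Embeds]:
    """Mirror the guard in the diff; skip_on_dummy_run=True is the old behavior."""
    inputs_embeds = None
    should_embed = supports_mm_inputs and is_first_pp_rank
    if skip_on_dummy_run:
        # Old guard: dummy runs never produced embeddings.
        should_embed = should_embed and not dummy_run
    if should_embed:
        inputs_embeds = fake_get_mm_embeddings(num_tokens)
    return inputs_embeds
```

Under these assumptions, a dummy run with the old guard yields `None` and the toy model raises, while the same dummy run with the new guard produces embeddings and the call succeeds. Ranks other than the first pipeline-parallel rank still get `None` in both versions.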