Commit 7be9ffcd (unverified) authored by Roger Wang, committed by GitHub

[Misc] Fix Qwen3-VL `video_grid_thw` typing (#25646)


Signed-off-by: Roger Wang <hey@rogerw.io>
parent 393de22d
@@ -1249,7 +1249,7 @@ class Qwen3VLForConditionalGeneration(nn.Module, SupportsMultiModal,
                                            rope_type="rope_3d")
             else:
                 video_embeds = self.visual(pixel_values_videos,
-                                           grid_thw=grid_thw)
+                                           grid_thw=grid_thw_list)
             # Split concatenated embeddings for each video item.
             # Using prod on grid_thw_list instead of grid_thw.prod avoids CUDA sync
...
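The comment in the hunk says that taking the product over `grid_thw_list` rather than calling `grid_thw.prod` avoids a CUDA sync. A minimal sketch of that idea in plain Python follows. It is an illustration under stated assumptions, not the vLLM implementation: the helper names `split_sizes` and `split_by_sizes` and the `merge_size` parameter are hypothetical, and a flat Python list stands in for the concatenated embedding tensor.

```python
import math
from typing import List, Sequence


def split_sizes(grid_thw_list: Sequence[Sequence[int]],
                merge_size: int = 1) -> List[int]:
    """Compute per-item embedding counts from host-side (t, h, w) triples.

    grid_thw_list is an ordinary Python list, so the products are plain
    integer arithmetic on the CPU and never read a value back from a GPU
    tensor. merge_size is a hypothetical spatial-merge factor (assumption).
    """
    return [math.prod(thw) // (merge_size * merge_size)
            for thw in grid_thw_list]


def split_by_sizes(flat: Sequence, sizes: Sequence[int]) -> List[list]:
    """Split a concatenated sequence into consecutive chunks of given sizes."""
    if sum(sizes) != len(flat):
        raise ValueError("sizes must sum to the length of the input")
    chunks, start = [], 0
    for size in sizes:
        chunks.append(list(flat[start:start + size]))
        start += size
    return chunks


# Two video items with (t, h, w) grids of (1, 2, 2) and (2, 2, 2).
sizes = split_sizes([[1, 2, 2], [2, 2, 2]])
per_video = split_by_sizes(list(range(12)), sizes)
```

If the sizes came from a tensor on the GPU, converting them to Python integers for the split would typically force the host to wait for the device. Starting from a list sidesteps that wait.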