Enable device map (#30870)

* added_no_split_modules
* added LlavaNextVisionAttention to _no_split_modules

Commit 3802e786 authored May 17, 2024 by Darshana S, committed by GitHub May 17, 2024

Showing with 1 addition and 0 deletions

src/transformers/models/video_llava/modeling_video_llava.py +1 -0

--- a/src/transformers/models/video_llava/modeling_video_llava.py
+++ b/src/transformers/models/video_llava/modeling_video_llava.py
@@ -124,6 +124,7 @@ class VideoLlavaPreTrainedModel(PreTrainedModel):
    supports_gradient_checkpointing = True
    _skip_keys_device_placement = "past_key_values"
    _supports_flash_attn_2 = True
+    _no_split_modules = ["VideoLlavaVisionAttention"]

    def _init_weights(self, module):
        # important: this ported version of VideoLlava isn't meant for training from scratch - only
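
For context on what the added `_no_split_modules` line does: it names module classes that should stay on a single device when a model is loaded with a device map. The sketch below is a toy, pure-Python illustration of that idea only. It is not the placement algorithm used by transformers or accelerate, and `Module`, `place`, the sizes, and the budgets are all hypothetical names and values invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """Hypothetical stand-in for a model submodule (not a torch.nn.Module)."""
    name: str
    cls: str
    size: int = 0                      # own size in arbitrary units; 0 for containers
    children: list = field(default_factory=list)

    def total_size(self):
        return self.size + sum(c.total_size() for c in self.children)


def place(module, no_split, budgets, device_map, cursor):
    """Greedy toy placement: fill device 0, then device 1, and so on.

    A module whose class name is in `no_split` is never broken up into
    its children; it moves to the next device as a whole instead.
    """
    size = module.total_size()
    free = budgets[cursor["device"]] - cursor["used"]
    if size <= free:
        device_map[module.name] = cursor["device"]
        cursor["used"] += size
        return
    if module.children and module.cls not in no_split:
        # Too big for the remaining space, but allowed to be split.
        for child in module.children:
            place(child, no_split, budgets, device_map, cursor)
        return
    # Atomic module (leaf or no-split class): move on to the next device.
    cursor["device"] += 1
    cursor["used"] = 0
    if cursor["device"] >= len(budgets) or size > budgets[cursor["device"]]:
        raise ValueError(f"{module.name} does not fit on any remaining device")
    device_map[module.name] = cursor["device"]
    cursor["used"] = size


def build_model():
    attn = Module("attn", "VideoLlavaVisionAttention", children=[
        Module(f"attn.{p}", "Linear", size=1)
        for p in ("q_proj", "k_proj", "v_proj", "out_proj")
    ])
    return Module("model", "Root", children=[
        Module("embed", "Embedding", size=4),
        attn,
        Module("mlp", "MLP", size=4),
    ])


budgets = [6, 8]  # hypothetical per-device capacity

# Without the no-split entry, the attention block straddles both devices.
split_map = {}
place(build_model(), [], budgets, split_map, {"device": 0, "used": 0})

# With it, the whole attention block lands on one device.
atomic_map = {}
place(build_model(), ["VideoLlavaVisionAttention"], budgets, atomic_map,
      {"device": 0, "used": 0})

print(split_map)
print(atomic_map)
```

In the first run the attention projections end up on different devices; in the second the block is kept together, which is the behavior the one-line change in this diff asks for.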