"doc/git@developer.sourcefind.cn:wangsen/paddle_dbnet.git" did not exist on "2783a688fafb743e1d7e5e0d71a1030aa0148285"
Unverified commit 7fd902d3, authored by Younes Belkada, committed by GitHub

[`BLIP`] fix docstring for `BlipTextxxx` (#21224)

* fix `blip` docstring

* fix typo

* fix another typo
parent d54d7598
@@ -679,22 +679,19 @@ class BlipTextModel(BlipTextPreTrainedModel):
         is_decoder=False,
     ):
         r"""
-        encoder_hidden_states (:
-            obj:*torch.FloatTensor* of shape `(batch_size, sequence_length, hidden_size)`, *optional*): Sequence of
-            hidden-states at the output of the last layer of the encoder. Used in the cross-attention if the model is
-            configured as a decoder.
-        encoder_attention_mask (`torch.FloatTensor` of shape `(batch_size, sequence_length)`, *optional*):
+        encoder_hidden_states (`torch.FloatTensor`, *optional*):
+            Sequence of hidden-states at the output of the last layer of the encoder. Used in the cross-attention if
+            the model is configured as a decoder.
+        encoder_attention_mask (`torch.FloatTensor`, *optional*):
             Mask to avoid performing attention on the padding token indices of the encoder input. This mask is used in
             the cross-attention if the model is configured as a decoder. Mask values selected in `[0, 1]`:
             - 1 for tokens that are **not masked**,
             - 0 for tokens that are **masked**.
-        past_key_values (:
-            obj:*tuple(tuple(torch.FloatTensor))* of length `config.n_layers` with each tuple having 4 tensors of shape
-            `(batch_size, num_heads, sequence_length - 1, embed_size_per_head)`): Contains precomputed key and value
-            hidden states of the attention blocks. Can be used to speed up decoding. If `past_key_values` are used, the
-            user can optionally input only the last `decoder_input_ids` (those that don't have their past key value
-            states given to this model) of shape `(batch_size, 1)` instead of all `decoder_input_ids` of shape
-            `(batch_size, sequence_length)`.
+        past_key_values (`tuple(tuple(torch.FloatTensor))`, *optional*):
+            Contains precomputed key and value hidden states of the attention blocks. Can be used to speed up decoding.
+            If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
+            don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
+            `decoder_input_ids` of shape `(batch_size, sequence_length)`.
         use_cache (`bool`, *optional*):
             If set to `True`, `past_key_values` key value states are returned and can be used to speed up decoding (see
             `past_key_values`).
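
For context on what this docstring describes (not part of the commit itself): `encoder_hidden_states` and `encoder_attention_mask` are how BLIP feeds image features into the text model's cross-attention. Below is a minimal, hedged sketch assuming the public `Salesforce/blip-image-captioning-base` checkpoint and the `vision_model` / `text_decoder` submodule names of `BlipForConditionalGeneration`; treat it as an illustration, not as code from this change.

```python
# Hedged sketch: how image features become `encoder_hidden_states` for the BLIP
# text model. Assumes the Salesforce/blip-image-captioning-base checkpoint and
# the `vision_model` / `text_decoder` submodules of BlipForConditionalGeneration.
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

image = Image.new("RGB", (384, 384))  # stand-in for a real RGB image
inputs = processor(images=image, text="a picture of", return_tensors="pt")

with torch.no_grad():
    # The vision encoder's last hidden state is passed to the text model as
    # `encoder_hidden_states`; its mask uses 1 = attend, 0 = masked, as documented.
    image_embeds = model.vision_model(pixel_values=inputs.pixel_values).last_hidden_state
    image_atts = torch.ones(image_embeds.shape[:-1], dtype=torch.long)

    outputs = model.text_decoder(
        input_ids=inputs.input_ids,
        attention_mask=inputs.attention_mask,
        encoder_hidden_states=image_embeds,
        encoder_attention_mask=image_atts,
    )

print(outputs.logits.shape)  # (batch_size, sequence_length, vocab_size)
```
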
@@ -841,32 +838,26 @@ class BlipTextLMHeadModel(BlipTextPreTrainedModel):
         reduction="mean",
     ):
         r"""
-        encoder_hidden_states (:
-            obj:*torch.FloatTensor* of shape `(batch_size, sequence_length, hidden_size)`, *optional*): Sequence of
+        encoder_hidden_states (`torch.FloatTensor`, *optional*): Sequence of
             hidden-states at the output of the last layer of the encoder. Used in the cross-attention if the model is
             configured as a decoder.
-        encoder_attention_mask (`torch.FloatTensor` of shape `(batch_size, sequence_length)`, *optional*):
+        encoder_attention_mask (`torch.FloatTensor`, *optional*):
             Mask to avoid performing attention on the padding token indices of the encoder input. This mask is used in
             the cross-attention if the model is configured as a decoder. Mask values selected in `[0, 1]`:
             - 1 for tokens that are **not masked**,
             - 0 for tokens that are **masked**.
-        labels (`torch.LongTensor` of shape `(batch_size, sequence_length)`, *optional*):
+        labels (`torch.LongTensor`, *optional*):
             Labels for computing the left-to-right language modeling loss (next word prediction). Indices should be in
             `[-100, 0, ..., config.vocab_size]` (see `input_ids` docstring) Tokens with indices set to `-100` are
             ignored (masked), the loss is only computed for the tokens with labels n `[0, ..., config.vocab_size]`
-        past_key_values (:
-            obj:*tuple(tuple(torch.FloatTensor))* of length `config.n_layers` with each tuple having 4 tensors of shape
-            `(batch_size, num_heads, sequence_length - 1, embed_size_per_head)`): Contains precomputed key and value
-            hidden states of the attention blocks. Can be used to speed up decoding. If `past_key_values` are used, the
-            user can optionally input only the last `decoder_input_ids` (those that don't have their past key value
-            states given to this model) of shape `(batch_size, 1)` instead of all `decoder_input_ids` of shape
-            `(batch_size, sequence_length)`.
+        past_key_values (`tuple(tuple(torch.FloatTensor))`, *optional*):
+            Contains precomputed key and value hidden states of the attention blocks. Can be used to speed up decoding.
+            If `past_key_values` are used, the user can optionally input only the last `decoder_input_ids` (those that
+            don't have their past key value states given to this model) of shape `(batch_size, 1)` instead of all
+            `decoder_input_ids` of shape `(batch_size, sequence_length)`.
         use_cache (`bool`, *optional*):
             If set to `True`, `past_key_values` key value states are returned and can be used to speed up decoding (see
             `past_key_values`).
-        Returns:
-        Example:
         """
         return_dict = return_dict if return_dict is not None else self.config.use_return_dict
         if labels is not None:
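
The `past_key_values` / `use_cache` behavior documented above is what cached decoding relies on: with `use_cache=True`, each generation step feeds only the newest token while reusing the cached key/value states. A minimal, hedged sketch with the same assumed checkpoint:

```python
# Hedged sketch: cached decoding with the public BLIP captioning API. `use_cache`
# is typically the default; it is passed explicitly here only to mirror the docstring.
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

inputs = processor(images=Image.new("RGB", (384, 384)), return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_length=20, use_cache=True)

print(processor.decode(out[0], skip_special_tokens=True))
```
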