Unverified commit 08c9607c authored by Patrick von Platen, committed by GitHub

[Generation] fix docs for decoder_input_ids (#5306)

* fix docs

* Update src/transformers/modeling_utils.py

* Update src/transformers/modeling_tf_utils.py

* Update src/transformers/modeling_tf_utils.py

* Update src/transformers/modeling_utils.py

* Update src/transformers/modeling_tf_utils.py

* Update src/transformers/modeling_utils.py
parent 79a82cc0
src/transformers/modeling_tf_utils.py
@@ -642,8 +642,9 @@ class TFPreTrainedModel(tf.keras.Model, TFModelUtilsMixin):
                 `What are attention masks? <../glossary.html#attention-mask>`__
             decoder_start_token_id=None: (`optional`) int
-                If an encoder-decoder model starts decoding with a different token than BOS.
-                Defaults to `None` and is changed to `BOS` later.
+                Start token id for the decoder. Defaults to ``decoder_start_token_id`` as defined in the model's config or to the ``bos_token_id``
+                if no ``decoder_start_token_id`` is found in the config.
+                This is only relevant for encoder-decoder models.
             use_cache: (`optional`) bool
                 If `use_cache` is True, past key values are used to speed up decoding if applicable to model. Defaults to `True`.
src/transformers/modeling_utils.py
@@ -962,8 +962,9 @@ class PreTrainedModel(nn.Module, ModuleUtilsMixin):
                 `What are attention masks? <../glossary.html#attention-mask>`__
             decoder_start_token_id=None: (`optional`) int
-                If an encoder-decoder model starts decoding with a different token than BOS.
-                Defaults to `None` and is changed to `BOS` later.
+                Start token id for the decoder. Defaults to ``decoder_start_token_id`` as defined in the model's config or to the ``bos_token_id``
+                if no ``decoder_start_token_id`` is found in the config.
+                This is only relevant for encoder-decoder models.
             use_cache: (`optional`) bool
                 If `use_cache` is True, past key values are used to speed up decoding if applicable to model. Defaults to `True`.