chenpangpang / transformers · Commits · 896d7be9

Unverified commit 896d7be9, authored Apr 13, 2021 by Suraj Patil, committed via GitHub on Apr 13, 2021.

    fix docstrings (#11221)

parent 823df939
Showing 12 changed files with 38 additions and 54 deletions (+38 -54):

  src/transformers/models/blenderbot/modeling_blenderbot.py              +0   -4
  src/transformers/models/blenderbot_small/modeling_blenderbot_small.py  +0   -4
  src/transformers/models/fsmt/modeling_fsmt.py                          +12  -5
  src/transformers/models/m2m_100/modeling_m2m_100.py                    +11  -6
  src/transformers/models/marian/modeling_marian.py                      +0   -4
  src/transformers/models/mbart/modeling_mbart.py                        +0   -7
  src/transformers/models/pegasus/modeling_pegasus.py                    +0   -4
  src/transformers/models/pegasus/modeling_tf_pegasus.py                 +0   -4
  src/transformers/models/prophetnet/modeling_prophetnet.py              +1   -5
  src/transformers/models/speech_to_text/modeling_speech_to_text.py      +11  -6
  src/transformers/models/t5/modeling_t5.py                              +2   -3
  src/transformers/models/t5/modeling_tf_t5.py                           +1   -2
src/transformers/models/blenderbot/modeling_blenderbot.py

@@ -554,10 +554,6 @@ BLENDERBOT_INPUTS_DOCSTRING = r"""
  decoder_attention_mask (:obj:`torch.LongTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
      Default behavior: generate a tensor that ignores pad tokens in :obj:`decoder_input_ids`. Causal mask will
      also be used by default.
-     If you want to change padding behavior, you should read :func:`modeling_blenderbot._prepare_decoder_inputs`
-     and modify to your needs. See diagram 1 in `the paper <https://arxiv.org/abs/1910.13461>`__ for more
-     information on the default strategy.
  head_mask (:obj:`torch.Tensor` of shape :obj:`(num_layers, num_heads)`, `optional`):
      Mask to nullify selected heads of the attention modules in the encoder. Mask values selected in ``[0, 1]``:
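
The retained description ("ignore pad tokens in ``decoder_input_ids``, apply a causal mask") can be pictured with a minimal PyTorch sketch. The helper name, the pad id of 0, and the toy token ids below are assumptions for illustration only; this is not the library's internal implementation.

```python
import torch

def default_decoder_mask(decoder_input_ids: torch.LongTensor, pad_token_id: int) -> torch.Tensor:
    """Hypothetical helper: combine a pad-ignoring mask with a causal mask."""
    tgt_len = decoder_input_ids.size(1)
    padding_mask = (decoder_input_ids != pad_token_id).long()                  # (batch, tgt_len)
    causal_mask = torch.tril(torch.ones(tgt_len, tgt_len, dtype=torch.long))   # lower-triangular
    # attention is allowed only where the target position is visible (causal) and not padding
    return causal_mask.unsqueeze(0) & padding_mask.unsqueeze(1)                # (batch, tgt_len, tgt_len)

decoder_input_ids = torch.tensor([[2, 45, 17, 0, 0]])  # 0 stands in for the pad token id
print(default_decoder_mask(decoder_input_ids, pad_token_id=0))
```
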
src/transformers/models/blenderbot_small/modeling_blenderbot_small.py

@@ -555,10 +555,6 @@ BLENDERBOT_SMALL_INPUTS_DOCSTRING = r"""
  decoder_attention_mask (:obj:`torch.LongTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
      Default behavior: generate a tensor that ignores pad tokens in :obj:`decoder_input_ids`. Causal mask will
      also be used by default.
-     If you want to change padding behavior, you should read
-     :func:`modeling_blenderbot_small._prepare_decoder_inputs` and modify to your needs. See diagram 1 in `the
-     paper <https://arxiv.org/abs/1910.13461>`__ for more information on the default strategy.
  head_mask (:obj:`torch.Tensor` of shape :obj:`(num_layers, num_heads)`, `optional`):
      Mask to nullify selected heads of the attention modules in the encoder. Mask values selected in ``[0, 1]``:
src/transformers/models/fsmt/modeling_fsmt.py

@@ -234,13 +234,20 @@ FSMT_INPUTS_DOCSTRING = r"""
      `What are attention masks? <../glossary.html#attention-mask>`__
  decoder_input_ids (:obj:`torch.LongTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
-     Provide for translation and summarization training. By default, the model will create this tensor by
-     shifting the input_ids right, following the paper.
+     Indices of decoder input sequence tokens in the vocabulary.
+     Indices can be obtained using :class:`~transformers.FSMTTokenizer`. See
+     :meth:`transformers.PreTrainedTokenizer.encode` and :meth:`transformers.PreTrainedTokenizer.__call__` for
+     details.
+     `What are input IDs? <../glossary.html#input-ids>`__
+     FSMT uses the :obj:`eos_token_id` as the starting token for :obj:`decoder_input_ids` generation. If
+     :obj:`past_key_values` is used, optionally only the last :obj:`decoder_input_ids` have to be input (see
+     :obj:`past_key_values`).
  decoder_attention_mask (:obj:`torch.BoolTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
      Default behavior: generate a tensor that ignores pad tokens in :obj:`decoder_input_ids`. Causal mask will
-     also be used by default. If you want to change padding behavior, you should read
-     :func:`modeling_fstm._prepare_fstm_decoder_inputs` and modify. See diagram 1 in the paper for more info on
-     the default strategy
+     also be used by default.
  head_mask (:obj:`torch.Tensor` of shape :obj:`(num_layers, num_heads)`, `optional`):
      Mask to nullify selected heads of the attention modules in the encoder. Mask values selected in ``[0, 1]``:
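
The rewritten FSMT entry says that ``decoder_input_ids`` are ordinary token indices obtained from ``FSMTTokenizer`` and that decoding starts from ``eos_token_id``. A hedged sketch of that usage; the checkpoint name is an assumption chosen only for illustration.

```python
import torch
from transformers import FSMTForConditionalGeneration, FSMTTokenizer

checkpoint = "facebook/wmt19-en-de"  # illustrative FSMT checkpoint
tokenizer = FSMTTokenizer.from_pretrained(checkpoint)
model = FSMTForConditionalGeneration.from_pretrained(checkpoint)

inputs = tokenizer("Machine learning is great.", return_tensors="pt")

# Per the docstring, FSMT starts `decoder_input_ids` generation from the EOS token.
decoder_input_ids = torch.tensor([[tokenizer.eos_token_id]])

outputs = model(input_ids=inputs.input_ids, decoder_input_ids=decoder_input_ids)
print(outputs.logits.shape)  # (1, 1, vocab_size)
```
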
src/transformers/models/m2m_100/modeling_m2m_100.py

@@ -589,15 +589,20 @@ M2M_100_INPUTS_DOCSTRING = r"""
      `What are attention masks? <../glossary.html#attention-mask>`__
  decoder_input_ids (:obj:`torch.LongTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
-     Provide for translation and summarization training. By default, the model will create this tensor by
-     shifting the :obj:`input_ids` to the right, following the paper.
+     Indices of decoder input sequence tokens in the vocabulary.
+     Indices can be obtained using :class:`~transformers.M2M100Tokenizer`. See
+     :meth:`transformers.PreTrainedTokenizer.encode` and :meth:`transformers.PreTrainedTokenizer.__call__` for
+     details.
+     `What are input IDs? <../glossary.html#input-ids>`__
+     M2M100 uses the :obj:`eos_token_id` as the starting token for :obj:`decoder_input_ids` generation. If
+     :obj:`past_key_values` is used, optionally only the last :obj:`decoder_input_ids` have to be input (see
+     :obj:`past_key_values`).
  decoder_attention_mask (:obj:`torch.LongTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
      Default behavior: generate a tensor that ignores pad tokens in :obj:`decoder_input_ids`. Causal mask will
      also be used by default.
-     If you want to change padding behavior, you should read :func:`modeling_m2m_100._prepare_decoder_inputs`
-     and modify to your needs. See diagram 1 in `the paper <https://arxiv.org/abs/1910.13461>`__ for more
-     information on the default strategy.
  encoder_outputs (:obj:`tuple(tuple(torch.FloatTensor)`, `optional`):
      Tuple consists of (:obj:`last_hidden_state`, `optional`: :obj:`hidden_states`, `optional`:
      :obj:`attentions`) :obj:`last_hidden_state` of shape :obj:`(batch_size, sequence_length, hidden_size)`,
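
The removed wording claimed the model builds this tensor by shifting the ``input_ids`` right; in practice, when only ``labels`` are passed, seq2seq models in the library build ``decoder_input_ids`` from the labels by shifting them right and prepending the decoder start token (the EOS token for M2M100, per the docstring above). A small illustrative shift-right sketch; the token ids are made up and this is not the library's internal helper.

```python
import torch

def shift_right(labels: torch.LongTensor, decoder_start_token_id: int) -> torch.LongTensor:
    """Toy version of the shift-right step that turns labels into decoder_input_ids."""
    start = torch.full((labels.size(0), 1), decoder_start_token_id, dtype=labels.dtype)
    return torch.cat([start, labels[:, :-1]], dim=1)

labels = torch.tensor([[128022, 117, 5, 2]])          # made-up target ids ending in </s> (id 2)
print(shift_right(labels, decoder_start_token_id=2))  # tensor([[2, 128022, 117, 5]])
```
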
src/transformers/models/marian/modeling_marian.py

@@ -567,10 +567,6 @@ MARIAN_INPUTS_DOCSTRING = r"""
  decoder_attention_mask (:obj:`torch.LongTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
      Default behavior: generate a tensor that ignores pad tokens in :obj:`decoder_input_ids`. Causal mask will
      also be used by default.
-     If you want to change padding behavior, you should read :func:`modeling_marian._prepare_decoder_inputs` and
-     modify to your needs. See diagram 1 in `the paper <https://arxiv.org/abs/1910.13461>`__ for more
-     information on the default strategy.
  head_mask (:obj:`torch.Tensor` of shape :obj:`(num_layers, num_heads)`, `optional`):
      Mask to nullify selected heads of the attention modules in the encoder. Mask values selected in ``[0, 1]``:
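
The unchanged ``head_mask`` lines describe a ``(num_layers, num_heads)`` tensor of 0/1 values that nullifies selected encoder attention heads. A hedged sketch of passing one; the checkpoint name and the choice of head to disable are assumptions, and the ``decoder_input_ids`` below are only a stand-in to get output shapes.

```python
import torch
from transformers import MarianMTModel, MarianTokenizer

checkpoint = "Helsinki-NLP/opus-mt-en-de"  # illustrative Marian checkpoint
tokenizer = MarianTokenizer.from_pretrained(checkpoint)
model = MarianMTModel.from_pretrained(checkpoint)

inputs = tokenizer("Hello, how are you?", return_tensors="pt")

# 1 keeps a head, 0 nullifies it: here the first head of the first encoder layer is disabled.
head_mask = torch.ones(model.config.encoder_layers, model.config.encoder_attention_heads)
head_mask[0, 0] = 0.0

outputs = model(**inputs, decoder_input_ids=inputs.input_ids, head_mask=head_mask)
print(outputs.logits.shape)
```
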
src/transformers/models/mbart/modeling_mbart.py

@@ -575,9 +575,6 @@ MBART_INPUTS_DOCSTRING = r"""
      - 0 for tokens that are **masked**.
      `What are attention masks? <../glossary.html#attention-mask>`__
- decoder_input_ids (:obj:`torch.LongTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
-     Provide for translation and summarization training. By default, the model will create this tensor by
-     shifting the :obj:`input_ids` to the right, following the paper.
  decoder_input_ids (:obj:`torch.LongTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
      Indices of decoder input sequence tokens in the vocabulary.

@@ -598,10 +595,6 @@ MBART_INPUTS_DOCSTRING = r"""
  decoder_attention_mask (:obj:`torch.LongTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
      Default behavior: generate a tensor that ignores pad tokens in :obj:`decoder_input_ids`. Causal mask will
      also be used by default.
-     If you want to change padding behavior, you should read :func:`modeling_mbart._prepare_decoder_inputs` and
-     modify to your needs. See diagram 1 in `the paper <https://arxiv.org/abs/1910.13461>`__ for more
-     information on the default strategy.
  head_mask (:obj:`torch.Tensor` of shape :obj:`(num_layers, num_heads)`, `optional`):
      Mask to nullify selected heads of the attention modules in the encoder. Mask values selected in ``[0, 1]``:
src/transformers/models/pegasus/modeling_pegasus.py

@@ -566,10 +566,6 @@ PEGASUS_INPUTS_DOCSTRING = r"""
  decoder_attention_mask (:obj:`torch.LongTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
      Default behavior: generate a tensor that ignores pad tokens in :obj:`decoder_input_ids`. Causal mask will
      also be used by default.
-     If you want to change padding behavior, you should read :func:`modeling_pegasus._prepare_decoder_inputs`
-     and modify to your needs. See diagram 1 in `the paper <https://arxiv.org/abs/1910.13461>`__ for more
-     information on the default strategy.
  head_mask (:obj:`torch.Tensor` of shape :obj:`(num_layers, num_heads)`, `optional`):
      Mask to nullify selected heads of the attention modules in the encoder. Mask values selected in ``[0, 1]``:
src/transformers/models/pegasus/modeling_tf_pegasus.py

@@ -597,10 +597,6 @@ PEGASUS_INPUTS_DOCSTRING = r"""
      Pegasus uses the :obj:`pad_token_id` as the starting token for :obj:`decoder_input_ids` generation. If
      :obj:`past_key_values` is used, optionally only the last :obj:`decoder_input_ids` have to be input (see
      :obj:`past_key_values`).
-     For translation and summarization training, :obj:`decoder_input_ids` should be provided. If no
-     :obj:`decoder_input_ids` is provided, the model will create this tensor by shifting the :obj:`input_ids` to
-     the right for denoising pre-training following the paper.
  decoder_attention_mask (:obj:`tf.Tensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
      will be made by default and ignore pad tokens. It is not recommended to set this for most use cases.
  head_mask (:obj:`tf.Tensor` of shape :obj:`(encoder_layers, encoder_attention_heads)`, `optional`):
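
The lines kept above state that Pegasus uses ``pad_token_id`` as the starting token for ``decoder_input_ids`` generation. A hedged TensorFlow sketch of that convention; the checkpoint name is an assumption (pass ``from_pt=True`` to ``from_pretrained`` if only PyTorch weights are published for it).

```python
import tensorflow as tf
from transformers import PegasusTokenizer, TFPegasusForConditionalGeneration

checkpoint = "google/pegasus-xsum"  # illustrative Pegasus checkpoint
tokenizer = PegasusTokenizer.from_pretrained(checkpoint)
model = TFPegasusForConditionalGeneration.from_pretrained(checkpoint)

inputs = tokenizer("PG&E scheduled the blackouts in response to forecasts of high winds.",
                   return_tensors="tf")

# Pegasus decoding starts from the pad token, as documented above.
decoder_input_ids = tf.constant([[model.config.pad_token_id]])

outputs = model(input_ids=inputs.input_ids, decoder_input_ids=decoder_input_ids)
print(outputs.logits.shape)  # (1, 1, vocab_size)
```
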
src/transformers/models/prophetnet/modeling_prophetnet.py

@@ -91,7 +91,7 @@ PROPHETNET_INPUTS_DOCSTRING = r"""
  decoder_input_ids (:obj:`torch.LongTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
      Indices of decoder input sequence tokens in the vocabulary.
-     Indices can be obtained using :class:`~transformers.PreTrainedTokenizer`. See
+     Indices can be obtained using :class:`~transformers.ProphetNetTokenizer`. See
      :meth:`transformers.PreTrainedTokenizer.encode` and :meth:`transformers.PreTrainedTokenizer.__call__` for
      details.

@@ -104,10 +104,6 @@ PROPHETNET_INPUTS_DOCSTRING = r"""
  decoder_attention_mask (:obj:`torch.BoolTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
      Default behavior: generate a tensor that ignores pad tokens in :obj:`decoder_input_ids`. Causal mask will
      also be used by default.
-     If you want to change padding behavior, you should read :func:`modeling_bart._prepare_decoder_inputs` and
-     modify to your needs. See diagram 1 in `the paper <https://arxiv.org/abs/1910.13461>`__ for more
-     information on the default strategy.
  encoder_outputs (:obj:`tuple(tuple(torch.FloatTensor)`, `optional`):
      Tuple consists of (:obj:`last_hidden_state`, `optional`: :obj:`hidden_states`, `optional`:
      :obj:`attentions`) :obj:`last_hidden_state` of shape :obj:`(batch_size, sequence_length, hidden_size)`,
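
The first hunk only corrects the class reference from the generic ``PreTrainedTokenizer`` to ``ProphetNetTokenizer``. A hedged sketch of obtaining decoder token indices that way; the checkpoint name and the example sentences are assumptions for illustration.

```python
from transformers import ProphetNetForConditionalGeneration, ProphetNetTokenizer

checkpoint = "microsoft/prophetnet-large-uncased"  # illustrative ProphetNet checkpoint
tokenizer = ProphetNetTokenizer.from_pretrained(checkpoint)
model = ProphetNetForConditionalGeneration.from_pretrained(checkpoint)

inputs = tokenizer("transformers keeps its docstrings accurate", return_tensors="pt")
decoder_input_ids = tokenizer("docstrings stay accurate", return_tensors="pt").input_ids

outputs = model(input_ids=inputs.input_ids,
                attention_mask=inputs.attention_mask,
                decoder_input_ids=decoder_input_ids)
print(outputs.logits.shape)  # (batch_size, target_sequence_length, vocab_size)
```
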
src/transformers/models/speech_to_text/modeling_speech_to_text.py

@@ -610,15 +610,20 @@ SPEECH_TO_TEXT_INPUTS_DOCSTRING = r"""
      `What are attention masks? <../glossary.html#attention-mask>`__
  decoder_input_ids (:obj:`torch.LongTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
-     Provide for translation and summarization training. By default, the model will create this tensor by
-     shifting the :obj:`input_ids` to the right, following the paper.
+     Indices of decoder input sequence tokens in the vocabulary.
+     Indices can be obtained using :class:`~transformers.SpeechToTextTokenizer`. See
+     :meth:`transformers.PreTrainedTokenizer.encode` and :meth:`transformers.PreTrainedTokenizer.__call__` for
+     details.
+     `What are input IDs? <../glossary.html#input-ids>`__
+     SpeechToText uses the :obj:`eos_token_id` as the starting token for :obj:`decoder_input_ids` generation. If
+     :obj:`past_key_values` is used, optionally only the last :obj:`decoder_input_ids` have to be input (see
+     :obj:`past_key_values`).
  decoder_attention_mask (:obj:`torch.LongTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
      Default behavior: generate a tensor that ignores pad tokens in :obj:`decoder_input_ids`. Causal mask will
      also be used by default.
-     If you want to change padding behavior, you should read
-     :func:`modeling_speech_to_text._prepare_decoder_inputs` and modify to your needs. See diagram 1 in `the
-     paper <https://arxiv.org/abs/1910.13461>`__ for more information on the default strategy.
  head_mask (:obj:`torch.Tensor` of shape :obj:`(num_layers, num_heads)`, `optional`):
      Mask to nullify selected heads of the attention modules in the encoder. Mask values selected in ``[0, 1]``:
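
For Speech2Text the encoder consumes log-mel features rather than token ids, while ``decoder_input_ids`` still start from the configured decoder start token (the EOS token, per the docstring above). A hedged sketch; the checkpoint name is an assumption and the random tensor merely stands in for real features produced by ``Speech2TextFeatureExtractor``.

```python
import torch
from transformers import Speech2TextForConditionalGeneration

checkpoint = "facebook/s2t-small-librispeech-asr"  # illustrative Speech2Text checkpoint
model = Speech2TextForConditionalGeneration.from_pretrained(checkpoint)

# (batch, frames, feature_size): random values standing in for real 80-dim log-mel features
input_features = torch.randn(1, 584, 80)
decoder_input_ids = torch.tensor([[model.config.decoder_start_token_id]])

outputs = model(input_features=input_features, decoder_input_ids=decoder_input_ids)
print(outputs.logits.shape)  # (1, 1, vocab_size)
```
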
src/transformers/models/t5/modeling_t5.py

@@ -1057,7 +1057,7 @@ T5_INPUTS_DOCSTRING = r"""
  decoder_input_ids (:obj:`torch.LongTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
      Indices of decoder input sequence tokens in the vocabulary.
-     Indices can be obtained using :class:`~transformers.BartTokenizer`. See
+     Indices can be obtained using :class:`~transformers.T5Tokenizer`. See
      :meth:`transformers.PreTrainedTokenizer.encode` and :meth:`transformers.PreTrainedTokenizer.__call__` for
      details.

@@ -1068,8 +1068,7 @@ T5_INPUTS_DOCSTRING = r"""
      :obj:`past_key_values`).
      To know more on how to prepare :obj:`decoder_input_ids` for pretraining take a look at `T5 Training
-     <./t5.html#training>`__. If :obj:`decoder_input_ids` and :obj:`decoder_inputs_embeds` are both unset,
-     :obj:`decoder_input_ids` takes the value of :obj:`input_ids`.
+     <./t5.html#training>`__.
  decoder_attention_mask (:obj:`torch.BoolTensor` of shape :obj:`(batch_size, target_sequence_length)`, `optional`):
      Default behavior: generate a tensor that ignores pad tokens in :obj:`decoder_input_ids`. Causal mask will
      also be used by default.
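
With the dropped sentence gone, the `T5 Training` section the docstring links to remains the reference for preparing decoder inputs: passing only ``labels`` lets T5 build ``decoder_input_ids`` itself by shifting the labels right. A small sketch using the public ``t5-small`` checkpoint.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

input_ids = tokenizer("translate English to German: The house is wonderful.",
                      return_tensors="pt").input_ids
labels = tokenizer("Das Haus ist wunderbar.", return_tensors="pt").input_ids

# decoder_input_ids are derived internally from `labels` (shifted right, pad token prepended)
outputs = model(input_ids=input_ids, labels=labels)
print(outputs.loss, outputs.logits.shape)
```

The same preparation applies to the TensorFlow version documented in the next file.
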
src/transformers/models/t5/modeling_tf_t5.py

@@ -954,8 +954,7 @@ T5_INPUTS_DOCSTRING = r"""
      :obj:`decoder_input_ids` have to be input (see :obj:`past_key_values`).
      To know more on how to prepare :obj:`decoder_input_ids` for pretraining take a look at `T5 Training
-     <./t5.html#training>`__. If :obj:`decoder_input_ids` and :obj:`decoder_inputs_embeds` are both unset,
-     :obj:`decoder_input_ids` takes the value of :obj:`input_ids`.
+     <./t5.html#training>`__.
  attention_mask (:obj:`tf.Tensor` of shape :obj:`(batch_size, sequence_length)`, `optional`):
      Mask to avoid performing attention on padding token indices. Mask values selected in ``[0, 1]``: