Unverified Commit d406a272 authored by Stas Bekman, committed by GitHub

[docs] fix xref to `PreTrainedModel.generate` (#11049)

* fix xref to generate

* do the same for search methods

* style

* style
parent 123b597f
@@ -13,19 +13,21 @@
 Utilities for Generation
 -----------------------------------------------------------------------------------------------------------------------

-This page lists all the utility functions used by :meth:`~transformers.PreTrainedModel.generate`,
-:meth:`~transformers.PreTrainedModel.greedy_search`, :meth:`~transformers.PreTrainedModel.sample`,
-:meth:`~transformers.PreTrainedModel.beam_search`, :meth:`~transformers.PreTrainedModel.beam_sample`, and
-:meth:`~transformers.PreTrainedModel.group_beam_search`.
+This page lists all the utility functions used by :meth:`~transformers.generation_utils.GenerationMixin.generate`,
+:meth:`~transformers.generation_utils.GenerationMixin.greedy_search`,
+:meth:`~transformers.generation_utils.GenerationMixin.sample`,
+:meth:`~transformers.generation_utils.GenerationMixin.beam_search`,
+:meth:`~transformers.generation_utils.GenerationMixin.beam_sample`, and
+:meth:`~transformers.generation_utils.GenerationMixin.group_beam_search`.

 Most of those are only useful if you are studying the code of the generate methods in the library.

 Generate Outputs
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-The output of :meth:`~transformers.PreTrainedModel.generate` is an instance of a subclass of
+The output of :meth:`~transformers.generation_utils.GenerationMixin.generate` is an instance of a subclass of
 :class:`~transformers.file_utils.ModelOutput`. This output is a data structure containing all the information returned
-by :meth:`~transformers.PreTrainedModel.generate`, but that can also be used as tuple or dictionary.
+by :meth:`~transformers.generation_utils.GenerationMixin.generate`, but that can also be used as a tuple or a dictionary.

 Here's an example:
...
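The example itself is collapsed in this view. As an illustration only (the GPT-2 checkpoint, flags, and variable names below are assumptions, not content of this commit), a minimal sketch of consuming such a generate output:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Hello, my dog is cute and ", return_tensors="pt")

# Ask generate() for a ModelOutput subclass rather than a bare tensor.
generation_output = model.generate(
    **inputs, return_dict_in_generate=True, output_scores=True
)

# The same object supports attribute, dictionary, and tuple access.
sequences = generation_output.sequences   # attribute access
scores = generation_output["scores"]      # dictionary access
as_tuple = generation_output.to_tuple()   # plain tuple
```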
@@ -61,7 +61,7 @@ Implementation Notes

 - Model predictions are intended to be identical to the original implementation when
   :obj:`force_bos_token_to_be_generated=True`. This only works, however, if the string you pass to
   :func:`fairseq.encode` starts with a space.
-- :meth:`~transformers.BartForConditionalGeneration.generate` should be used for conditional generation tasks like
+- :meth:`~transformers.generation_utils.GenerationMixin.generate` should be used for conditional generation tasks like
   summarization; see the example in that method's docstring.
 - Models that load the `facebook/bart-large-cnn` weights will not have a :obj:`mask_token_id`, or be able to perform
   mask-filling tasks.
...
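The bullet above defers to an example that is not part of this diff. A hedged sketch of the kind of conditional generation it describes, assuming the facebook/bart-large-cnn checkpoint and illustrative decoding parameters:

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

article = "PG&E scheduled the blackouts in response to forecasts for high winds."
inputs = tokenizer([article], max_length=1024, truncation=True, return_tensors="pt")

# generate() drives the encoder-decoder loop auto-regressively.
summary_ids = model.generate(
    inputs["input_ids"], num_beams=4, max_length=60, early_stopping=True
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```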
@@ -44,9 +44,9 @@ Tips:

   For more information about which prefix to use, it is easiest to look into Appendix D of the `paper
   <https://arxiv.org/pdf/1910.10683.pdf>`__. - For sequence-to-sequence generation, it is recommended to use
-  :obj:`T5ForConditionalGeneration.generate()`. This method takes care of feeding the encoded input via cross-attention
-  layers to the decoder and auto-regressively generates the decoder output. - T5 uses relative scalar embeddings.
-  Encoder input padding can be done on the left and on the right.
+  :meth:`~transformers.generation_utils.GenerationMixin.generate`. This method takes care of feeding the encoded input
+  via cross-attention layers to the decoder and auto-regressively generates the decoder output. - T5 uses relative
+  scalar embeddings. Encoder input padding can be done on the left and on the right.

 This model was contributed by `thomwolf <https://huggingface.co/thomwolf>`__. The original code can be found `here
 <https://github.com/google-research/text-to-text-transfer-transformer>`__.
...
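As a companion to the tip above, a minimal sketch of sequence-to-sequence generation with a task prefix; the t5-small checkpoint and the translation prefix are illustrative assumptions:

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# T5 selects the task through a text prefix on the input.
input_ids = tokenizer(
    "translate English to German: The house is wonderful.", return_tensors="pt"
).input_ids

# generate() feeds the encoded input to the decoder through the
# cross-attention layers and decodes auto-regressively.
output_ids = model.generate(input_ids)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```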
@@ -505,8 +505,8 @@ This outputs a (hopefully) coherent next token following the original sequence,

     >>> print(resulting_string)
     Hugging Face is based in DUMBO, New York City, and has

-In the next section, we show how :func:`~transformers.PreTrainedModel.generate` can be used to generate multiple tokens
-up to a specified length instead of one token at a time.
+In the next section, we show how :func:`~transformers.generation_utils.GenerationMixin.generate` can be used to
+generate multiple tokens up to a specified length instead of one token at a time.

 Text Generation
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...
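For context, a minimal sketch of the multi-token generation the paragraph announces, assuming a GPT-2 checkpoint and illustrative sampling parameters:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "Hugging Face is based in DUMBO, New York City, and"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Rather than predicting a single next token, generate() keeps sampling
# until max_length tokens have been produced.
output = model.generate(input_ids, max_length=30, do_sample=True, top_k=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```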
@@ -906,8 +906,9 @@ class RagSequenceForGeneration(RagPreTrainedModel):
         **model_kwargs
     ):
         """
-        Implements RAG sequence "thorough" decoding. Read the :meth:`~transformers.PreTrainedModel.generate``
-        documentation for more information on how to set other generate input parameters.
+        Implements RAG sequence "thorough" decoding. Read the
+        :meth:`~transformers.generation_utils.GenerationMixin.generate` documentation for more information on how to
+        set other generate input parameters.

         Args:
             input_ids (:obj:`torch.LongTensor` of shape :obj:`(batch_size, sequence_length)`, `optional`):
@@ -942,14 +943,15 @@
                 to be set to :obj:`False` if used while training with distributed backend.
             num_return_sequences(:obj:`int`, `optional`, defaults to 1):
                 The number of independently computed returned sequences for each element in the batch. Note that this
-                is not the value we pass to the ``generator``'s `:func:`~transformers.PreTrainedModel.generate``
-                function, where we set ``num_return_sequences`` to :obj:`num_beams`.
+                is not the value we pass to the ``generator``'s
+                :func:`~transformers.generation_utils.GenerationMixin.generate` function, where we set
+                ``num_return_sequences`` to :obj:`num_beams`.
             num_beams (:obj:`int`, `optional`, defaults to 1):
                 Number of beams for beam search. 1 means no beam search.
             n_docs (:obj:`int`, `optional`, defaults to :obj:`config.n_docs`)
                 Number of documents to retrieve and/or number of documents for which to generate an answer.
             kwargs:
-                Additional kwargs will be passed to :meth:`~transformers.PreTrainedModel.generate`.
+                Additional kwargs will be passed to :meth:`~transformers.generation_utils.GenerationMixin.generate`.

         Return:
             :obj:`torch.LongTensor` of shape :obj:`(batch_size * num_return_sequences, sequence_length)`: The generated
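To make the docstring's parameters concrete, a hedged sketch of calling this method; the checkpoint, the dummy index, and the decoding values are assumptions:

```python
from transformers import RagRetriever, RagSequenceForGeneration, RagTokenizer

tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
retriever = RagRetriever.from_pretrained(
    "facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True
)
model = RagSequenceForGeneration.from_pretrained(
    "facebook/rag-sequence-nq", retriever=retriever
)

input_dict = tokenizer.prepare_seq2seq_batch(
    "who holds the record in 100m freestyle", return_tensors="pt"
)

# "Thorough" decoding scores candidates against the retrieved documents;
# num_return_sequences here must not exceed num_beams.
generated = model.generate(
    input_ids=input_dict["input_ids"], num_beams=4, num_return_sequences=2
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```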
@@ -1452,8 +1454,9 @@ class RagTokenForGeneration(RagPreTrainedModel):
                 enabled.
             num_return_sequences(:obj:`int`, `optional`, defaults to 1):
                 The number of independently computed returned sequences for each element in the batch. Note that this
-                is not the value we pass to the ``generator``'s `:func:`~transformers.PreTrainedModel.generate`
-                function, where we set ``num_return_sequences`` to :obj:`num_beams`.
+                is not the value we pass to the ``generator``'s
+                :func:`~transformers.generation_utils.GenerationMixin.generate` function, where we set
+                ``num_return_sequences`` to :obj:`num_beams`.
             decoder_start_token_id (:obj:`int`, `optional`):
                 If an encoder-decoder model starts decoding with a different token than `bos`, the id of that token.
             n_docs (:obj:`int`, `optional`, defaults to :obj:`config.n_docs`)
...
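The same num_return_sequences/num_beams note applies to RagTokenForGeneration. A sketch under the same assumptions (checkpoint and dummy index chosen for illustration):

```python
from transformers import RagRetriever, RagTokenForGeneration, RagTokenizer

tokenizer = RagTokenizer.from_pretrained("facebook/rag-token-nq")
retriever = RagRetriever.from_pretrained(
    "facebook/rag-token-nq", index_name="exact", use_dummy_dataset=True
)
model = RagTokenForGeneration.from_pretrained(
    "facebook/rag-token-nq", retriever=retriever
)

input_dict = tokenizer.prepare_seq2seq_batch(
    "how many countries are in europe", return_tensors="pt"
)

# num_return_sequences here is caller-facing; per the docstring, the
# underlying generator's generate() call runs with num_return_sequences
# set to num_beams.
generated = model.generate(
    input_ids=input_dict["input_ids"], num_beams=4, num_return_sequences=2
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```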
@@ -1130,8 +1130,9 @@ class TFRagTokenForGeneration(TFRagPreTrainedModel, TFCausalLanguageModelingLoss
                 Number of beams for beam search. 1 means no beam search.
             num_return_sequences(:obj:`int`, `optional`, defaults to 1):
                 The number of independently computed returned sequences for each element in the batch. Note that this
-                is not the value we pass to the ``generator``'s `:func:`~transformers.PreTrainedModel.generate`
-                function, where we set ``num_return_sequences`` to :obj:`num_beams`.
+                is not the value we pass to the ``generator``'s
+                :func:`~transformers.generation_utils.GenerationMixin.generate` function, where we set
+                ``num_return_sequences`` to :obj:`num_beams`.
             decoder_start_token_id (:obj:`int`, `optional`):
                 If an encoder-decoder model starts decoding with a different token than `bos`, the id of that token.
             n_docs (:obj:`int`, `optional`, defaults to :obj:`config.n_docs`)
@@ -1682,8 +1683,9 @@ class TFRagSequenceForGeneration(TFRagPreTrainedModel, TFCausalLanguageModelingL
         **model_kwargs
     ):
         """
-        Implements RAG sequence "thorough" decoding. Read the :meth:`~transformers.PreTrainedModel.generate``
-        documentation for more information on how to set other generate input parameters
+        Implements RAG sequence "thorough" decoding. Read the
+        :meth:`~transformers.generation_utils.GenerationMixin.generate` documentation for more information on how to
+        set other generate input parameters.

         Args:
             input_ids (:obj:`tf.Tensor` of shape :obj:`(batch_size, sequence_length)`, `optional`):
@@ -1711,14 +1713,15 @@
                 to be set to :obj:`False` if used while training with distributed backend.
             num_return_sequences(:obj:`int`, `optional`, defaults to 1):
                 The number of independently computed returned sequences for each element in the batch. Note that this
-                is not the value we pass to the ``generator``'s `:func:`~transformers.PreTrainedModel.generate``
-                function, where we set ``num_return_sequences`` to :obj:`num_beams`.
+                is not the value we pass to the ``generator``'s
+                :func:`~transformers.generation_utils.GenerationMixin.generate` function, where we set
+                ``num_return_sequences`` to :obj:`num_beams`.
             num_beams (:obj:`int`, `optional`, defaults to 1):
                 Number of beams for beam search. 1 means no beam search.
             n_docs (:obj:`int`, `optional`, defaults to :obj:`config.n_docs`)
                 Number of documents to retrieve and/or number of documents for which to generate an answer.
             kwargs:
-                Additional kwargs will be passed to :meth:`~transformers.PreTrainedModel.generate`
+                Additional kwargs will be passed to :meth:`~transformers.generation_utils.GenerationMixin.generate`.

         Return:
             :obj:`tf.Tensor` of shape :obj:`(batch_size * num_return_sequences, sequence_length)`: The generated
...
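For the TF variants, a sketch mirroring the PyTorch example; the from_pt=True conversion and the checkpoint choice are assumptions:

```python
from transformers import RagRetriever, RagTokenizer, TFRagSequenceForGeneration

tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
retriever = RagRetriever.from_pretrained(
    "facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True
)
# The published RAG checkpoints are PyTorch, hence from_pt=True.
model = TFRagSequenceForGeneration.from_pretrained(
    "facebook/rag-sequence-nq", retriever=retriever, from_pt=True
)

input_dict = tokenizer.prepare_seq2seq_batch(
    "who holds the record in 100m freestyle", return_tensors="tf"
)

generated = model.generate(input_ids=input_dict["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```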