Unverified Commit 133c5e40 authored by Stas Bekman, committed by GitHub

[doc] consistent True/False/None default format (#14951)



* [doc] consistent True/False/None default format

* Update src/transformers/models/xlnet/modeling_xlnet.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
parent b2f50025
@@ -57,13 +57,13 @@ Tips:
 important preprocessing step is that images and segmentation maps are randomly cropped and padded to the same size,
 such as 512x512 or 640x640, after which they are normalized.
 - One additional thing to keep in mind is that one can initialize [`SegformerFeatureExtractor`] with
-`reduce_labels` set to *True* or *False*. In some datasets (like ADE20k), the 0 index is used in the annotated
+`reduce_labels` set to `True` or `False`. In some datasets (like ADE20k), the 0 index is used in the annotated
 segmentation maps for background. However, ADE20k doesn't include the "background" class in its 150 labels.
 Therefore, `reduce_labels` is used to reduce all labels by 1, and to make sure no loss is computed for the
 background class (i.e. it replaces 0 in the annotated maps by 255, which is the *ignore_index* of the loss function
 used by [`SegformerForSemanticSegmentation`]). However, other datasets use the 0 index as
 background class and include this class as part of all labels. In that case, `reduce_labels` should be set to
-*False*, as loss should also be computed for the background class.
+`False`, as loss should also be computed for the background class.
 - As most models, SegFormer comes in different sizes, the details of which can be found in the table below.
 | **Model variant** | **Depths** | **Hidden sizes** | **Decoder hidden size** | **Params (M)** | **ImageNet-1k Top 1** |
......
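The `reduce_labels` behavior described in the hunk above can be sketched in a few lines. This is a hypothetical helper illustrating the documented semantics, not the actual `SegformerFeatureExtractor` code: the background index 0 is mapped to the ignore index 255, and all remaining labels are shifted down by 1.

```python
import numpy as np

def reduce_label(segmentation_map: np.ndarray) -> np.ndarray:
    """Sketch of the documented `reduce_labels=True` behavior:
    background (0) -> ignore_index (255), all other labels shifted down by 1."""
    seg = segmentation_map.astype(np.int64)
    seg[seg == 0] = 255   # background becomes the loss function's ignore_index
    seg = seg - 1         # shift remaining labels down by 1
    seg[seg == 254] = 255 # pixels already set to 255 were decremented too; restore them
    return seg

# ADE20k-style map: 0 = background, 1..150 = classes
seg = np.array([[0, 1], [150, 0]])
print(reduce_label(seg))
```

With `reduce_labels=False` no remapping happens and the loss is computed for class 0 as well.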
@@ -446,15 +446,15 @@ class TFGenerationMixin:
 use_cache: (`bool`, *optional*, defaults to `True`):
 Whether or not the model should use the past last key/values attentions (if applicable to the model) to
 speed up decoding.
-output_attentions (`bool`, *optional*, defaults to *False*):
+output_attentions (`bool`, *optional*, defaults to `False`):
 Whether or not to return the attentions tensors of all attention layers. See `attentions` under
 returned tensors for more details.
-output_hidden_states (`bool`, *optional*, defaults to *False*):
+output_hidden_states (`bool`, *optional*, defaults to `False`):
 Whether or not to return the hidden states of all layers. See `hidden_states` under returned tensors
 for more details.
-output_scores (`bool`, *optional*, defaults to *False*):
+output_scores (`bool`, *optional*, defaults to `False`):
 Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
-return_dict_in_generate (`bool`, *optional*, defaults to *False*):
+return_dict_in_generate (`bool`, *optional*, defaults to `False`):
 Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
 forced_bos_token_id (`int`, *optional*):
 The id of the token to force as the first generated token after the `decoder_start_token_id`. Useful
......
@@ -878,15 +878,15 @@ class GenerationMixin:
 Custom stopping criteria that complement the default stopping criteria built from arguments and a
 model's config. If a stopping criteria is passed that is already created with the arguments or a
 model's config an error is thrown. This feature is intended for advanced users.
-output_attentions (`bool`, *optional*, defaults to *False*):
+output_attentions (`bool`, *optional*, defaults to `False`):
 Whether or not to return the attentions tensors of all attention layers. See `attentions` under
 returned tensors for more details.
-output_hidden_states (`bool`, *optional*, defaults to *False*):
+output_hidden_states (`bool`, *optional*, defaults to `False`):
 Whether or not to return the hidden states of all layers. See `hidden_states` under returned tensors
 for more details.
-output_scores (`bool`, *optional*, defaults to *False*):
+output_scores (`bool`, *optional*, defaults to `False`):
 Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
-return_dict_in_generate (`bool`, *optional*, defaults to *False*):
+return_dict_in_generate (`bool`, *optional*, defaults to `False`):
 Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
 forced_bos_token_id (`int`, *optional*):
 The id of the token to force as the first generated token after the `decoder_start_token_id`. Useful
@@ -1302,15 +1302,15 @@ class GenerationMixin:
 The id of the *padding* token.
 eos_token_id (`int`, *optional*):
 The id of the *end-of-sequence* token.
-output_attentions (`bool`, *optional*, defaults to *False*):
+output_attentions (`bool`, *optional*, defaults to `False`):
 Whether or not to return the attentions tensors of all attention layers. See `attentions` under
 returned tensors for more details.
-output_hidden_states (`bool`, *optional*, defaults to *False*):
+output_hidden_states (`bool`, *optional*, defaults to `False`):
 Whether or not to return the hidden states of all layers. See `hidden_states` under returned tensors
 for more details.
-output_scores (`bool`, *optional*, defaults to *False*):
+output_scores (`bool`, *optional*, defaults to `False`):
 Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
-return_dict_in_generate (`bool`, *optional*, defaults to *False*):
+return_dict_in_generate (`bool`, *optional*, defaults to `False`):
 Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
 synced_gpus (`bool`, *optional*, defaults to `False`):
 Whether to continue running the while loop until max_length (needed for ZeRO stage 3)
@@ -1529,15 +1529,15 @@ class GenerationMixin:
 The id of the *padding* token.
 eos_token_id (`int`, *optional*):
 The id of the *end-of-sequence* token.
-output_attentions (`bool`, *optional*, defaults to *False*):
+output_attentions (`bool`, *optional*, defaults to `False`):
 Whether or not to return the attentions tensors of all attention layers. See `attentions` under
 returned tensors for more details.
-output_hidden_states (`bool`, *optional*, defaults to *False*):
+output_hidden_states (`bool`, *optional*, defaults to `False`):
 Whether or not to return the hidden states of all layers. See `hidden_states` under returned tensors
 for more details.
-output_scores (`bool`, *optional*, defaults to *False*):
+output_scores (`bool`, *optional*, defaults to `False`):
 Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
-return_dict_in_generate (`bool`, *optional*, defaults to *False*):
+return_dict_in_generate (`bool`, *optional*, defaults to `False`):
 Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
 synced_gpus (`bool`, *optional*, defaults to `False`):
 Whether to continue running the while loop until max_length (needed for ZeRO stage 3)
@@ -1767,15 +1767,15 @@ class GenerationMixin:
 The id of the *padding* token.
 eos_token_id (`int`, *optional*):
 The id of the *end-of-sequence* token.
-output_attentions (`bool`, *optional*, defaults to *False*):
+output_attentions (`bool`, *optional*, defaults to `False`):
 Whether or not to return the attentions tensors of all attention layers. See `attentions` under
 returned tensors for more details.
-output_hidden_states (`bool`, *optional*, defaults to *False*):
+output_hidden_states (`bool`, *optional*, defaults to `False`):
 Whether or not to return the hidden states of all layers. See `hidden_states` under returned tensors
 for more details.
-output_scores (`bool`, *optional*, defaults to *False*):
+output_scores (`bool`, *optional*, defaults to `False`):
 Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
-return_dict_in_generate (`bool`, *optional*, defaults to *False*):
+return_dict_in_generate (`bool`, *optional*, defaults to `False`):
 Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
 synced_gpus (`bool`, *optional*, defaults to `False`):
 Whether to continue running the while loop until max_length (needed for ZeRO stage 3)
@@ -2061,15 +2061,15 @@ class GenerationMixin:
 The id of the *padding* token.
 eos_token_id (`int`, *optional*):
 The id of the *end-of-sequence* token.
-output_attentions (`bool`, *optional*, defaults to *False*):
+output_attentions (`bool`, *optional*, defaults to `False`):
 Whether or not to return the attentions tensors of all attention layers. See `attentions` under
 returned tensors for more details.
-output_hidden_states (`bool`, *optional*, defaults to *False*):
+output_hidden_states (`bool`, *optional*, defaults to `False`):
 Whether or not to return the hidden states of all layers. See `hidden_states` under returned tensors
 for more details.
-output_scores (`bool`, *optional*, defaults to *False*):
+output_scores (`bool`, *optional*, defaults to `False`):
 Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
-return_dict_in_generate (`bool`, *optional*, defaults to *False*):
+return_dict_in_generate (`bool`, *optional*, defaults to `False`):
 Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
 synced_gpus (`bool`, *optional*, defaults to `False`):
 Whether to continue running the while loop until max_length (needed for ZeRO stage 3)
@@ -2356,15 +2356,15 @@ class GenerationMixin:
 The id of the *padding* token.
 eos_token_id (`int`, *optional*):
 The id of the *end-of-sequence* token.
-output_attentions (`bool`, *optional*, defaults to *False*):
+output_attentions (`bool`, *optional*, defaults to `False`):
 Whether or not to return the attentions tensors of all attention layers. See `attentions` under
 returned tensors for more details.
-output_hidden_states (`bool`, *optional*, defaults to *False*):
+output_hidden_states (`bool`, *optional*, defaults to `False`):
 Whether or not to return the hidden states of all layers. See `hidden_states` under returned tensors
 for more details.
-output_scores (`bool`, *optional*, defaults to *False*):
+output_scores (`bool`, *optional*, defaults to `False`):
 Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
-return_dict_in_generate (`bool`, *optional*, defaults to *False*):
+return_dict_in_generate (`bool`, *optional*, defaults to `False`):
 Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
 synced_gpus (`bool`, *optional*, defaults to `False`):
 Whether to continue running the while loop until max_length (needed for ZeRO stage 3)
......
@@ -707,7 +707,7 @@ class MLflowCallback(TrainerCallback):
 HF_MLFLOW_LOG_ARTIFACTS (`str`, *optional*):
 Whether to use MLflow .log_artifact() facility to log artifacts.
-This only makes sense if logging to a remote server, e.g. s3 or GCS. If set to *True* or *1*, will copy
+This only makes sense if logging to a remote server, e.g. s3 or GCS. If set to `True` or `1`, will copy
 whatever is in [`TrainingArguments`]'s `output_dir` to the local or remote artifact storage. Using it
 without a remote storage will just copy the files to your artifact location.
 """
......
@@ -1840,7 +1840,7 @@ class PoolerEndLogits(nn.Module):
 <Tip>
-One of `start_states` or `start_positions` should be not obj:*None*. If both are set, `start_positions`
+One of `start_states` or `start_positions` should not be `None`. If both are set, `start_positions`
 overrides `start_states`.
 </Tip>
@@ -1906,7 +1906,7 @@ class PoolerAnswerClass(nn.Module):
 <Tip>
-One of `start_states` or `start_positions` should be not obj:*None*. If both are set, `start_positions`
+One of `start_states` or `start_positions` should not be `None`. If both are set, `start_positions`
 overrides `start_states`.
 </Tip>
......
@@ -219,7 +219,7 @@ class MecabTokenizer:
 Whether to apply unicode normalization to text before tokenization.
 **mecab_dic**: (*optional*) string (default "ipadic")
 Name of dictionary to be used for MeCab initialization. If you are using a system-installed dictionary,
-set this option to *None* and modify *mecab_option*.
+set this option to `None` and modify *mecab_option*.
 **mecab_option**: (*optional*) string
 String passed to MeCab constructor.
 """
......
@@ -632,7 +632,7 @@ def _replace_html_entities(text, keep=(), remove_illegal=True, encoding="utf-8")
 List of entity names which should not be replaced. This supports both numeric entities (`&#nnnn;` and
 `&#hhhh;`) and named entities (such as `&nbsp;` or `&gt;`).
 remove_illegal (bool):
-If *True*, entities that can't be converted are removed. Otherwise, entities that can't be converted are
+If `True`, entities that can't be converted are removed. Otherwise, entities that can't be converted are
 kept "as is".
 Returns: A unicode string with the entities removed.
......
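The `remove_illegal` behavior documented above can be approximated with a small standard-library sketch. This is a hypothetical helper illustrating the documented semantics, not the library's actual `_replace_html_entities` implementation:

```python
import html
import re

def replace_html_entities(text: str, remove_illegal: bool = True) -> str:
    """Sketch: convert HTML entities; entities that can't be converted are
    either dropped (remove_illegal=True) or kept "as is"."""
    def convert(match: re.Match) -> str:
        entity = match.group(0)
        converted = html.unescape(entity)
        if converted == entity:  # unescape left it unchanged: not convertible
            return "" if remove_illegal else entity
        return converted
    # matches named (&gt;) and numeric (&#65; / &#x41;) entities
    return re.sub(r"&#?\w+;", convert, text)

print(replace_html_entities("a &gt; b&bogus;c"))
print(replace_html_entities("a &gt; b&bogus;c", remove_illegal=False))
```

The real function also honors a `keep` list of entity names that must not be replaced; that is omitted here for brevity.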
@@ -150,7 +150,7 @@ class DetrObjectDetectionOutput(ModelOutput):
 possible padding). You can use [`~DetrFeatureExtractor.post_process`] to retrieve the unnormalized bounding
 boxes.
 auxiliary_outputs (`list[Dict]`, *optional*):
-Optional, only returned when auxilary losses are activated (i.e. `config.auxiliary_loss` is set to *True*)
+Optional, only returned when auxiliary losses are activated (i.e. `config.auxiliary_loss` is set to `True`)
 and labels are provided. It is a list of dictionaries containing the two above keys (`logits` and
 `pred_boxes`) for each decoder layer.
 last_hidden_state (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
@@ -217,7 +217,7 @@ class DetrSegmentationOutput(ModelOutput):
 [`~DetrFeatureExtractor.post_process_panoptic`] to evaluate instance and panoptic segmentation masks
 respectively.
 auxiliary_outputs (`list[Dict]`, *optional*):
-Optional, only returned when auxiliary losses are activated (i.e. `config.auxiliary_loss` is set to *True*)
+Optional, only returned when auxiliary losses are activated (i.e. `config.auxiliary_loss` is set to `True`)
 and labels are provided. It is a list of dictionaries containing the two above keys (`logits` and
 `pred_boxes`) for each decoder layer.
 last_hidden_state (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*):
......
@@ -306,7 +306,7 @@ class EncoderDecoderModel(PreTrainedModel):
 `config` argument. This loading path is slower than converting the TensorFlow checkpoint in a
 PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
-decoder_pretrained_model_name_or_path (:obj: *str*, *optional*, defaults to *None*):
+decoder_pretrained_model_name_or_path (`str`, *optional*, defaults to `None`):
 Information necessary to initiate the decoder. Can be either:
 - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
......
@@ -755,7 +755,7 @@ class FlaxEncoderDecoderModel(FlaxPreTrainedModel):
 - A path to a *directory* containing model weights saved using
 [`~FlaxPreTrainedModel.save_pretrained`], e.g., `./my_model_directory/`.
-decoder_pretrained_model_name_or_path (:obj: *Union[str, os.PathLike]*, *optional*, defaults to *None*):
+decoder_pretrained_model_name_or_path (`Union[str, os.PathLike]`, *optional*, defaults to `None`):
 Information necessary to initiate the decoder. Can be either:
 - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
......
@@ -319,7 +319,7 @@ class TFEncoderDecoderModel(TFPreTrainedModel):
 - A path or url to a *pytorch index checkpoint file* (e.g, `./pt_model/`). In this case,
 `encoder_from_pt` should be set to `True`.
-decoder_pretrained_model_name_or_path (:obj: *str*, *optional*, defaults to *None*):
+decoder_pretrained_model_name_or_path (`str`, *optional*, defaults to `None`):
 Information necessary to initiate the decoder. Can be either:
 - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
......
@@ -888,8 +888,8 @@ class LayoutLMv2Tokenizer(PreTrainedTokenizer):
 """
 Prepares a sequence or a pair of sequences so that it can be used by the model. It adds special tokens,
 truncates sequences if overflowing while taking into account the special tokens and manages a moving window
-(with user defined stride) for overflowing tokens. Please Note, for *text_pair* different than *None* and
-*truncation_strategy = longest_first* or *True*, it is not possible to return overflowing tokens. Such a
+(with user defined stride) for overflowing tokens. Please Note, for *text_pair* different than `None` and
+*truncation_strategy = longest_first* or `True`, it is not possible to return overflowing tokens. Such a
 combination of arguments will raise an error.
 Word-level `boxes` are turned into token-level `bbox`. If provided, word-level `word_labels` are turned into
......
@@ -879,8 +879,8 @@ class LukeTokenizer(RobertaTokenizer):
 Prepares a sequence of input id, entity id and entity span, or a pair of sequences of inputs ids, entity ids,
 entity spans so that it can be used by the model. It adds special tokens, truncates sequences if overflowing
 while taking into account the special tokens and manages a moving window (with user defined stride) for
-overflowing tokens. Please Note, for *pair_ids* different than *None* and *truncation_strategy = longest_first*
-or *True*, it is not possible to return overflowing tokens. Such a combination of arguments will raise an
+overflowing tokens. Please Note, for *pair_ids* different than `None` and *truncation_strategy = longest_first*
+or `True`, it is not possible to return overflowing tokens. Such a combination of arguments will raise an
 error.
 Args:
......
@@ -1324,7 +1324,7 @@ class TFLxmertForPreTraining(TFLxmertPreTrainedModel):
 Labels for computing the masked language modeling loss. Indices should be in `[-100, 0, ...,
 config.vocab_size]` (see `input_ids` docstring) Tokens with indices set to `-100` are ignored (masked), the
 loss is only computed for the tokens with labels in `[0, ..., config.vocab_size]`
-obj_labels: (`Dict[Str: Tuple[tf.Tensor, tf.Tensor]]`, *optional*, defaults to :obj: *None*):
+obj_labels: (`Dict[Str: Tuple[tf.Tensor, tf.Tensor]]`, *optional*, defaults to `None`):
 each key is named after each one of the visual losses and each element of the tuple is of the shape
 `(batch_size, num_features)` and `(batch_size, num_features, visual_feature_dim)` for each the label id and
 the label score respectively
@@ -1334,7 +1334,7 @@ class TFLxmertForPreTraining(TFLxmertPreTrainedModel):
 - 0 indicates that the sentence does not match the image,
 - 1 indicates that the sentence does match the image.
-ans (`Torch.Tensor` of shape `(batch_size)`, *optional*, defaults to :obj: *None*):
+ans (`tf.Tensor` of shape `(batch_size)`, *optional*, defaults to `None`):
 a one hot representation of the correct answer *optional*
 Returns:
......
@@ -991,8 +991,8 @@ class MLukeTokenizer(PreTrainedTokenizer):
 Prepares a sequence of input id, entity id and entity span, or a pair of sequences of inputs ids, entity ids,
 entity spans so that it can be used by the model. It adds special tokens, truncates sequences if overflowing
 while taking into account the special tokens and manages a moving window (with user defined stride) for
-overflowing tokens. Please Note, for *pair_ids* different than *None* and *truncation_strategy = longest_first*
-or *True*, it is not possible to return overflowing tokens. Such a combination of arguments will raise an
+overflowing tokens. Please Note, for *pair_ids* different than `None` and *truncation_strategy = longest_first*
+or `True`, it is not possible to return overflowing tokens. Such a combination of arguments will raise an
 error.
 Args:
......
@@ -2035,7 +2035,7 @@ class PerceiverBasicDecoder(PerceiverAbstractDecoder):
 config ([*PerceiverConfig*]):
 Model configuration.
 output_num_channels (`int`, *optional*):
-The number of channels in the output. Will only be used in case *final_project* is set to *True*.
+The number of channels in the output. Will only be used in case *final_project* is set to `True`.
 position_encoding_type (`str`, *optional*, defaults to "trainable"):
 The type of position encoding to use. Can be either "trainable", "fourier", or "none".
 output_index_dims (`int`, *optional*):
@@ -2583,7 +2583,7 @@ def generate_fourier_features(pos, num_bands, max_resolution=(224, 224), concat_
Returns:
`torch.FloatTensor` of shape `(batch_size, sequence_length, n_channels)`: The Fourier position embeddings. If
`concat_pos` is `True` and `sine_only` is `False`, output dimensions are ordered as: [dim_1, dim_2, ..., dim_d,
sin(pi*f_1*dim_1), ..., sin(pi*f_K*dim_1), ..., sin(pi*f_1*dim_d), ..., sin(pi*f_K*dim_d), cos(pi*f_1*dim_1),
..., cos(pi*f_K*dim_1), ..., cos(pi*f_1*dim_d), ..., cos(pi*f_K*dim_d)], where dim_i is pos[:, i] and f_k is the
kth frequency band.
...
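The channel ordering described above can be sketched in NumPy. This is a hedged sketch of the general technique, not the repository's `generate_fourier_features` itself; the linear frequency spacing from 1 up to the Nyquist frequency (resolution / 2) is an assumption:

```python
import numpy as np

def fourier_features(pos, num_bands, max_resolution=(224, 224),
                     concat_pos=True, sine_only=False):
    # pos: (n, d) array of positions; one bank of K frequencies per dimension,
    # assumed linearly spaced from 1 to the Nyquist frequency (res / 2).
    freqs = np.stack([np.linspace(1.0, res / 2.0, num_bands)
                      for res in max_resolution])                   # (d, K)
    scaled = (pos[:, :, None] * freqs[None]).reshape(len(pos), -1)  # (n, d*K)
    if sine_only:
        enc = np.sin(np.pi * scaled)
    else:
        # sin block first, then cos block: [sin(f_1 dim_1) ... cos(f_K dim_d)]
        enc = np.concatenate([np.sin(np.pi * scaled), np.cos(np.pi * scaled)], axis=-1)
    # With concat_pos=True, the raw positions lead the channel ordering.
    return np.concatenate([pos, enc], axis=-1) if concat_pos else enc
```

With `concat_pos=True` and `sine_only=False` this yields `d + 2 * d * num_bands` channels, matching the ordering in the docstring.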
@@ -258,7 +258,7 @@ class RagPreTrainedModel(PreTrainedModel):
the model, you need to first set it back in training mode with `model.train()`.
Params:
question_encoder_pretrained_model_name_or_path (`str`, *optional*, defaults to `None`):
Information necessary to initiate the question encoder. Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
@@ -271,7 +271,7 @@ class RagPreTrainedModel(PreTrainedModel):
`config` argument. This loading path is slower than converting the TensorFlow checkpoint in a
PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
generator_pretrained_model_name_or_path (`str`, *optional*, defaults to `None`):
Information necessary to initiate the generator. Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
@@ -444,7 +444,7 @@ RAG_FORWARD_INPUTS_DOCSTRING = r"""
Used by the [`RagModel`] model during decoding.
decoder_input_ids (`torch.LongTensor` of shape `(batch_size, target_sequence_length)`, *optional*):
Provide for generation tasks. `None` by default; construct as per instructions for the generator model
you're using with your RAG instance.
decoder_attention_mask (`torch.BoolTensor` of shape `(batch_size, target_sequence_length)`, *optional*):
Default behavior: generate a tensor that ignores pad tokens in `decoder_input_ids`. Causal mask will also
...
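A hedged sketch of that default pad-ignoring mask (the pad id and input ids here are illustrative; the real model reads the pad id from its config):

```python
import numpy as np

pad_token_id = 0  # assumed pad id, for illustration only
decoder_input_ids = np.array([[5, 7, 9, 0, 0]])

# 1 for real tokens, 0 for padding — the "ignores pad tokens" default above.
decoder_attention_mask = (decoder_input_ids != pad_token_id).astype(np.int64)
print(decoder_attention_mask)  # [[1 1 1 0 0]]
```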
@@ -245,7 +245,7 @@ class TFRagPreTrainedModel(TFPreTrainedModel):
- A path or url to a *pytorch index checkpoint file* (e.g., `./pt_model/`). In this case,
`question_encoder_from_pt` should be set to `True`.
generator_pretrained_model_name_or_path (`str`, *optional*, defaults to `None`):
Information necessary to initiate the generator. Can be either:
- A string with the *shortcut name* of a pretrained model to load from cache or download, e.g.,
@@ -426,7 +426,7 @@ RAG_FORWARD_INPUTS_DOCSTRING = r"""
Used by the ([`TFRagModel`]) model during decoding.
decoder_input_ids (`tf.Tensor` of shape `(batch_size, target_sequence_length)`, *optional*):
Provide for generation tasks. `None` by default; construct as per instructions for the generator model
you're using with your RAG instance.
decoder_attention_mask (`tf.Tensor` of shape `(batch_size, target_sequence_length)`, *optional*):
Default behavior: generate a tensor that ignores pad tokens in `decoder_input_ids`. Causal mask will also
@@ -1136,15 +1136,15 @@ class TFRagTokenForGeneration(TFRagPreTrainedModel, TFCausalLanguageModelingLoss
encoder-decoder model starts decoding with a different token than *bos*, the id of that token.
n_docs (`int`, *optional*, defaults to `config.n_docs`):
Number of documents to retrieve and/or number of documents for which to generate an answer.
output_attentions (`bool`, *optional*, defaults to `False`):
Whether or not to return the attention tensors of all attention layers. See `attentions` under
returned tensors for more details.
output_hidden_states (`bool`, *optional*, defaults to `False`):
Whether or not to return the hidden states of all layers. See `hidden_states` under returned tensors
for more details.
output_scores (`bool`, *optional*, defaults to `False`):
Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
return_dict_in_generate (`bool`, *optional*, defaults to `False`):
Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
model_specific_kwargs:
Additional model-specific kwargs will be forwarded to the `forward` function of the model.
...
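A toy illustration of how the `return_dict_in_generate` and `output_scores` flags above interact (`toy_generate` and its return values are invented for illustration; this is not the real `generate` API):

```python
def toy_generate(return_dict_in_generate=False, output_scores=False):
    sequences, scores = [[0, 42, 17]], [(-0.7,)]  # fake decoder outputs
    if not return_dict_in_generate:
        return (sequences,)  # plain tuple, the default
    out = {"sequences": sequences}
    if output_scores:
        out["scores"] = scores  # extras appear only when requested
    return out

print(sorted(toy_generate(return_dict_in_generate=True, output_scores=True)))
# ['scores', 'sequences']
```

The real method returns a `ModelOutput` subclass rather than a dict, but the gating of optional fields behind their flags works the same way.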
@@ -300,7 +300,7 @@ class SpeechEncoderDecoderModel(PreTrainedModel):
`config` argument. This loading path is slower than converting the TensorFlow checkpoint in a
PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
decoder_pretrained_model_name_or_path (`str`, *optional*, defaults to `None`):
Information necessary to initiate the decoder. Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
...
@@ -720,7 +720,7 @@ class FlaxVisionEncoderDecoderModel(FlaxPreTrainedModel):
- A path to a *directory* containing model weights saved using
[`~FlaxPreTrainedModel.save_pretrained`], e.g., `./my_model_directory/`.
decoder_pretrained_model_name_or_path (`Union[str, os.PathLike]`, *optional*, defaults to `None`):
Information necessary to initiate the decoder. Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
...