Unverified commit 088c1880 authored by Sylvain Gugger, committed by GitHub

Big file_utils cleanup (#16396)

* Big file_utils cleanup

* This one still needs to be treated separately
parent 2b23e080
@@ -72,7 +72,7 @@ You are not required to read the following guidelines before opening an issue. H
    from . import dependency_versions_check
  File "/transformers/src/transformers/dependency_versions_check.py", line 34, in <module>
    from .utils import is_tokenizers_available
- File "/transformers/src/transformers/file_utils.py", line 40, in <module>
+ File "/transformers/src/transformers/utils/import_utils.py", line 40, in <module>
    from tqdm.auto import tqdm
ModuleNotFoundError: No module named 'tqdm.auto'
```
@@ -125,7 +125,7 @@ You are not required to read the following guidelines before opening an issue. H
    from . import dependency_versions_check
  File "/transformers/src/transformers/dependency_versions_check.py", line 34, in <module>
    from .utils import is_tokenizers_available
- File "/transformers/src/transformers/file_utils.py", line 40, in <module>
+ File "/transformers/src/transformers/utils/import_utils.py", line 40, in <module>
    from tqdm.auto import tqdm
ModuleNotFoundError: No module named 'tqdm.auto'
```
...
@@ -172,9 +172,9 @@ adds a link to its documentation with this syntax: \[\`XXXClass\`\] or \[\`funct
function to be in the main package.
If you want to create a link to some internal class or function, you need to
-provide its path. For instance: \[\`file_utils.ModelOutput\`\]. This will be converted into a link with
-`file_utils.ModelOutput` in the description. To get rid of the path and only keep the name of the object you are
-linking to in the description, add a ~: \[\`~file_utils.ModelOutput\`\] will generate a link with `ModelOutput` in the description.
+provide its path. For instance: \[\`utils.ModelOutput\`\]. This will be converted into a link with
+`utils.ModelOutput` in the description. To get rid of the path and only keep the name of the object you are
+linking to in the description, add a ~: \[\`~utils.ModelOutput\`\] will generate a link with `ModelOutput` in the description.
The same works for methods, so you can either use \[\`XXXClass.method\`\] or \[\`~XXXClass.method\`\].
...
@@ -381,7 +381,7 @@ important. Here is some advice to make your debugging environment as efficien
original code so that you can directly input the ids instead of an input string.
- Make sure that the model in your debugging setup is **not** in training mode, which often causes the model to yield
random outputs due to multiple dropout layers in the model. Make sure that the forward pass in your debugging
-environment is **deterministic** so that the dropout layers are not used. Or use *transformers.file_utils.set_seed*
+environment is **deterministic** so that the dropout layers are not used. Or use *transformers.utils.set_seed*
if the old and new implementations are in the same framework.
The following section gives you more specific details/tips on how you can do this for *brand_new_bert*.
...
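The determinism advice above can be illustrated with a minimal, stdlib-only sketch. This `set_seed` is a hypothetical stand-in for `transformers.utils.set_seed` (which also seeds NumPy and the deep-learning framework); the point is only why fixing the seed makes a stochastic forward pass reproducible:

```python
import random

def set_seed(seed: int) -> None:
    # Simplified stand-in: the real transformers.utils.set_seed also seeds
    # numpy, torch, and tf when they are available.
    random.seed(seed)

def noisy_forward(x: float) -> float:
    # Stand-in for a forward pass with dropout-style randomness.
    return x + random.gauss(0.0, 0.1)

set_seed(42)
a = [noisy_forward(1.0) for _ in range(3)]
set_seed(42)
b = [noisy_forward(1.0) for _ in range(3)]
print(a == b)  # identical runs once the seed is fixed
```

With the same seed the two runs produce bit-identical outputs, which is what makes old-vs-new implementation comparisons meaningful.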
@@ -12,35 +12,35 @@ specific language governing permissions and limitations under the License.
# General Utilities
-This page lists all of Transformers general utility functions that are found in the file `file_utils.py`.
+This page lists all of Transformers general utility functions that are found in the file `utils.py`.
Most of those are only useful if you are studying the general code in the library.
## Enums and namedtuples
-[[autodoc]] file_utils.ExplicitEnum
-[[autodoc]] file_utils.PaddingStrategy
-[[autodoc]] file_utils.TensorType
+[[autodoc]] utils.ExplicitEnum
+[[autodoc]] utils.PaddingStrategy
+[[autodoc]] utils.TensorType
## Special Decorators
-[[autodoc]] file_utils.add_start_docstrings
-[[autodoc]] file_utils.add_start_docstrings_to_model_forward
-[[autodoc]] file_utils.add_end_docstrings
-[[autodoc]] file_utils.add_code_sample_docstrings
-[[autodoc]] file_utils.replace_return_docstrings
+[[autodoc]] utils.add_start_docstrings
+[[autodoc]] utils.add_start_docstrings_to_model_forward
+[[autodoc]] utils.add_end_docstrings
+[[autodoc]] utils.add_code_sample_docstrings
+[[autodoc]] utils.replace_return_docstrings
## Special Properties
-[[autodoc]] file_utils.cached_property
+[[autodoc]] utils.cached_property
## Other Utilities
-[[autodoc]] file_utils._LazyModule
+[[autodoc]] utils._LazyModule
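The `cached_property` listed above follows the same pattern the standard library has shipped since Python 3.8 (the library presumably carries its own copy for compatibility with older Pythons). A sketch of the behavior: the getter runs once on first access, and the result is then stored on the instance.

```python
from functools import cached_property

class Config:
    def __init__(self, values):
        self.values = values
        self.computations = 0

    @cached_property
    def total(self):
        # Runs only on first access; the result is cached on the instance.
        self.computations += 1
        return sum(self.values)

cfg = Config([1, 2, 3])
print(cfg.total, cfg.total)  # 6 6
print(cfg.computations)      # 1 -- the body ran exactly once
```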
@@ -25,7 +25,7 @@ Most of those are only useful if you are studying the code of the generate metho
## Generate Outputs
The output of [`~generation_utils.GenerationMixin.generate`] is an instance of a subclass of
-[`~file_utils.ModelOutput`]. This output is a data structure containing all the information returned
+[`~utils.ModelOutput`]. This output is a data structure containing all the information returned
by [`~generation_utils.GenerationMixin.generate`], but that can also be used as a tuple or dictionary.
Here's an example:
...
@@ -88,4 +88,4 @@ Due to Pytorch design, this functionality is only available for floating dtypes.
## Pushing to the Hub
-[[autodoc]] file_utils.PushToHubMixin
+[[autodoc]] utils.PushToHubMixin
@@ -12,7 +12,7 @@ specific language governing permissions and limitations under the License.
# Model outputs
-All models have outputs that are instances of subclasses of [`~file_utils.ModelOutput`]. Those are
+All models have outputs that are instances of subclasses of [`~utils.ModelOutput`]. Those are
data structures containing all the information returned by the model, but that can also be used as tuples or
dictionaries.
@@ -57,7 +57,7 @@ documented on their corresponding model page.
## ModelOutput
-[[autodoc]] file_utils.ModelOutput
+[[autodoc]] utils.ModelOutput
    - to_tuple
## BaseModelOutput
...
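The tuple/dict duality of `ModelOutput` mentioned in these hunks can be sketched with the standard library alone. This toy class (not the library implementation, which also drops `None` fields and is built on dataclasses) shows integer indexing, key lookup, attribute access, and `to_tuple` coexisting:

```python
from collections import OrderedDict

class MiniModelOutput(OrderedDict):
    # Toy version of the tuple/dict duality of transformers' ModelOutput.
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __getitem__(self, key):
        if isinstance(key, int):            # tuple-style indexing
            return list(self.values())[key]
        return super().__getitem__(key)     # dict-style lookup

    def to_tuple(self):
        return tuple(self.values())

out = MiniModelOutput(loss=0.25, logits=[0.1, 0.9])
print(out["loss"], out[0], out.logits)  # 0.25 0.25 [0.1, 0.9]
```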
@@ -40,7 +40,7 @@ The [`Trainer`] contains the basic training loop which supports the above featur
The [`Trainer`] class is optimized for 🤗 Transformers models and can have surprising behaviors
when you use it on other models. When using it on your own model, make sure:
-- your model always return tuples or subclasses of [`~file_utils.ModelOutput`].
+- your model always return tuples or subclasses of [`~utils.ModelOutput`].
- your model can compute the loss if a `labels` argument is provided and that loss is returned as the first
element of the tuple (if your model returns tuples)
- your model can accept multiple label arguments (use the `label_names` in your [`TrainingArguments`] to indicate their name to the [`Trainer`]) but none of them should be named `"label"`.
...
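The loss-first tuple contract from the bullet list above can be sketched with a framework-free stub; the names and the toy "model" here are illustrative, not the Trainer API:

```python
def forward(input_values, labels=None):
    # Stand-in forward pass: the "logits" are just scaled inputs.
    logits = [2.0 * x for x in input_values]
    if labels is None:
        return (logits,)
    # The loss must be the FIRST element of the tuple for Trainer to pick it up.
    loss = sum((p - y) ** 2 for p, y in zip(logits, labels)) / len(labels)
    return (loss, logits)

loss, logits = forward([1.0, 2.0], labels=[2.0, 4.0])
print(loss)  # 0.0 -- predictions match the labels exactly
```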
@@ -855,7 +855,7 @@ If you need to switch a tensor to bf16, it's just: `t.to(dtype=torch.bfloat16)`
Here is how you can check if your setup supports bf16:
```
-python -c 'import transformers; print(f"BF16 support is {transformers.file_utils.is_torch_bf16_available()}")'
+python -c 'import transformers; print(f"BF16 support is {transformers.utils.is_torch_bf16_available()}")'
```
On the other hand bf16 has much worse precision than fp16, so there are certain situations where you'd still want to use fp16 and not bf16.
...
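A simplified sketch of what such an availability check can look like. This is an assumption about the implementation, not the library's actual code; it is guarded so it degrades to `False` when torch or CUDA is absent:

```python
def is_torch_bf16_available() -> bool:
    # Hypothetical sketch: returns False rather than raising when
    # torch or CUDA is not available.
    try:
        import torch
    except ImportError:
        return False
    if not torch.cuda.is_available():
        return False
    # bf16 needs Ampere (compute capability >= 8.0) or newer GPUs.
    major, _minor = torch.cuda.get_device_capability()
    return major >= 8

print(f"BF16 support is {is_torch_bf16_available()}")
```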
@@ -153,7 +153,7 @@ class DataCollatorForMultipleChoice:
    Args:
        tokenizer ([`PreTrainedTokenizer`] or [`PreTrainedTokenizerFast`]):
            The tokenizer used for encoding the data.
-        padding (`bool`, `str` or [`~file_utils.PaddingStrategy`], *optional*, defaults to `True`):
+        padding (`bool`, `str` or [`~utils.PaddingStrategy`], *optional*, defaults to `True`):
            Select a strategy to pad the returned sequences (according to the model's padding side and padding index)
            among:
...
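The padding the collator docstrings describe can be sketched without the library. This pads every sequence to the longest in the batch and builds the matching attention mask; the function name and defaults are illustrative, not the transformers API:

```python
def pad_batch(sequences, pad_id=0, padding_side="right"):
    # Pad each sequence to the length of the longest one in the batch.
    max_len = max(len(seq) for seq in sequences)
    padded, attention_mask = [], []
    for seq in sequences:
        pad = [pad_id] * (max_len - len(seq))
        mask = [1] * len(seq)
        if padding_side == "right":
            padded.append(seq + pad)
            attention_mask.append(mask + [0] * len(pad))
        else:
            padded.append(pad + seq)
            attention_mask.append([0] * len(pad) + mask)
    return padded, attention_mask

ids, mask = pad_batch([[5, 6, 7], [8]])
print(ids)   # [[5, 6, 7], [8, 0, 0]]
print(mask)  # [[1, 1, 1], [1, 0, 0]]
```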
@@ -193,7 +193,7 @@ class DataCollatorForMultipleChoice:
    Args:
        tokenizer ([`PreTrainedTokenizer`] or [`PreTrainedTokenizerFast`]):
            The tokenizer used for encoding the data.
-        padding (`bool`, `str` or [`~file_utils.PaddingStrategy`], *optional*, defaults to `True`):
+        padding (`bool`, `str` or [`~utils.PaddingStrategy`], *optional*, defaults to `True`):
            Select a strategy to pad the returned sequences (according to the model's padding side and padding index)
            among:
...
@@ -74,7 +74,7 @@ class DataCollatorForMultipleChoice:
    Args:
        tokenizer ([`PreTrainedTokenizer`] or [`PreTrainedTokenizerFast`]):
            The tokenizer used for encoding the data.
-        padding (`bool`, `str` or [`~file_utils.PaddingStrategy`], *optional*, defaults to `True`):
+        padding (`bool`, `str` or [`~utils.PaddingStrategy`], *optional*, defaults to `True`):
            Select a strategy to pad the returned sequences (according to the model's padding side and padding index)
            among:
...
@@ -784,7 +784,7 @@ def clean_frameworks_in_init(
            indent = find_indent(lines[idx])
            while find_indent(lines[idx]) >= indent or is_empty_line(lines[idx]):
                idx += 1
-        # Remove the import from file_utils
+        # Remove the import from utils
        elif re_is_xxx_available.search(lines[idx]) is not None:
            line = lines[idx]
            for framework in to_remove:
...
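The helpers this hunk relies on are simple to sketch. These are plausible implementations of `find_indent` and `is_empty_line` (assumptions for illustration; the real ones live in the repo's utility scripts):

```python
def find_indent(line: str) -> int:
    # Number of leading spaces on the line.
    return len(line) - len(line.lstrip(" "))

def is_empty_line(line: str) -> bool:
    # A line is empty if it contains only whitespace.
    return len(line.strip()) == 0

lines = ["def f():", "    x = 1", "", "    return x"]
print([find_indent(l) for l in lines])  # [0, 4, 0, 4]
print(is_empty_line(lines[2]))          # True
```

The `while find_indent(...) >= indent or is_empty_line(...)` loop in the hunk then skips a whole indented block, treating blank lines as part of it.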
@@ -93,7 +93,7 @@ class PretrainedConfig(PushToHubMixin):
        output_attentions (`bool`, *optional*, defaults to `False`):
            Whether or not the model should return all attentions.
        return_dict (`bool`, *optional*, defaults to `True`):
-            Whether or not the model should return a [`~transformers.file_utils.ModelOutput`] instead of a plain tuple.
+            Whether or not the model should return a [`~transformers.utils.ModelOutput`] instead of a plain tuple.
        is_encoder_decoder (`bool`, *optional*, defaults to `False`):
            Whether the model is used as an encoder/decoder or not.
        is_decoder (`bool`, *optional*, defaults to `False`):
@@ -170,7 +170,7 @@ class PretrainedConfig(PushToHubMixin):
        output_scores (`bool`, *optional*, defaults to `False`):
            Whether the model should return the logits when used for generation.
        return_dict_in_generate (`bool`, *optional*, defaults to `False`):
-            Whether the model should return a [`~transformers.file_utils.ModelOutput`] instead of a `torch.LongTensor`.
+            Whether the model should return a [`~transformers.utils.ModelOutput`] instead of a `torch.LongTensor`.
        forced_bos_token_id (`int`, *optional*):
            The id of the token to force as the first generated token after the `decoder_start_token_id`. Useful for
            multilingual models like [mBART](../model_doc/mbart) where the first generated token needs to be the target
@@ -379,7 +379,7 @@ class PretrainedConfig(PushToHubMixin):
    @property
    def use_return_dict(self) -> bool:
        """
-        `bool`: Whether or not return [`~file_utils.ModelOutput`] instead of tuples.
+        `bool`: Whether or not return [`~utils.ModelOutput`] instead of tuples.
        """
        # If torchscript is set, force `return_dict=False` to avoid jit errors
        return self.return_dict and not self.torchscript
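The property in this hunk combines two flags; a minimal sketch of the same logic outside the full config class:

```python
class MiniConfig:
    def __init__(self, return_dict=True, torchscript=False):
        self.return_dict = return_dict
        self.torchscript = torchscript

    @property
    def use_return_dict(self):
        # TorchScript tracing cannot handle dict outputs, so torchscript wins.
        return self.return_dict and not self.torchscript

print(MiniConfig().use_return_dict)                  # True
print(MiniConfig(torchscript=True).use_return_dict)  # False
```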
@@ -417,7 +417,7 @@ class PretrainedConfig(PushToHubMixin):
        </Tip>
        kwargs:
-            Additional key word arguments passed along to the [`~file_utils.PushToHubMixin.push_to_hub`] method.
+            Additional key word arguments passed along to the [`~utils.PushToHubMixin.push_to_hub`] method.
        """
        if os.path.isfile(save_directory):
            raise AssertionError(f"Provided path ({save_directory}) should be a directory, not a file")
...
@@ -216,7 +216,7 @@ class DataCollatorWithPadding:
    Args:
        tokenizer ([`PreTrainedTokenizer`] or [`PreTrainedTokenizerFast`]):
            The tokenizer used for encoding the data.
-        padding (`bool`, `str` or [`~file_utils.PaddingStrategy`], *optional*, defaults to `True`):
+        padding (`bool`, `str` or [`~utils.PaddingStrategy`], *optional*, defaults to `True`):
            Select a strategy to pad the returned sequences (according to the model's padding side and padding index)
            among:
@@ -268,7 +268,7 @@ class DataCollatorForTokenClassification(DataCollatorMixin):
    Args:
        tokenizer ([`PreTrainedTokenizer`] or [`PreTrainedTokenizerFast`]):
            The tokenizer used for encoding the data.
-        padding (`bool`, `str` or [`~file_utils.PaddingStrategy`], *optional*, defaults to `True`):
+        padding (`bool`, `str` or [`~utils.PaddingStrategy`], *optional*, defaults to `True`):
            Select a strategy to pad the returned sequences (according to the model's padding side and padding index)
            among:
@@ -523,7 +523,7 @@ class DataCollatorForSeq2Seq:
            prepare the *decoder_input_ids*
            This is useful when using *label_smoothing* to avoid calculating loss twice.
-        padding (`bool`, `str` or [`~file_utils.PaddingStrategy`], *optional*, defaults to `True`):
+        padding (`bool`, `str` or [`~utils.PaddingStrategy`], *optional*, defaults to `True`):
            Select a strategy to pad the returned sequences (according to the model's padding side and padding index)
            among:
...
@@ -90,7 +90,7 @@ class SequenceFeatureExtractor(FeatureExtractionMixin):
            Instead of `List[float]` you can have tensors (numpy arrays, PyTorch tensors or TensorFlow tensors),
            see the note above for the return type.
-        padding (`bool`, `str` or [`~file_utils.PaddingStrategy`], *optional*, defaults to `True`):
+        padding (`bool`, `str` or [`~utils.PaddingStrategy`], *optional*, defaults to `True`):
            Select a strategy to pad the returned sequences (according to the model's padding side and padding
            index) among:
@@ -114,7 +114,7 @@ class SequenceFeatureExtractor(FeatureExtractionMixin):
            to the specific feature_extractor's default.
            [What are attention masks?](../glossary#attention-mask)
-        return_tensors (`str` or [`~file_utils.TensorType`], *optional*):
+        return_tensors (`str` or [`~utils.TensorType`], *optional*):
            If set, will return tensors instead of list of python integers. Acceptable values are:
            - `'tf'`: Return TensorFlow `tf.constant` objects.
...
@@ -117,9 +117,9 @@ class BatchFeature(UserDict):
        Convert the inner content to tensors.
        Args:
-            tensor_type (`str` or [`~file_utils.TensorType`], *optional*):
-                The type of tensors to use. If `str`, should be one of the values of the enum
-                [`~file_utils.TensorType`]. If `None`, no modification is done.
+            tensor_type (`str` or [`~utils.TensorType`], *optional*):
+                The type of tensors to use. If `str`, should be one of the values of the enum [`~utils.TensorType`]. If
+                `None`, no modification is done.
        """
        if tensor_type is None:
            return self
@@ -328,7 +328,7 @@ class FeatureExtractionMixin(PushToHubMixin):
        </Tip>
        kwargs:
-            Additional key word arguments passed along to the [`~file_utils.PushToHubMixin.push_to_hub`] method.
+            Additional key word arguments passed along to the [`~utils.PushToHubMixin.push_to_hub`] method.
        """
        if os.path.isfile(save_directory):
            raise AssertionError(f"Provided path ({save_directory}) should be a directory, not a file")
...
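Accepting either a plain string or an enum member, as `convert_to_tensors` does for `tensor_type`, is a common normalization pattern. A sketch with a stdlib enum; the `TensorType` values and the helper name below are assumptions for illustration:

```python
from enum import Enum

class TensorType(Enum):
    PYTORCH = "pt"
    TENSORFLOW = "tf"
    NUMPY = "np"

def normalize_tensor_type(tensor_type):
    if tensor_type is None:
        return None  # no conversion requested
    if not isinstance(tensor_type, TensorType):
        # Accept plain strings like "pt" by looking them up by value.
        tensor_type = TensorType(tensor_type)
    return tensor_type

print(normalize_tensor_type("pt"))              # TensorType.PYTORCH
print(normalize_tensor_type(TensorType.NUMPY))  # TensorType.NUMPY
print(normalize_tensor_type(None))              # None
```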
@@ -241,7 +241,7 @@ class FlaxGenerationMixin:
                should be prefixed with *decoder_*. Also accepts `encoder_outputs` to skip encoder part.
        Return:
-            [`~file_utils.ModelOutput`].
+            [`~utils.ModelOutput`].
        Examples:
...
@@ -469,7 +469,7 @@ class TFGenerationMixin:
            output_scores (`bool`, *optional*, defaults to `False`):
                Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
            return_dict_in_generate (`bool`, *optional*, defaults to `False`):
-                Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
+                Whether or not to return a [`~utils.ModelOutput`] instead of a plain tuple.
            forced_bos_token_id (`int`, *optional*):
                The id of the token to force as the first generated token after the `decoder_start_token_id`. Useful
                for multilingual models like [mBART](../model_doc/mbart) where the first generated token needs to be
@@ -480,11 +480,11 @@ class TFGenerationMixin:
                Additional model specific kwargs will be forwarded to the `forward` function of the model.
        Return:
-            [`~file_utils.ModelOutput`] or `tf.Tensor`: A [`~file_utils.ModelOutput`] (if
-            `return_dict_in_generate=True` or when `config.return_dict_in_generate=True`) or a `tf.Tensor`.
+            [`~utils.ModelOutput`] or `tf.Tensor`: A [`~utils.ModelOutput`] (if `return_dict_in_generate=True` or when
+            `config.return_dict_in_generate=True`) or a `tf.Tensor`.
            If the model is *not* an encoder-decoder model (`model.config.is_encoder_decoder=False`), the possible
-            [`~file_utils.ModelOutput`] types are:
+            [`~utils.ModelOutput`] types are:
            - [`~generation_tf_utils.TFGreedySearchDecoderOnlyOutput`],
            - [`~generation_tf_utils.TFSampleDecoderOnlyOutput`],
@@ -492,7 +492,7 @@ class TFGenerationMixin:
            - [`~generation_tf_utils.TFBeamSampleDecoderOnlyOutput`]
            If the model is an encoder-decoder model (`model.config.is_encoder_decoder=True`), the possible
-            [`~file_utils.ModelOutput`] types are:
+            [`~utils.ModelOutput`] types are:
            - [`~generation_tf_utils.TFGreedySearchEncoderDecoderOutput`],
            - [`~generation_tf_utils.TFSampleEncoderDecoderOutput`],
@@ -1370,7 +1370,7 @@ class TFGenerationMixin:
            output_scores (`bool`, *optional*, defaults to `False`):
                Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
            return_dict_in_generate (`bool`, *optional*, defaults to `False`):
-                Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
+                Whether or not to return a [`~utils.ModelOutput`] instead of a plain tuple.
            forced_bos_token_id (`int`, *optional*):
                The id of the token to force as the first generated token after the `decoder_start_token_id`. Useful
                for multilingual models like [mBART](../model_doc/mbart) where the first generated token needs to be
@@ -1381,11 +1381,11 @@ class TFGenerationMixin:
                Additional model specific kwargs will be forwarded to the `forward` function of the model.
        Return:
-            [`~file_utils.ModelOutput`] or `tf.Tensor`: A [`~file_utils.ModelOutput`] (if
-            `return_dict_in_generate=True` or when `config.return_dict_in_generate=True`) or a `tf.Tensor`.
+            [`~utils.ModelOutput`] or `tf.Tensor`: A [`~utils.ModelOutput`] (if `return_dict_in_generate=True` or when
+            `config.return_dict_in_generate=True`) or a `tf.Tensor`.
            If the model is *not* an encoder-decoder model (`model.config.is_encoder_decoder=False`), the possible
-            [`~file_utils.ModelOutput`] types are:
+            [`~utils.ModelOutput`] types are:
            - [`~generation_tf_utils.TFGreedySearchDecoderOnlyOutput`],
            - [`~generation_tf_utils.TFSampleDecoderOnlyOutput`],
@@ -1393,7 +1393,7 @@ class TFGenerationMixin:
            - [`~generation_tf_utils.TFBeamSampleDecoderOnlyOutput`]
            If the model is an encoder-decoder model (`model.config.is_encoder_decoder=True`), the possible
-            [`~file_utils.ModelOutput`] types are:
+            [`~utils.ModelOutput`] types are:
            - [`~generation_tf_utils.TFGreedySearchEncoderDecoderOutput`],
            - [`~generation_tf_utils.TFSampleEncoderDecoderOutput`],
@@ -1822,7 +1822,7 @@ class TFGenerationMixin:
            output_scores (`bool`, *optional*, defaults to `False`):
                Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
            return_dict_in_generate (`bool`, *optional*, defaults to `False`):
-                Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
+                Whether or not to return a [`~utils.ModelOutput`] instead of a plain tuple.
            model_kwargs:
                Additional model specific keyword arguments will be forwarded to the `call` function of the model. If
                model is an encoder-decoder model the kwargs should include `encoder_outputs`.
@@ -2085,7 +2085,7 @@ class TFGenerationMixin:
            output_scores (`bool`, *optional*, defaults to `False`):
                Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
            return_dict_in_generate (`bool`, *optional*, defaults to `False`):
-                Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
+                Whether or not to return a [`~utils.ModelOutput`] instead of a plain tuple.
            model_kwargs:
                Additional model specific kwargs will be forwarded to the `call` function of the model. If model is an
                encoder-decoder model the kwargs should include `encoder_outputs`.
...
@@ -1003,7 +1003,7 @@ class GenerationMixin:
            output_scores (`bool`, *optional*, defaults to `False`):
                Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
            return_dict_in_generate (`bool`, *optional*, defaults to `False`):
-                Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
+                Whether or not to return a [`~utils.ModelOutput`] instead of a plain tuple.
            forced_bos_token_id (`int`, *optional*):
                The id of the token to force as the first generated token after the `decoder_start_token_id`. Useful
                for multilingual models like [mBART](../model_doc/mbart) where the first generated token needs to be
@@ -1026,11 +1026,11 @@ class GenerationMixin:
    should be prefixed with *decoder_*.
Return:
-    [`~file_utils.ModelOutput`] or `torch.LongTensor`: A [`~file_utils.ModelOutput`] (if
-    `return_dict_in_generate=True` or when `config.return_dict_in_generate=True`) or a `torch.FloatTensor`.
+    [`~utils.ModelOutput`] or `torch.LongTensor`: A [`~utils.ModelOutput`] (if `return_dict_in_generate=True`
+    or when `config.return_dict_in_generate=True`) or a `torch.FloatTensor`.
    If the model is *not* an encoder-decoder model (`model.config.is_encoder_decoder=False`), the possible
-    [`~file_utils.ModelOutput`] types are:
+    [`~utils.ModelOutput`] types are:
    - [`~generation_utils.GreedySearchDecoderOnlyOutput`],
    - [`~generation_utils.SampleDecoderOnlyOutput`],
@@ -1038,7 +1038,7 @@ class GenerationMixin:
    - [`~generation_utils.BeamSampleDecoderOnlyOutput`]
    If the model is an encoder-decoder model (`model.config.is_encoder_decoder=True`), the possible
-    [`~file_utils.ModelOutput`] types are:
+    [`~utils.ModelOutput`] types are:
    - [`~generation_utils.GreedySearchEncoderDecoderOutput`],
    - [`~generation_utils.SampleEncoderDecoderOutput`],
@@ -1531,7 +1531,7 @@ class GenerationMixin:
output_scores (`bool`, *optional*, defaults to `False`):
    Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
return_dict_in_generate (`bool`, *optional*, defaults to `False`):
-    Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
+    Whether or not to return a [`~utils.ModelOutput`] instead of a plain tuple.
synced_gpus (`bool`, *optional*, defaults to `False`):
    Whether to continue running the while loop until max_length (needed for ZeRO stage 3)
model_kwargs:
@@ -1767,7 +1767,7 @@ class GenerationMixin:
output_scores (`bool`, *optional*, defaults to `False`):
    Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
return_dict_in_generate (`bool`, *optional*, defaults to `False`):
-    Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
+    Whether or not to return a [`~utils.ModelOutput`] instead of a plain tuple.
synced_gpus (`bool`, *optional*, defaults to `False`):
    Whether to continue running the while loop until max_length (needed for ZeRO stage 3)
model_kwargs:
@@ -2022,7 +2022,7 @@ class GenerationMixin:
output_scores (`bool`, *optional*, defaults to `False`):
    Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
return_dict_in_generate (`bool`, *optional*, defaults to `False`):
-    Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
+    Whether or not to return a [`~utils.ModelOutput`] instead of a plain tuple.
synced_gpus (`bool`, *optional*, defaults to `False`):
    Whether to continue running the while loop until max_length (needed for ZeRO stage 3)
model_kwargs:
@@ -2339,7 +2339,7 @@ class GenerationMixin:
output_scores (`bool`, *optional*, defaults to `False`):
    Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
return_dict_in_generate (`bool`, *optional*, defaults to `False`):
-    Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
+    Whether or not to return a [`~utils.ModelOutput`] instead of a plain tuple.
synced_gpus (`bool`, *optional*, defaults to `False`):
    Whether to continue running the while loop until max_length (needed for ZeRO stage 3)
model_kwargs:
@@ -2656,7 +2656,7 @@ class GenerationMixin:
output_scores (`bool`, *optional*, defaults to `False`):
    Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
return_dict_in_generate (`bool`, *optional*, defaults to `False`):
-    Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
+    Whether or not to return a [`~utils.ModelOutput`] instead of a plain tuple.
synced_gpus (`bool`, *optional*, defaults to `False`):
    Whether to continue running the while loop until max_length (needed for ZeRO stage 3)
@@ -3026,7 +3026,7 @@ class GenerationMixin:
output_scores (`bool`, *optional*, defaults to `False`):
    Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
return_dict_in_generate (`bool`, *optional*, defaults to `False`):
-    Whether or not to return a [`~file_utils.ModelOutput`] instead of a plain tuple.
+    Whether or not to return a [`~utils.ModelOutput`] instead of a plain tuple.
synced_gpus (`bool`, *optional*, defaults to `False`):
    Whether to continue running the while loop until max_length (needed for ZeRO stage 3)
model_kwargs:
...
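All of the hunks above rename the same doc link: the object that `generate()` returns when `return_dict_in_generate=True` is a `ModelOutput` subclass whose fields can be read by attribute name, by string key, or by position, instead of a bare tensor or plain tuple. As a rough, simplified illustration of that access contract (this is a toy stand-in, not the real `transformers` class), a minimal sketch might look like:

```python
from collections import OrderedDict


class SimpleModelOutput(OrderedDict):
    """Toy stand-in for transformers' ModelOutput: fields are readable by
    attribute, by string key, or by integer index, and the whole object can
    be converted back to a plain tuple."""

    def __getattr__(self, name):
        # Only invoked when normal attribute lookup fails, so it does not
        # shadow real attributes or methods.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __getitem__(self, key):
        if isinstance(key, int):
            # Tuple-style positional access falls back to insertion order.
            return list(self.values())[key]
        return super().__getitem__(key)

    def to_tuple(self):
        return tuple(self.values())


# Field names mirror GreedySearchDecoderOnlyOutput (`sequences`, `scores`);
# the values here are dummy placeholders, not real tensors.
out = SimpleModelOutput(sequences=[[0, 5, 7]], scores=[0.1, 0.2])
```

With the real library the analogous object comes from a call such as `model.generate(input_ids, return_dict_in_generate=True, output_scores=True)`; with the flag left at its default `False`, only the generated token ids are returned.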