Unverified commit 969859d5, authored by Santiago Castro, committed by GitHub

Fix doc errors and typos across the board (#8139)

* Fix doc errors and typos across the board

* Fix a typo

* Fix the CI

* Fix more typos

* Fix CI

* More fixes

* Fix CI

* More fixes

* More fixes
parent 4731a00c
@@ -57,7 +57,7 @@ class LxmertConfig(PretrainedConfig):
             The non-linear activation function (function or string) in the encoder and pooler. If string,
             :obj:`"gelu"`, :obj:`"relu"`, :obj:`"swish"` and :obj:`"gelu_new"` are supported.
         hidden_dropout_prob (:obj:`float`, `optional`, defaults to 0.1):
-            The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+            The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
         attention_probs_dropout_prob (:obj:`float`, `optional`, defaults to 0.1):
             The dropout ratio for the attention probabilities.
         max_position_embeddings (:obj:`int`, `optional`, defaults to 512):
@@ -95,10 +95,9 @@ class LxmertConfig(PretrainedConfig):
             Whether or not to add masked language modeling (as used in pretraining models such as BERT) to the loss
             objective.
         task_obj_predict (:obj:`bool`, `optional`, defaults to :obj:`True`):
-            Whether or not to add object predicition, attribute predicition and feature regression to the loss
-            objective.
+            Whether or not to add object prediction, attribute prediction and feature regression to the loss objective.
         task_qa (:obj:`bool`, `optional`, defaults to :obj:`True`):
-            Whether or not to add the question-asnwering loss to the objective
+            Whether or not to add the question-answering loss to the objective
         visual_obj_loss (:obj:`bool`, `optional`, defaults to :obj:`True`):
             Whether or not to calculate the object-prediction loss objective
         visual_attr_loss (:obj:`bool`, `optional`, defaults to :obj:`True`):
@@ -106,10 +105,10 @@ class LxmertConfig(PretrainedConfig):
         visual_feat_loss (:obj:`bool`, `optional`, defaults to :obj:`True`):
             Whether or not to calculate the feature-regression loss objective
         output_attentions (:obj:`bool`, `optional`, defaults to :obj:`False`):
-            Whether or not the model should return the attentions from the vision, langauge, and cross-modality layers
+            Whether or not the model should return the attentions from the vision, language, and cross-modality layers
             should be returned.
         output_hidden_states (:obj:`bool`, `optional`, defaults to :obj:`False`):
-            Whether or not the model should return the hidden states from the vision, langauge, and cross-modality
+            Whether or not the model should return the hidden states from the vision, language, and cross-modality
             layers should be returned.
     """
...
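As a quick sanity check on the parameters this hunk documents, here is a minimal sketch of overriding them when building a config; all names come from the docstring above, the values are arbitrary:

```python
from transformers import LxmertConfig

config = LxmertConfig(
    hidden_dropout_prob=0.1,           # dropout for fully connected layers
    attention_probs_dropout_prob=0.1,  # dropout on attention probabilities
    task_obj_predict=True,             # object/attribute prediction + feature regression losses
    task_qa=True,                      # question-answering loss
    visual_obj_loss=True,              # object-prediction loss objective
)
```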
@@ -52,7 +52,7 @@ class MarianConfig(BartConfig):
             The non-linear activation function (function or string) in the encoder and pooler. If string,
             :obj:`"gelu"`, :obj:`"relu"`, :obj:`"swish"` and :obj:`"gelu_new"` are supported.
         dropout (:obj:`float`, `optional`, defaults to 0.1):
-            The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+            The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
         attention_dropout (:obj:`float`, `optional`, defaults to 0.0):
             The dropout ratio for the attention probabilities.
         activation_dropout (:obj:`float`, `optional`, defaults to 0.0):
...
@@ -57,7 +57,7 @@ class MBartConfig(BartConfig):
             The non-linear activation function (function or string) in the encoder and pooler. If string,
             :obj:`"gelu"`, :obj:`"relu"`, :obj:`"swish"` and :obj:`"gelu_new"` are supported.
         dropout (:obj:`float`, `optional`, defaults to 0.1):
-            The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+            The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
         attention_dropout (:obj:`float`, `optional`, defaults to 0.0):
             The dropout ratio for the attention probabilities.
         activation_dropout (:obj:`float`, `optional`, defaults to 0.0):
...
@@ -96,7 +96,7 @@ class PegasusConfig(BartConfig):
             The non-linear activation function (function or string) in the encoder and pooler. If string,
             :obj:`"gelu"`, :obj:`"relu"`, :obj:`"swish"` and :obj:`"gelu_new"` are supported.
         dropout (:obj:`float`, `optional`, defaults to 0.1):
-            The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+            The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
         attention_dropout (:obj:`float`, `optional`, defaults to 0.0):
             The dropout ratio for the attention probabilities.
         activation_dropout (:obj:`float`, `optional`, defaults to 0.0):
...
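The same three dropout parameters recur in the Marian, MBart, and Pegasus hunks above because all three configs inherit from `BartConfig`. A hedged sketch of setting them (class and parameter names are taken from the diff; the values are arbitrary):

```python
from transformers import PegasusConfig

config = PegasusConfig(
    dropout=0.1,             # fully connected layers in the embeddings, encoder, and pooler
    attention_dropout=0.0,   # attention probabilities
    activation_dropout=0.0,  # dropout applied inside the feed-forward layers
)
```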
@@ -60,7 +60,7 @@ class ProphetNetConfig(PretrainedConfig):
         attention_dropout (:obj:`float`, `optional`, defaults to 0.1):
             The dropout ratio for the attention probabilities.
         dropout (:obj:`float`, `optional`, defaults to 0.1):
-            The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+            The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
         max_position_embeddings (:obj:`int`, `optional`, defaults to 512):
             The maximum sequence length that this model might ever be used with. Typically set this to something large
             just in case (e.g., 512 or 1024 or 2048).
...
@@ -30,7 +30,7 @@ RAG_CONFIG_DOC = r"""
             Separator inserted between the title and the text of the retrieved document when calling
             :class:`~transformers.RagRetriever`.
         doc_sep (:obj:`str`, `optional`, defaults to ``" // "``):
-            Separator inserted between the the text of the retrieved document and the original input when calliang
+            Separator inserted between the the text of the retrieved document and the original input when calling
             :class:`~transformers.RagRetriever`.
         n_docs (:obj:`int`, `optional`, defaults to 5):
             Number of documents to retrieve.
@@ -39,7 +39,7 @@ RAG_CONFIG_DOC = r"""
         retrieval_vector_size (:obj:`int`, `optional`, defaults to 768):
             Dimensionality of the document embeddings indexed by :class:`~transformers.RagRetriever`.
         retrieval_batch_size (:obj:`int`, `optional`, defaults to 8):
-            Retrieval batch size, defined as the number of queries issues concurrently to the faiss index excapsulated
+            Retrieval batch size, defined as the number of queries issues concurrently to the faiss index encapsulated
             :class:`~transformers.RagRetriever`.
         dataset (:obj:`str`, `optional`, defaults to :obj:`"wiki_dpr"`):
             A dataset identifier of the indexed dataset in HuggingFace Datasets (list all available datasets and ids
...
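These retrieval knobs all live on `RagConfig` and are consumed by `RagRetriever`; a minimal sketch of overriding them (the `facebook/rag-token-nq` checkpoint name is an assumption, not something this diff states):

```python
from transformers import RagConfig

config = RagConfig.from_pretrained(
    "facebook/rag-token-nq",  # assumed public checkpoint; swap in your own
    n_docs=5,                 # number of documents to retrieve
    retrieval_batch_size=8,   # concurrent queries against the faiss index
    doc_sep=" // ",           # separator between retrieved document text and the input
)
```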
@@ -82,7 +82,7 @@ class ReformerConfig(PretrainedConfig):
             The non-linear activation function (function or string) in the feed forward layer in the residual attention
             block. If string, :obj:`"gelu"`, :obj:`"relu"`, :obj:`"swish"` and :obj:`"gelu_new"` are supported.
         hidden_dropout_prob (:obj:`float`, `optional`, defaults to 0.05):
-            The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+            The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
         hidden_size (:obj:`int`, `optional`, defaults to 256):
             Dimensionality of the output hidden states of the residual attention blocks.
         initializer_range (:obj:`float`, `optional`, defaults to 0.02):
...
@@ -20,7 +20,7 @@ from .utils import logging
 logger = logging.get_logger(__name__)

-# TODO: uploadto AWS
+# TODO: upload to AWS
 RETRIBERT_PRETRAINED_CONFIG_ARCHIVE_MAP = {
     "retribert-base-uncased": "https://s3.amazonaws.com/models.huggingface.co/bert/distilbert-base-uncased-config.json",
 }
@@ -51,7 +51,7 @@ class RetriBertConfig(PretrainedConfig):
             The non-linear activation function (function or string) in the encoder and pooler. If string,
             :obj:`"gelu"`, :obj:`"relu"`, :obj:`"swish"` and :obj:`"gelu_new"` are supported.
         hidden_dropout_prob (:obj:`float`, `optional`, defaults to 0.1):
-            The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+            The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
         attention_probs_dropout_prob (:obj:`float`, `optional`, defaults to 0.1):
             The dropout ratio for the attention probabilities.
         max_position_embeddings (:obj:`int`, `optional`, defaults to 512):
...
@@ -52,7 +52,7 @@ class SqueezeBertConfig(PretrainedConfig):
             The non-linear activation function (function or string) in the encoder and pooler. If string,
             :obj:`"gelu"`, :obj:`"relu"`, :obj:`"swish"` and :obj:`"gelu_new"` are supported.
         hidden_dropout_prob (:obj:`float`, `optional`, defaults to 0.1):
-            The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+            The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
         attention_probs_dropout_prob (:obj:`float`, `optional`, defaults to 0.1):
             The dropout ratio for the attention probabilities.
         max_position_embeddings (:obj:`int`, `optional`, defaults to 512):
...
@@ -77,7 +77,7 @@ class TransfoXLConfig(PretrainedConfig):
         adaptive (:obj:`boolean`, `optional`, defaults to :obj:`True`):
             Whether or not to use adaptive softmax.
         dropout (:obj:`float`, `optional`, defaults to 0.1):
-            The dropout probabilitiy for all fully connected layers in the embeddings, encoder, and pooler.
+            The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
         dropatt (:obj:`float`, `optional`, defaults to 0):
             The dropout ratio for the attention probabilities.
         untie_r (:obj:`boolean`, `optional`, defaults to :obj:`True`):
...
@@ -83,7 +83,7 @@ def generate_identified_filename(filename: Path, identifier: str) -> Path:
         filename: pathlib.Path The actual path object we would like to add an identifier suffix
         identifier: The suffix to add
-    Returns: String with concatenated indentifier at the end of the filename
+    Returns: String with concatenated identifier at the end of the filename
     """
     return filename.parent.joinpath(filename.stem + identifier).with_suffix(filename.suffix)
...
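The one-line body shown in this hunk makes the behavior easy to verify; given the code above, the following self-contained check should hold:

```python
from pathlib import Path

def generate_identified_filename(filename: Path, identifier: str) -> Path:
    # Copied from the hunk above: insert the identifier before the file suffix.
    return filename.parent.joinpath(filename.stem + identifier).with_suffix(filename.suffix)

assert generate_identified_filename(Path("model.onnx"), "-optimized") == Path("model-optimized.onnx")
```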
@@ -30,7 +30,7 @@ class LightningModel(pl.LightningModule):
         self.num_labels = 2
         self.qa_outputs = torch.nn.Linear(self.model.config.hidden_size, self.num_labels)

-    # implement only because lighning requires to do so
+    # implement only because lightning requires to do so
     def forward(self):
         pass
@@ -57,7 +57,7 @@ def convert_longformer_qa_checkpoint_to_pytorch(
     # save model
     longformer_for_qa.save_pretrained(pytorch_dump_folder_path)

-    print("Conversion succesful. Model saved under {}".format(pytorch_dump_folder_path))
+    print("Conversion successful. Model saved under {}".format(pytorch_dump_folder_path))

 if __name__ == "__main__":
@@ -75,7 +75,7 @@ if __name__ == "__main__":
         default=None,
         type=str,
         required=True,
-        help="Path the official PyTorch Lighning Checkpoint.",
+        help="Path the official PyTorch Lightning Checkpoint.",
     )
     parser.add_argument(
         "--pytorch_dump_folder_path", default=None, type=str, required=True, help="Path to the output PyTorch model."
...
@@ -34,7 +34,7 @@ class TatoebaConverter:
     1. convert numpy state dict to hf format (same code as OPUS-MT-Train conversion).
     2. rename opus model to huggingface format. This means replace each alpha3 code with an alpha2 code if a unique
-       one existes. e.g. aav-eng -> aav-en, heb-eng -> he-en
+       one exists. e.g. aav-eng -> aav-en, heb-eng -> he-en
     3. write a model card containing the original Tatoeba-Challenge/README.md and extra info about alpha3 group
        members.
     """
...
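Step 2's renaming rule is simple enough to sketch. The helper and its tiny code table below are illustrative only (the real converter consults a full ISO 639 mapping), but they reproduce the two examples in the docstring:

```python
# Hypothetical mini-table: "aav" has no unique alpha2 code, so it stays alpha3.
ALPHA3_TO_ALPHA2 = {"eng": "en", "heb": "he"}

def rename_pair(pair: str) -> str:
    # Replace each alpha3 code with its alpha2 code when one uniquely exists.
    src, tgt = pair.split("-")
    return f"{ALPHA3_TO_ALPHA2.get(src, src)}-{ALPHA3_TO_ALPHA2.get(tgt, tgt)}"

assert rename_pair("aav-eng") == "aav-en"
assert rename_pair("heb-eng") == "he-en"
```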
@@ -123,7 +123,7 @@ if __name__ == "__main__":
     parser.add_argument(
         "--force_download",
         action="store_true",
-        help="Re-dowload checkpoints.",
+        help="Re-download checkpoints.",
     )
     args = parser.parse_args()
...
@@ -104,7 +104,7 @@ if __name__ == "__main__":
         "--finetuning_task",
         default=None,
         type=str,
-        help="Name of a task on which the XLNet TensorFloaw model was fine-tuned",
+        help="Name of a task on which the XLNet TensorFlow model was fine-tuned",
     )
     args = parser.parse_args()
     print(args)
...
@@ -330,7 +330,7 @@ class DataCollatorForSOP(DataCollatorForLanguageModeling):
         input_ids, labels, attention_mask = self.mask_tokens(input_ids)
         token_type_ids = [example["token_type_ids"] for example in examples]
-        # size of segment_ids varied because randomness, padding zero to the end as the orignal implementation
+        # size of segment_ids varied because randomness, padding zero to the end as the original implementation
         token_type_ids = pad_sequence(token_type_ids, batch_first=True, padding_value=self.tokenizer.pad_token_id)
         sop_label_list = [example["sentence_order_label"] for example in examples]
...
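The `pad_sequence` call in this hunk right-pads variable-length tensors into a single batch; a self-contained sketch of that behavior, with pad value 0 standing in for `tokenizer.pad_token_id`:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Segment-id tensors of different lengths, as the randomness comment describes.
seqs = [torch.tensor([0, 0, 1]), torch.tensor([0, 1])]
batch = pad_sequence(seqs, batch_first=True, padding_value=0)
print(batch)  # tensor([[0, 0, 1], [0, 1, 0]]) -- the shorter row is padded at the end
```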
@@ -71,7 +71,7 @@ class TextDataset(Dataset):
                 tokenizer.build_inputs_with_special_tokens(tokenized_text[i : i + block_size])
             )
         # Note that we are losing the last truncated example here for the sake of simplicity (no padding)
-        # If your dataset is small, first you should loook for a bigger one :-) and second you
+        # If your dataset is small, first you should look for a bigger one :-) and second you
         # can change this behavior by adding (model specific) padding.

         start = time.time()
...
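The dropped-tail behavior the comment describes is easy to see in isolation; a sketch with a stand-in token list (no tokenizer needed):

```python
block_size = 4
tokenized_text = list(range(10))  # stand-in for token ids

# Mirror the loop above: fixed-size blocks, final partial block discarded.
blocks = [
    tokenized_text[i : i + block_size]
    for i in range(0, len(tokenized_text) - block_size + 1, block_size)
]
print(blocks)  # [[0, 1, 2, 3], [4, 5, 6, 7]] -- tokens 8 and 9 are lost
```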
@@ -327,7 +327,7 @@ def squad_convert_examples_to_features(
         padding_strategy: Default to "max_length". Which padding strategy to use
         return_dataset: Default False. Either 'pt' or 'tf'.
             if 'pt': returns a torch.data.TensorDataset, if 'tf': returns a tf.data.Dataset
-        threads: multiple processing threadsa-smi
+        threads: multiple processing threads.

     Returns:
@@ -527,7 +527,7 @@ def squad_convert_examples_to_features(
 class SquadProcessor(DataProcessor):
     """
-    Processor for the SQuAD data set. Overriden by SquadV1Processor and SquadV2Processor, used by the version 1.1 and
+    Processor for the SQuAD data set. overridden by SquadV1Processor and SquadV2Processor, used by the version 1.1 and
     version 2.0 of SQuAD, respectively.
     """
...
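For context, a hedged sketch of how these two pieces fit together in the library this diff touches (the data directory is an assumption):

```python
from transformers import AutoTokenizer
from transformers.data.processors.squad import SquadV2Processor, squad_convert_examples_to_features

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
examples = SquadV2Processor().get_dev_examples("squad_data")  # assumed data dir

features, dataset = squad_convert_examples_to_features(
    examples=examples,
    tokenizer=tokenizer,
    max_seq_length=384,
    doc_stride=128,
    max_query_length=64,
    is_training=False,
    return_dataset="pt",  # torch TensorDataset, per the docstring above
    threads=4,            # multiple processing threads
)
```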
@@ -245,9 +245,6 @@ class SingleSentenceClassificationProcessor(DataProcessor):
         Args:
             tokenizer: Instance of a tokenizer that will tokenize the examples
             max_length: Maximum example length
-            task: GLUE task
-            label_list: List of labels. Can be obtained from the processor using the ``processor.get_labels()`` method
-            output_mode: String indicating the output mode. Either ``regression`` or ``classification``
             pad_on_left: If set to ``True``, the examples will be padded on the left rather than on the right (default)
             pad_token: Padding token
             mask_padding_with_zero: If set to ``True``, the attention mask will be filled by ``1`` for actual values
...
@@ -89,7 +89,7 @@ try:
     # Check we're not importing a "datasets" directory somewhere
     _datasets_available = hasattr(datasets, "__version__") and hasattr(datasets, "load_dataset")
     if _datasets_available:
-        logger.debug(f"Succesfully imported datasets version {datasets.__version__}")
+        logger.debug(f"Successfully imported datasets version {datasets.__version__}")
     else:
         logger.debug("Imported a datasets object but this doesn't seem to be the 🤗 datasets library.")
@@ -147,7 +147,7 @@ try:
     import faiss  # noqa: F401

     _faiss_available = True
-    logger.debug(f"Succesfully imported faiss version {faiss.__version__}")
+    logger.debug(f"Successfully imported faiss version {faiss.__version__}")
 except ImportError:
     _faiss_available = False
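The try/except blocks in these hunks are the file's standard idiom for probing optional dependencies; a generic, self-contained sketch of the same pattern (the module name is a placeholder):

```python
import logging

logger = logging.getLogger(__name__)

try:
    import some_optional_lib  # placeholder for faiss, datasets, etc.

    _lib_available = True
    logger.debug(f"Successfully imported some_optional_lib version {some_optional_lib.__version__}")
except ImportError:
    _lib_available = False

def is_lib_available() -> bool:
    # Callers check this flag instead of importing the library themselves.
    return _lib_available
```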
@@ -290,7 +290,7 @@ def torch_only_method(fn):
 # docstyle-ignore
 DATASETS_IMPORT_ERROR = """
-{0} requires the 🤗 Datasets library but it was not found in your enviromnent. You can install it with:
+{0} requires the 🤗 Datasets library but it was not found in your environment. You can install it with:
 ```
 pip install datasets
 ```
@@ -308,7 +308,7 @@ that python file if that's the case.
 # docstyle-ignore
 TOKENIZERS_IMPORT_ERROR = """
-{0} requires the 🤗 Tokenizers library but it was not found in your enviromnent. You can install it with:
+{0} requires the 🤗 Tokenizers library but it was not found in your environment. You can install it with:
 ```
 pip install tokenizers
 ```
@@ -321,30 +321,30 @@ In a notebook or a colab, you can install it by executing a cell with
 # docstyle-ignore
 SENTENCEPIECE_IMPORT_ERROR = """
-{0} requires the SentencePiece library but it was not found in your enviromnent. Checkout the instructions on the
+{0} requires the SentencePiece library but it was not found in your environment. Checkout the instructions on the
 installation page of its repo: https://github.com/google/sentencepiece#installation and follow the ones
-that match your enviromnent.
+that match your environment.
 """

 # docstyle-ignore
 FAISS_IMPORT_ERROR = """
-{0} requires the faiss library but it was not found in your enviromnent. Checkout the instructions on the
+{0} requires the faiss library but it was not found in your environment. Checkout the instructions on the
 installation page of its repo: https://github.com/facebookresearch/faiss/blob/master/INSTALL.md and follow the ones
-that match your enviromnent.
+that match your environment.
 """

 # docstyle-ignore
 PYTORCH_IMPORT_ERROR = """
-{0} requires the PyTorch library but it was not found in your enviromnent. Checkout the instructions on the
-installation page: https://pytorch.org/get-started/locally/ and follow the ones that match your enviromnent.
+{0} requires the PyTorch library but it was not found in your environment. Checkout the instructions on the
+installation page: https://pytorch.org/get-started/locally/ and follow the ones that match your environment.
 """

 # docstyle-ignore
 SKLEARN_IMPORT_ERROR = """
-{0} requires the scikit-learn library but it was not found in your enviromnent. You can install it with:
+{0} requires the scikit-learn library but it was not found in your environment. You can install it with:
 ```
 pip install -U scikit-learn
 ```
@@ -357,15 +357,15 @@ In a notebook or a colab, you can install it by executing a cell with
 # docstyle-ignore
 TENSORFLOW_IMPORT_ERROR = """
-{0} requires the TensorFlow library but it was not found in your enviromnent. Checkout the instructions on the
-installation page: https://www.tensorflow.org/install and follow the ones that match your enviromnent.
+{0} requires the TensorFlow library but it was not found in your environment. Checkout the instructions on the
+installation page: https://www.tensorflow.org/install and follow the ones that match your environment.
 """

 # docstyle-ignore
 FLAX_IMPORT_ERROR = """
-{0} requires the FLAX library but it was not found in your enviromnent. Checkout the instructions on the
-installation page: https://github.com/google/flax and follow the ones that match your enviromnent.
+{0} requires the FLAX library but it was not found in your environment. Checkout the instructions on the
+installation page: https://github.com/google/flax and follow the ones that match your environment.
 """
@@ -918,13 +918,13 @@ def cached_path(
     Args:
         cache_dir: specify a cache directory to save the file to (overwrite the default cache dir).
-        force_download: if True, re-dowload the file even if it's already cached in the cache dir.
-        resume_download: if True, resume the download if incompletly recieved file is found.
+        force_download: if True, re-download the file even if it's already cached in the cache dir.
+        resume_download: if True, resume the download if incompletely received file is found.
         user_agent: Optional string or dict that will be appended to the user-agent on remote requests.
         extract_compressed_file: if True and the path point to a zip or tar file, extract the compressed
             file in a folder along the archive.
         force_extract: if True when extract_compressed_file is True and the archive was already extracted,
-            re-extract the archive and overide the folder where it was extracted.
+            re-extract the archive and override the folder where it was extracted.

     Return:
         None in case of non-recoverable file (non-existent or inaccessible url + no cache on disk). Local path (string)
...
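A minimal usage sketch for the arguments documented above; the URL is a placeholder, and `cached_path` lived in `transformers.file_utils` in the version this commit targets:

```python
from transformers.file_utils import cached_path

# Download (or reuse the cached copy of) a remote file and return its local path.
local_path = cached_path(
    "https://example.com/some-config.json",  # placeholder URL
    force_download=False,  # reuse the cache when possible
    resume_download=True,  # pick up an incompletely received file
)
print(local_path)
```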