Commit 28e608a2 authored by Aymeric Augustin

Remove trailing whitespace from all Python files.

Fixes flake8 warning W291 (x224).
parent 1efa0a75
@@ -111,11 +111,11 @@ ROBERTA_START_DOCSTRING = r""" The RoBERTa model was proposed in
`RoBERTa: A Robustly Optimized BERT Pretraining Approach`_
by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer,
Veselin Stoyanov. It is based on Google's BERT model released in 2018.
It builds on BERT and modifies key hyperparameters, removing the next-sentence pretraining
objective and training with much larger mini-batches and learning rates.
-This implementation is the same as BertModel with a tiny embeddings tweak as well as a setup for Roberta pretrained
+This implementation is the same as BertModel with a tiny embeddings tweak as well as a setup for Roberta pretrained
models.
This model is a tf.keras.Model `tf.keras.Model`_ sub-class. Use it as a regular TF 2.0 Keras Model and
@@ -144,7 +144,7 @@ ROBERTA_START_DOCSTRING = r""" The RoBERTa model was proposed in
`model({'input_ids': input_ids, 'token_type_ids': token_type_ids})`
Parameters:
-config (:class:`~transformers.RobertaConfig`): Model configuration class with all the parameters of the
+config (:class:`~transformers.RobertaConfig`): Model configuration class with all the parameters of the
model. Initializing with a config file does not load the weights associated with the model, only the configuration.
Check out the :meth:`~transformers.PreTrainedModel.from_pretrained` method to load the model weights.
"""
@@ -163,7 +163,7 @@ ROBERTA_INPUTS_DOCSTRING = r"""
``tokens: <s> the dog is hairy . </s>``
-Fully encoded sequences or sequence pairs can be obtained using the RobertaTokenizer.encode function with
+Fully encoded sequences or sequence pairs can be obtained using the RobertaTokenizer.encode function with
the ``add_special_tokens`` parameter set to ``True``.
RoBERTa is a model with absolute position embeddings so it's usually advised to pad the inputs on
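
A short sketch of the encoding call mentioned in this docstring (the checkpoint name is an assumption; any RoBERTa checkpoint works):

```python
from transformers import RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

# add_special_tokens=True wraps the sequence in <s> ... </s> as described above.
ids = tokenizer.encode("the dog is hairy.", add_special_tokens=True)
assert ids[0] == tokenizer.bos_token_id and ids[-1] == tokenizer.eos_token_id
```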
@@ -351,7 +351,7 @@ class TFRobertaClassificationHead(tf.keras.layers.Layer):
@add_start_docstrings(
"""RoBERTa Model transformer with a sequence classification/regression head on top (a linear layer
"""RoBERTa Model transformer with a sequence classification/regression head on top (a linear layer
on top of the pooled output) e.g. for GLUE tasks. """,
ROBERTA_START_DOCSTRING,
ROBERTA_INPUTS_DOCSTRING,
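
A hedged sketch of using this sequence-classification head with `roberta-base` weights (the classification layer itself is randomly initialised, so the logits are untrained):

```python
import tensorflow as tf
from transformers import RobertaTokenizer, TFRobertaForSequenceClassification

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = TFRobertaForSequenceClassification.from_pretrained("roberta-base")

input_ids = tf.constant([tokenizer.encode("This movie was great!", add_special_tokens=True)])
logits = model(input_ids)[0]  # (batch_size, config.num_labels)
```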
@@ -565,7 +565,7 @@ T5_START_DOCSTRING = r""" The T5 model was proposed in
`model({'input_ids': input_ids, 'token_type_ids': token_type_ids})`
Parameters:
-config (:class:`~transformers.T5Config`): Model configuration class with all the parameters of the model.
+config (:class:`~transformers.T5Config`): Model configuration class with all the parameters of the model.
Initializing with a config file does not load the weights associated with the model, only the configuration.
Check out the :meth:`~transformers.PreTrainedModel.from_pretrained` method to load the model weights.
"""
@@ -139,7 +139,7 @@ class TFPreTrainedModel(tf.keras.Model):
Arguments:
new_num_tokens: (`optional`) int:
-New number of tokens in the embedding matrix. Increasing the size will add newly initialized vectors at the end. Reducing the size will remove vectors from the end.
+New number of tokens in the embedding matrix. Increasing the size will add newly initialized vectors at the end. Reducing the size will remove vectors from the end.
If not provided or None: does nothing and just returns a pointer to the input tokens ``tf.Variable`` Module of the model.
Return: ``tf.Variable``
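
The PyTorch base class documents the same `new_num_tokens` semantics; a minimal sketch of the resizing workflow using the PyTorch RoBERTa model (my choice for illustration, not the TF class edited here):

```python
from transformers import RobertaTokenizer, RobertaForMaskedLM

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")

# Grow the vocabulary, then resize the embedding matrix to match.
tokenizer.add_tokens(["<new_token>"])
embeddings = model.resize_token_embeddings(new_num_tokens=len(tokenizer))
print(embeddings.weight.shape)  # (len(tokenizer), hidden_size)

# Passing None (the default) changes nothing and returns the current embedding module.
same_embeddings = model.resize_token_embeddings(None)
```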
@@ -431,7 +431,7 @@ class TFSharedEmbeddings(tf.keras.layers.Layer):
linear tensor, float32 with shape [batch_size, length, vocab_size].
Raises:
ValueError: if mode is not valid.
Shared weights logic adapted from
https://github.com/tensorflow/models/blob/a009f4fb9d2fc4949e32192a944688925ef78659/official/transformer/v2/embedding_layer.py#L24
"""
@@ -825,7 +825,7 @@ class XLMForQuestionAnsweringSimple(XLMPreTrainedModel):
**cls_index**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size,)``:
Labels for position (index) of the classification token to use as input for computing plausibility of the answer.
**p_mask**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
-Optional mask of tokens which can't be in answers (e.g. [CLS], [PAD], ...)
+Optional mask of tokens which can't be in answers (e.g. [CLS], [PAD], ...)
Outputs: `Tuple` comprising various elements depending on the configuration (config) and inputs:
**loss**: (`optional`, returned when ``labels`` is provided) ``torch.FloatTensor`` of shape ``(1,)``:
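
A hedged sketch of building the `p_mask` described above, masking out special tokens so they cannot be selected as answer positions (the checkpoint name and the availability of `all_special_ids` in this version are assumptions; a full SQuAD pipeline would also mask the question tokens):

```python
import torch
from transformers import XLMTokenizer

tokenizer = XLMTokenizer.from_pretrained("xlm-mlm-en-2048")

question, context = "Who wrote the book?", "The book was written by Jane Doe."
ids = tokenizer.encode(question, context, add_special_tokens=True)

# 1.0 marks tokens that can't be part of the answer (special tokens here), 0.0 the rest.
special_ids = set(tokenizer.all_special_ids)
p_mask = torch.tensor([[1.0 if i in special_ids else 0.0 for i in ids]])
input_ids = torch.tensor([ids])
```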
@@ -942,7 +942,7 @@ class XLMForQuestionAnswering(XLMPreTrainedModel):
**cls_index**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size,)``:
Labels for position (index) of the classification token to use as input for computing plausibility of the answer.
**p_mask**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
Optional mask of tokens which can't be in answers (e.g. [CLS], [PAD], ...)
Optional mask of tokens which can't be in answers (e.g. [CLS], [PAD], ...)
Outputs: `Tuple` comprising various elements depending on the configuration (config) and inputs:
**loss**: (`optional`, returned when ``labels`` is provided) ``torch.FloatTensor`` of shape ``(1,)``:
@@ -45,7 +45,7 @@ XLM_ROBERTA_PRETRAINED_MODEL_ARCHIVE_MAP = {
XLM_ROBERTA_START_DOCSTRING = r""" The XLM-RoBERTa model was proposed in
`Unsupervised Cross-lingual Representation Learning at Scale`_
by Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer and Veselin Stoyanov. It is based on Facebook's RoBERTa model released in 2019.
It is a large multi-lingual language model, trained on 2.5TB of filtered CommonCrawl data.
This implementation is the same as RoBERTa.
@@ -60,7 +60,7 @@ XLM_ROBERTA_START_DOCSTRING = r""" The XLM-RoBERTa model was proposed in
https://pytorch.org/docs/stable/nn.html#module
Parameters:
-config (:class:`~transformers.XLMRobertaConfig`): Model configuration class with all the parameters of the
+config (:class:`~transformers.XLMRobertaConfig`): Model configuration class with all the parameters of the
model. Initializing with a config file does not load the weights associated with the model, only the configuration.
Check out the :meth:`~transformers.PreTrainedModel.from_pretrained` method to load the model weights.
"""
@@ -79,7 +79,7 @@ XLM_ROBERTA_INPUTS_DOCSTRING = r"""
``tokens: <s> the dog is hairy . </s>``
-Fully encoded sequences or sequence pairs can be obtained using the XLMRobertaTokenizer.encode function with
+Fully encoded sequences or sequence pairs can be obtained using the XLMRobertaTokenizer.encode function with
the ``add_special_tokens`` parameter set to ``True``.
XLM-RoBERTa is a model with absolute position embeddings so it's usually advised to pad the inputs on
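
A short sketch of encoding a sequence pair as referred to above; the checkpoint name is an assumption:

```python
from transformers import XLMRobertaTokenizer

tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")

# With add_special_tokens=True, <s> and </s> are inserted around each segment automatically.
ids = tokenizer.encode("the dog is hairy.", "it sheds a lot.", add_special_tokens=True)
print(tokenizer.convert_ids_to_tokens(ids))
```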
@@ -204,7 +204,7 @@ class XLMRobertaForMaskedLM(RobertaForMaskedLM):
@add_start_docstrings(
"""XLM-RoBERTa Model transformer with a sequence classification/regression head on top (a linear layer
"""XLM-RoBERTa Model transformer with a sequence classification/regression head on top (a linear layer
on top of the pooled output) e.g. for GLUE tasks. """,
XLM_ROBERTA_START_DOCSTRING,
XLM_ROBERTA_INPUTS_DOCSTRING,
@@ -868,7 +868,7 @@ class PreTrainedTokenizer(object):
padding index, up to their max length. If no max length is specified, the padding is done up to the model's max length.
The tokenizer padding sides are handled by the following strings:
- 'left': pads on the left of the sequences
-- 'right': pads on the right of the sequences
+- 'right': pads on the right of the sequences
Defaults to False: no padding.
return_tensors: (optional) can be set to 'tf' or 'pt' to return respectively TensorFlow tf.constant
or PyTorch torch.Tensor instead of a list of python integers.
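
A sketch of the padding and tensor-conversion options described above, using `encode_plus`; the BERT checkpoint and the exact argument set are assumptions tied to the library version around this commit:

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
print(tokenizer.padding_side)  # 'right' by default

encoded = tokenizer.encode_plus(
    "A short sentence.",
    max_length=16,
    pad_to_max_length=True,   # pad with the padding index up to max_length
    return_tensors="pt",      # 'pt' -> torch.Tensor, 'tf' -> tf.constant
)
print(encoded["input_ids"].shape)  # torch.Size([1, 16])
```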
@@ -1073,7 +1073,7 @@ class PreTrainedTokenizer(object):
padding index, up to their max length. If no max length is specified, the padding is done up to the model's max length.
The tokenizer padding sides are handled by the following strings:
- 'left': pads on the left of the sequences
-- 'right': pads on the right of the sequences
+- 'right': pads on the right of the sequences
Defaults to False: no padding.
return_tensors: (optional) can be set to 'tf' or 'pt' to return respectively TensorFlow tf.constant
or PyTorch torch.Tensor instead of a list of python integers.
@@ -2,7 +2,7 @@
Original source: https://gist.github.com/W4ngatang/60c2bdb54d156a41194446737ce03e2e
Note: for legal reasons, we are unable to host MRPC.
-You can either use the version hosted by the SentEval team, which is already tokenized,
+You can either use the version hosted by the SentEval team, which is already tokenized,
or you can download the original data from (https://download.microsoft.com/download/D/4/6/D46FF87A-F6B9-4252-AA8B-3604ED519838/MSRParaphraseCorpus.msi) and extract the data from it manually.
For Windows users, you can run the .msi file. For Mac and Linux users, consider an external library such as 'cabextract' (see below for an example).
You should then rename and place specific files in a folder (see below for an example).