Unverified Commit 33f6ef73 authored by Lysandre Debut, committed by GitHub

Fix DeBERTa docs (#8092)

* Fix DeBERTa docs

* Tokenizer and config
parent c42596bc
@@ -28,8 +28,14 @@ DEBERTA_PRETRAINED_CONFIG_ARCHIVE_MAP = {
 class DebertaConfig(PretrainedConfig):
     r"""
-    :class:`~transformers.DebertaConfig` is the configuration class to store the configuration of a
-    :class:`~transformers.DebertaModel`.
+    This is the configuration class to store the configuration of a :class:`~transformers.DebertaModel` or a
+    :class:`~transformers.TFDebertaModel`. It is used to instantiate a DeBERTa model according to the specified
+    arguments, defining the model architecture. Instantiating a configuration with the defaults will yield a similar
+    configuration to that of the DeBERTa `microsoft/deberta-base <https://huggingface.co/microsoft/deberta-base>`__
+    architecture.
+
+    Configuration objects inherit from :class:`~transformers.PretrainedConfig` and can be used to control the model
+    outputs. Read the documentation from :class:`~transformers.PretrainedConfig` for more information.

     Arguments:
         vocab_size (:obj:`int`, `optional`, defaults to 30522):
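The new docstring says that instantiating the config with its defaults yields a `microsoft/deberta-base`-like architecture, with keyword arguments overriding individual defaults. A minimal plain-Python sketch of that pattern (the class name and the `**kwargs`-to-attributes behavior here are illustrative stand-ins, not the real `DebertaConfig`):

```python
# Sketch of the PretrainedConfig pattern the docstring describes:
# defaults define a base-like architecture; keyword arguments override them.

class SketchConfig:
    """Hypothetical stand-in for DebertaConfig; attribute names mirror the docstring."""

    def __init__(self, vocab_size=30522, hidden_size=768, num_hidden_layers=12, **kwargs):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        self.num_hidden_layers = num_hidden_layers
        # Extra keyword arguments become attributes, mirroring how
        # PretrainedConfig accepts model-specific options.
        for key, value in kwargs.items():
            setattr(self, key, value)


# Defaults yield the base-like configuration; overrides change the architecture.
default_config = SketchConfig()
small_config = SketchConfig(num_hidden_layers=6)
```

With the real class, `DebertaConfig(num_hidden_layers=6)` behaves analogously: unspecified fields keep their documented defaults.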
@@ -797,13 +797,18 @@ DEBERTA_INPUTS_DOCSTRING = r"""
             `What are input IDs? <../glossary.html#input-ids>`__
         attention_mask (:obj:`torch.FloatTensor` of shape :obj:`{0}`, `optional`):
-            Mask to avoid performing attention on padding token indices. Mask values selected in ``[0, 1]``: ``1`` for
-            tokens that are NOT MASKED, ``0`` for MASKED tokens.
+            Mask to avoid performing attention on padding token indices. Mask values selected in ``[0, 1]``:
+
+            - 1 for tokens that are **not masked**,
+            - 0 for tokens that are **masked**.

             `What are attention masks? <../glossary.html#attention-mask>`__
         token_type_ids (:obj:`torch.LongTensor` of shape :obj:`{0}`, `optional`):
-            Segment token indices to indicate first and second portions of the inputs. Indices are selected in ``[0,
-            1]``: ``0`` corresponds to a `sentence A` token, ``1`` corresponds to a `sentence B` token
+            Segment token indices to indicate first and second portions of the inputs. Indices are selected in ``[0,
+            1]``:
+
+            - 0 corresponds to a `sentence A` token,
+            - 1 corresponds to a `sentence B` token.

             `What are token type IDs? <../glossary.html#token-type-ids>`_
         position_ids (:obj:`torch.LongTensor` of shape :obj:`{0}`, `optional`):
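The two conventions fixed in this hunk (1 = attend / 0 = padding for `attention_mask`; 0 = sentence A / 1 = sentence B for `token_type_ids`) can be illustrated with plain lists, no transformers dependency. All token ids below are made up for the example:

```python
# Illustration of the attention_mask and token_type_ids conventions from
# the docstring, for one padded sequence pair. Token ids are hypothetical.

pad_id = 0
sentence_a = [101, 2054, 2003]   # 3 hypothetical ids for sentence A
sentence_b = [2009, 6429, 102]   # 3 hypothetical ids for sentence B

max_length = 8
input_ids = sentence_a + sentence_b
padding = [pad_id] * (max_length - len(input_ids))
input_ids = input_ids + padding

# 1 for real tokens (not masked), 0 for padding (masked).
attention_mask = [1] * (len(sentence_a) + len(sentence_b)) + [0] * len(padding)

# 0 for sentence A positions, 1 for sentence B positions.
token_type_ids = [0] * len(sentence_a) + [1] * len(sentence_b) + [0] * len(padding)
```

In practice `tokenizer(..., padding=True, return_tensors="pt")` produces these tensors for you; the sketch only makes the 0/1 semantics concrete.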
@@ -816,14 +821,13 @@ DEBERTA_INPUTS_DOCSTRING = r"""
             This is useful if you want more control over how to convert `input_ids` indices into associated vectors
             than the model's internal embedding lookup matrix.
         output_attentions (:obj:`bool`, `optional`):
-            If set to ``True``, the attentions tensors of all attention layers are returned. See ``attentions`` under
-            returned tensors for more detail.
-        output_hidden_states (:obj:`bool`, `optional`):
-            If set to ``True``, the hidden states of all layers are returned. See ``hidden_states`` under returned
+            Whether or not to return the attentions tensors of all attention layers. See ``attentions`` under returned
             tensors for more detail.
+        output_hidden_states (:obj:`bool`, `optional`):
+            Whether or not to return the hidden states of all layers. See ``hidden_states`` under returned tensors for
+            more detail.
         return_dict (:obj:`bool`, `optional`):
-            If set to ``True``, the model will return a :class:`~transformers.file_utils.ModelOutput` instead of a
-            plain tuple.
+            Whether or not to return a :class:`~transformers.file_utils.ModelOutput` instead of a plain tuple.
     """
@@ -581,7 +581,7 @@ class DebertaTokenizer(PreTrainedTokenizer):
     def build_inputs_with_special_tokens(self, token_ids_0, token_ids_1=None):
         """
         Build model inputs from a sequence or a pair of sequence for sequence classification tasks by concatenating and
-        adding special tokens. A BERT sequence has the following format:
+        adding special tokens. A DeBERTa sequence has the following format:

         - single sequence: [CLS] X [SEP]
         - pair of sequences: [CLS] A [SEP] B [SEP]
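The two formats listed above can be sketched as a standalone function. The `[CLS]`/`[SEP]` ids here are hypothetical placeholders; the real method on `DebertaTokenizer` uses the tokenizer's own `cls_token_id` and `sep_token_id`:

```python
# Sketch of the sequence formats described in the docstring:
#   single sequence:   [CLS] X [SEP]
#   pair of sequences: [CLS] A [SEP] B [SEP]

CLS, SEP = 1, 2  # hypothetical special token ids


def build_inputs_with_special_tokens(token_ids_0, token_ids_1=None):
    if token_ids_1 is None:
        # single sequence: [CLS] X [SEP]
        return [CLS] + token_ids_0 + [SEP]
    # pair of sequences: [CLS] A [SEP] B [SEP]
    return [CLS] + token_ids_0 + [SEP] + token_ids_1 + [SEP]
```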
@@ -608,14 +608,15 @@ class DebertaTokenizer(PreTrainedTokenizer):
         special tokens using the tokenizer ``prepare_for_model`` or ``encode_plus`` methods.

         Args:
-            token_ids_0: list of ids (must not contain special tokens)
-            token_ids_1: Optional list of ids (must not contain special tokens), necessary when fetching sequence ids
-                for sequence pairs
-            already_has_special_tokens: (default False) Set to True if the token list is already formated with
-                special tokens for the model
+            token_ids_0 (:obj:`List[int]`):
+                List of IDs.
+            token_ids_1 (:obj:`List[int]`, `optional`):
+                Optional second list of IDs for sequence pairs.
+            already_has_special_tokens (:obj:`bool`, `optional`, defaults to :obj:`False`):
+                Whether or not the token list is already formatted with special tokens for the model.

         Returns:
-            A list of integers in the range [0, 1]: 1 for a special token, 0 for a sequence token.
+            :obj:`List[int]`: A list of integers in the range [0, 1]: 1 for a special token, 0 for a sequence token.
         """
         if already_has_special_tokens:
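For the `[CLS] X [SEP]` format documented above, the returned mask places a 1 at each special-token position and a 0 at each sequence-token position. A minimal sketch for the `already_has_special_tokens=False` case only (a hypothetical standalone function, not the real method, which also handles lists that already contain special tokens):

```python
# Sketch of the 0/1 special-tokens mask from the Returns section,
# for inputs that do NOT yet contain special tokens.

def get_special_tokens_mask(token_ids_0, token_ids_1=None, already_has_special_tokens=False):
    if already_has_special_tokens:
        # The real tokenizer method covers this case; the sketch does not.
        raise NotImplementedError("sketch handles only sequences without special tokens")
    if token_ids_1 is None:
        # [CLS] X [SEP]  ->  1, 0...0, 1
        return [1] + [0] * len(token_ids_0) + [1]
    # [CLS] A [SEP] B [SEP]  ->  1, 0...0, 1, 0...0, 1
    return [1] + [0] * len(token_ids_0) + [1] + [0] * len(token_ids_1) + [1]
```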