Commit 1dc43e56 authored by LysandreJik

Documentation additions

parent 912a377e
@@ -48,3 +48,4 @@ The library currently contains PyTorch implementations, pre-trained model weight
model_doc/xlm
model_doc/xlnet
model_doc/roberta
model_doc/distilbert
DistilBERT
--------------------------------------------------

``DistilBertConfig``
~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertConfig
    :members:


``DistilBertTokenizer``
~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertTokenizer
    :members:


``DistilBertModel``
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertModel
    :members:


``DistilBertForMaskedLM``
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertForMaskedLM
    :members:


``DistilBertForSequenceClassification``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertForSequenceClassification
    :members:


``DistilBertForQuestionAnswering``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertForQuestionAnswering
    :members:
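
As a quick illustration of the classes documented above, the following is a minimal usage sketch. It assumes a ``pytorch_transformers`` installation that already includes DistilBERT; the example sentence and variable names are purely illustrative.

.. code-block:: python

    import torch
    from pytorch_transformers import DistilBertModel, DistilBertTokenizer

    # Load the tokenizer and the pre-trained model by shortcut name.
    tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
    model = DistilBertModel.from_pretrained('distilbert-base-uncased')
    model.eval()

    # DistilBERT expects sequences that start with [CLS] and end with [SEP].
    tokens = [tokenizer.cls_token] + tokenizer.tokenize("Hello, my dog is cute") + [tokenizer.sep_token]
    input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])  # shape (1, sequence_length)

    with torch.no_grad():
        last_hidden_state = model(input_ids)[0]  # shape (1, sequence_length, 768)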
@@ -111,5 +111,13 @@ Here is the full list of the currently provided pretrained models together with
| | | | ``roberta-large`` fine-tuned on `MNLI <http://www.nyu.edu/projects/bowman/multinli/>`__. |
| | | (see `details <https://github.com/pytorch/fairseq/tree/master/examples/roberta>`__) |
+-------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
| DistilBERT | ``distilbert-base-uncased`` | | 6-layer, 768-hidden, 12-heads, 66M parameters |
| | | | The DistilBERT model distilled from the BERT model `bert-base-uncased` checkpoint |
| | | (see `details <https://medium.com/@victorsanh/8cf3380435b5>`__) |
| +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
| | ``distilbert-base-uncased-distilled-squad`` | | 6-layer, 768-hidden, 12-heads, 66M parameters |
| | | | The DistilBERT model distilled from the BERT model `bert-base-uncased` checkpoint, with an additional linear layer. |
| | | (see `details <https://medium.com/@victorsanh/8cf3380435b5>`__) |
+-------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
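
The shortcut names above can be passed directly to the ``from_pretrained`` methods. Below is a hedged sketch of loading the SQuAD-distilled checkpoint for extractive question answering; the question/context strings and variable names are illustrative, and the span decoding is deliberately simplistic.

.. code-block:: python

    import torch
    from pytorch_transformers import DistilBertForQuestionAnswering, DistilBertTokenizer

    tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
    model = DistilBertForQuestionAnswering.from_pretrained('distilbert-base-uncased-distilled-squad')
    model.eval()

    question = "Who wrote the book?"
    context = "The book was written by Jane."
    tokens = ([tokenizer.cls_token] + tokenizer.tokenize(question) + [tokenizer.sep_token]
              + tokenizer.tokenize(context) + [tokenizer.sep_token])
    input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])

    with torch.no_grad():
        start_logits, end_logits = model(input_ids)[:2]

    # Pick the most likely start/end positions and read off the answer wordpieces.
    start = start_logits.argmax(dim=1).item()
    end = end_logits.argmax(dim=1).item()
    answer_tokens = tokens[start:end + 1]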
.. <https://huggingface.co/pytorch-transformers/examples.html>`__
\ No newline at end of file
@@ -433,7 +433,7 @@ DISTILBERT_START_DOCSTRING = r"""
Here are the differences between the interface of Bert and DistilBert:
- DistilBert doesn't have `token_type_ids`; you don't need to indicate which token belongs to which segment. Just separate your segments with the separation token `tokenizer.sep_token` (or `[SEP]`)
- DistilBert doesn't have options to select the input positions (`position_ids` input). This could be added if necessary though; just let us know if you need this option.
For more information on DistilBERT, please refer to our
@@ -450,9 +450,9 @@ DISTILBERT_START_DOCSTRING = r"""
DISTILBERT_INPUTS_DOCSTRING = r"""
Inputs:
**input_ids** ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
Indices of input sequence tokens in the vocabulary.
The input sequences should start with `[CLS]` and end with `[SEP]` tokens.
For now, ONLY BertTokenizer(`bert-base-uncased`) is supported and you should use this tokenizer when using DistilBERT.
**attention_mask**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
...
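
To make the inputs description concrete, here is a sketch of building ``input_ids`` and ``attention_mask`` for a padded batch. It assumes the DistilBERT tokenizer and model documented above; the sentences and variable names are illustrative.

.. code-block:: python

    import torch
    from pytorch_transformers import DistilBertModel, DistilBertTokenizer

    tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
    model = DistilBertModel.from_pretrained('distilbert-base-uncased')
    model.eval()

    sentences = ["Hello, my dog is cute", "A somewhat longer second sentence that needs no padding"]
    batch = [[tokenizer.cls_token] + tokenizer.tokenize(s) + [tokenizer.sep_token] for s in sentences]
    max_len = max(len(t) for t in batch)

    # Pad the shorter sequence with the tokenizer's pad token.
    pad_id = tokenizer.convert_tokens_to_ids([tokenizer.pad_token])[0]
    input_ids = torch.tensor(
        [tokenizer.convert_tokens_to_ids(t) + [pad_id] * (max_len - len(t)) for t in batch])
    # attention_mask is 1 on real tokens and 0 on padding positions.
    attention_mask = torch.tensor([[1] * len(t) + [0] * (max_len - len(t)) for t in batch])

    with torch.no_grad():
        last_hidden_state = model(input_ids, attention_mask=attention_mask)[0]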