Commit 3323146e authored by Sylvain Gugger, committed by GitHub

Models doc (#7345)



* Clean up model documentation

* Formatting

* Preparation work

* Long lines

* Main work on rst files

* Cleanup all config files

* Syntax fix

* Clean all tokenizers

* Work on first models

* Models beginning

* FlauBERT

* All PyTorch models

* All models

* Long lines again

* Fixes

* More fixes

* Update docs/source/model_doc/bert.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update docs/source/model_doc/electra.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Last fixes
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
parent 58405a52
Benchmarks
=======================================================================================================================
Let's take a look at how 🤗 Transformer models can be benchmarked, best practices, and already available benchmarks.
A notebook explaining in more detail how to benchmark 🤗 Transformer models can be found `here <https://github.com/huggingface/transformers/blob/master/notebooks/05-benchmark.ipynb>`__.
How to benchmark 🤗 Transformer models
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The classes :class:`~transformers.PyTorchBenchmark` and :class:`~transformers.TensorFlowBenchmark` allow you to flexibly benchmark 🤗 Transformer models.
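Before going further, here is a minimal sketch of how the PyTorch benchmark class can be invoked; the model name, batch sizes and sequence lengths below are illustrative choices, not values prescribed by this page.

.. code-block:: python

    >>> from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments

    >>> # illustrative settings; any model identifier from the model hub works here
    >>> args = PyTorchBenchmarkArguments(models=["bert-base-uncased"], batch_sizes=[8], sequence_lengths=[32, 128])
    >>> benchmark = PyTorchBenchmark(args)
    >>> results = benchmark.run()  # reports speed and peak memory for each configuration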
The benchmark classes allow us to measure the `peak memory usage` and `required time` for both
...@@ -300,7 +300,7 @@ deciding for which configuration the model should be trained.
Benchmark best practices
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This section lists a couple of best practices one should be aware of when benchmarking a model.
...@@ -311,7 +311,7 @@ This section lists a couple of best practices one should be aware of when benchm
Sharing your benchmark
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Previously all available core models (10 at the time) were benchmarked for `inference time`, across many different settings: using PyTorch, with
and without TorchScript, using TensorFlow, with and without XLA. All of those tests were done across CPUs (except for
......
BERTology
-----------------------------------------------------------------------------------------------------------------------
There is a growing field of study concerned with investigating the inner workings of large-scale transformers like BERT (that some call "BERTology"). Some good examples of this field are:
......
Converting TensorFlow Checkpoints
=======================================================================================================================
A command-line interface is provided to convert original Bert/GPT/GPT-2/Transformer-XL/XLNet/XLM checkpoints to models that can be loaded using the ``from_pretrained`` methods of the library.
...@@ -10,7 +10,7 @@ A command-line interface is provided to convert original Bert/GPT/GPT-2/Transfor
The documentation below reflects the **transformers-cli convert** command format.
BERT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You can convert any TensorFlow checkpoint for BERT (in particular `the pre-trained models released by Google <https://github.com/google-research/bert#pre-trained-models>`_\ ) into a PyTorch save file by using the `convert_bert_original_tf_checkpoint_to_pytorch.py <https://github.com/huggingface/transformers/blob/master/src/transformers/convert_bert_original_tf_checkpoint_to_pytorch.py>`_ script.
...@@ -34,7 +34,7 @@ Here is an example of the conversion process for a pre-trained ``BERT-Base Uncas
You can download Google's pre-trained models for the conversion `here <https://github.com/google-research/bert#pre-trained-models>`__.
ALBERT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Convert TensorFlow model checkpoints of ALBERT to PyTorch using the `convert_albert_original_tf_checkpoint_to_pytorch.py <https://github.com/huggingface/transformers/blob/master/src/transformers/convert_albert_original_tf_checkpoint_to_pytorch.py>`_ script.
...@@ -54,7 +54,7 @@ Here is an example of the conversion process for the pre-trained ``ALBERT Base``
You can download Google's pre-trained models for the conversion `here <https://github.com/google-research/albert#pre-trained-models>`__.
OpenAI GPT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here is an example of the conversion process for a pre-trained OpenAI GPT model, assuming that your NumPy checkpoint is saved in the same format as the OpenAI pretrained model (see `here <https://github.com/openai/finetune-transformer-lm>`__\ )
...@@ -70,7 +70,7 @@ Here is an example of the conversion process for a pre-trained OpenAI GPT model,
OpenAI GPT-2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here is an example of the conversion process for a pre-trained OpenAI GPT-2 model (see `here <https://github.com/openai/gpt-2>`__\ )
...@@ -85,7 +85,7 @@ Here is an example of the conversion process for a pre-trained OpenAI GPT-2 mode
[--finetuning_task_name OPENAI_GPT2_FINETUNED_TASK]
Transformer-XL
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here is an example of the conversion process for a pre-trained Transformer-XL model (see `here <https://github.com/kimiyoung/transformer-xl/tree/master/tf#obtain-and-evaluate-pretrained-sota-models>`__\ )
...@@ -101,7 +101,7 @@ Here is an example of the conversion process for a pre-trained Transformer-XL mo
XLNet
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here is an example of the conversion process for a pre-trained XLNet model:
...@@ -118,7 +118,7 @@ Here is an example of the conversion process for a pre-trained XLNet model:
XLM
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here is an example of the conversion process for a pre-trained XLM model:
......
Fine-tuning with custom datasets
=======================================================================================================================
.. note::
...@@ -24,7 +24,7 @@ We include several examples, each of which demonstrates a different type of comm
.. _seq_imdb:
Sequence Classification with IMDb Reviews
-----------------------------------------------------------------------------------------------------------------------
.. note::
...@@ -139,7 +139,7 @@ Now that our datasets our ready, we can fine-tune a model either with the 🤗
.. _ft_trainer:
Fine-tuning with Trainer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The steps above prepared the datasets in the way the trainer expects. Now all we need to do is create a
model to fine-tune, define the :class:`~transformers.TrainingArguments`/:class:`~transformers.TFTrainingArguments`
...@@ -200,7 +200,7 @@ and instantiate a :class:`~transformers.Trainer`/:class:`~transformers.TFTrainer
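To make the flow concrete, here is a condensed sketch of that Trainer setup for the PyTorch case; the DistilBERT checkpoint and the ``train_dataset``/``val_dataset`` names are placeholders standing in for the datasets encoded earlier in this tutorial, and the argument values are only examples.

.. code-block:: python

    from transformers import DistilBertForSequenceClassification, Trainer, TrainingArguments

    model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased")
    training_args = TrainingArguments(
        output_dir="./results",              # where checkpoints and logs are written
        num_train_epochs=3,
        per_device_train_batch_size=16,
    )
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,         # placeholder: the encoded training split
        eval_dataset=val_dataset,            # placeholder: the encoded validation split
    )
    trainer.train()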
.. _ft_native:
Fine-tuning with native PyTorch/TensorFlow
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We can also train using native PyTorch or TensorFlow:
...@@ -244,7 +244,7 @@ We can also train use native PyTorch or TensorFlow:
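For reference, a bare-bones native PyTorch loop over the same placeholder dataset might look like the following sketch; the dataset name and the batch keys assume the data was encoded with a tokenizer as above, and are not part of the original page.

.. code-block:: python

    import torch
    from torch.utils.data import DataLoader
    from transformers import AdamW, DistilBertForSequenceClassification

    device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
    model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased").to(device)
    model.train()
    train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True)  # placeholder dataset
    optim = AdamW(model.parameters(), lr=5e-5)

    for batch in train_loader:
        optim.zero_grad()
        outputs = model(
            input_ids=batch["input_ids"].to(device),
            attention_mask=batch["attention_mask"].to(device),
            labels=batch["labels"].to(device),
        )
        loss = outputs[0]   # the loss is the first element of the returned tuple
        loss.backward()
        optim.step()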
.. _tok_ner:
Token Classification with W-NUT Emerging Entities
-----------------------------------------------------------------------------------------------------------------------
.. note::
...@@ -443,7 +443,7 @@ sequence classification example above.
.. _qa_squad:
Question Answering with SQuAD 2.0
-----------------------------------------------------------------------------------------------------------------------
.. note::
...@@ -655,7 +655,7 @@ multiple model outputs.
.. _resources:
Additional Resources
-----------------------------------------------------------------------------------------------------------------------
- `How to train a new language model from scratch using Transformers and Tokenizers
  <https://huggingface.co/blog/how-to-train>`_. Blog post showing the steps to load in Esperanto data and train a
...@@ -666,7 +666,7 @@ Additional Resources
.. _nlplib:
Using the 🤗 NLP Datasets & Metrics library
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This tutorial demonstrates how to read in datasets from various raw text formats and prepare them for training with
🤗 Transformers so that you can do the same thing with your own custom datasets. However, we recommend users use the
......
Glossary
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
General terms
-----------------------------------------------------------------------------------------------------------------------
- autoencoding models: see MLM
- autoregressive models: see CLM
...@@ -27,7 +27,7 @@ General terms
or a punctuation symbol.
Model inputs
-----------------------------------------------------------------------------------------------------------------------
Every model is different yet bears similarities with the others. Therefore most models use the same inputs, which are
detailed here alongside usage examples.
...@@ -35,7 +35,7 @@ detailed here alongside usage examples.
.. _input-ids:
Input IDs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The input ids are often the only required parameters to be passed to the model as input. *They are token indices,
numerical representations of tokens building the sequences that will be used as input by the model*.
...@@ -43,7 +43,7 @@ numerical representations of tokens building the sequences that will be used as
Each tokenizer works differently but the underlying mechanism remains the same. Here's an example using the BERT
tokenizer, which is a `WordPiece <https://arxiv.org/pdf/1609.08144.pdf>`__ tokenizer:
.. code-block::
>>> from transformers import BertTokenizer
>>> tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
...@@ -52,7 +52,7 @@ tokenizer, which is a `WordPiece <https://arxiv.org/pdf/1609.08144.pdf>`__ token
The tokenizer takes care of splitting the sequence into tokens available in the tokenizer vocabulary.
.. code-block::
>>> tokenized_sequence = tokenizer.tokenize(sequence)
...@@ -60,7 +60,7 @@ The tokens are either words or subwords. Here for instance, "VRAM" wasn't in the
in "V", "RA" and "M". To indicate those tokens are not separate words but parts of the same word, a double-hash prefix is
added for "RA" and "M":
.. code-block::
>>> print(tokenized_sequence)
['A', 'Titan', 'R', '##T', '##X', 'has', '24', '##GB', 'of', 'V', '##RA', '##M']
...@@ -69,14 +69,14 @@ These tokens can then be converted into IDs which are understandable by the mode
the sentence to the tokenizer, which leverages the Rust implementation of
`huggingface/tokenizers <https://github.com/huggingface/tokenizers>`__ for peak performance.
.. code-block::
>>> inputs = tokenizer(sequence)
The tokenizer returns a dictionary with all the arguments necessary for its corresponding model to work properly. The
token indices are under the key "input_ids":
.. code-block::
>>> encoded_sequence = inputs["input_ids"]
>>> print(encoded_sequence)
...@@ -87,13 +87,13 @@ IDs the model sometimes uses.
If we decode the previous sequence of ids,
.. code-block::
>>> decoded_sequence = tokenizer.decode(encoded_sequence)
we will see
.. code-block::
>>> print(decoded_sequence)
[CLS] A Titan RTX has 24GB of VRAM [SEP]
...@@ -103,14 +103,14 @@ because this is the way a :class:`~transformers.BertModel` is going to expect it
.. _attention-mask:
Attention mask
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The attention mask is an optional argument used when batching sequences together. This argument indicates to the
model which tokens should be attended to, and which should not.
For example, consider these two sequences:
.. code-block::
>>> from transformers import BertTokenizer
>>> tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
...@@ -123,7 +123,7 @@ For example, consider these two sequences:
The encoded versions have different lengths:
.. code-block::
>>> len(encoded_sequence_a), len(encoded_sequence_b)
(8, 19)
...@@ -134,13 +134,13 @@ of the second one, or the second one needs to be truncated down to the length of
In the first case, the list of IDs will be extended by the padding indices. We can pass a list to the tokenizer and ask
it to pad like this:
.. code-block::
>>> padded_sequences = tokenizer([sequence_a, sequence_b], padding=True)
We can see that 0s have been added on the right of the first sentence to make it the same length as the second one:
.. code-block::
>>> padded_sequences["input_ids"]
[[101, 1188, 1110, 170, 1603, 4954, 119, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [101, 1188, 1110, 170, 1897, 1263, 4954, 119, 1135, 1110, 1120, 1655, 2039, 1190, 1103, 4954, 138, 119, 102]]
...@@ -150,7 +150,7 @@ the position of the padded indices so that the model does not attend to them. Fo
:class:`~transformers.BertTokenizer`, :obj:`1` indicates a value that should be attended to, while :obj:`0` indicates
a padded value. This attention mask is in the dictionary returned by the tokenizer under the key "attention_mask":
.. code-block::
>>> padded_sequences["attention_mask"]
[[1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]
...@@ -158,20 +158,20 @@ a padded value. This attention mask is in the dictionary returned by the tokeniz
.. _token-type-ids:
Token Type IDs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Some models' purpose is to do sequence classification or question answering. These require two different sequences to
be joined in a single "input_ids" entry, which usually is performed with the help of special tokens, such as the classifier (``[CLS]``) and separator (``[SEP]``)
tokens. For example, the BERT model builds its two sequence input as such:
.. code-block::
>>> # [CLS] SEQUENCE_A [SEP] SEQUENCE_B [SEP]
We can use our tokenizer to automatically generate such a sentence by passing the two sequences to ``tokenizer`` as two arguments (and
not a list, like before) like this:
.. code-block::
>>> from transformers import BertTokenizer
>>> tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
...@@ -183,7 +183,7 @@ not a list, like before) like this:
which will return:
.. code-block::
>>> print(decoded)
[CLS] HuggingFace is based in NYC [SEP] Where is HuggingFace based? [SEP]
...@@ -194,7 +194,7 @@ mask identifying the two types of sequence in the model.
The tokenizer returns this mask as the "token_type_ids" entry:
.. code-block::
>>> encoded_dict['token_type_ids']
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1]
...@@ -207,7 +207,7 @@ Some models, like :class:`~transformers.XLNetModel` use an additional token repr
.. _position-ids:
Position IDs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Contrary to RNNs that have the position of each token embedded within them,
transformers are unaware of the position of each token. Therefore, the position IDs (``position_ids``) are used by the model to identify each token's position in the list of tokens.
...@@ -221,7 +221,7 @@ use other types of positional embeddings, such as sinusoidal position embeddings
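A hedged sketch of what passing explicit position IDs looks like (models build this absolute range themselves when the argument is omitted, so this is rarely needed in practice):

.. code-block:: python

    >>> import torch
    >>> from transformers import BertModel, BertTokenizer

    >>> tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
    >>> model = BertModel.from_pretrained("bert-base-cased")
    >>> inputs = tokenizer("A Titan RTX has 24GB of VRAM", return_tensors="pt")
    >>> # one position index per token, shaped (batch_size, sequence_length)
    >>> position_ids = torch.arange(inputs["input_ids"].shape[1]).unsqueeze(0)
    >>> outputs = model(**inputs, position_ids=position_ids)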
.. _feed-forward-chunking:
Feed Forward Chunking
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In each residual attention block in transformers the self-attention layer is usually followed by 2 feed forward layers.
The intermediate embedding size of the feed forward layers is often bigger than the hidden size of the model (e.g.,
......
Transformers
=======================================================================================================================
State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.
...@@ -11,7 +11,7 @@ TensorFlow 2.0 and PyTorch.
This is the documentation of our repository `transformers <https://github.com/huggingface/transformers>`_.
Features
-----------------------------------------------------------------------------------------------------------------------
- High performance on NLU and NLG tasks
- Low barrier to entry for educators and practitioners
...@@ -36,7 +36,7 @@ Choose the right framework for every part of a model's lifetime:
- Seamlessly pick the right framework for training, evaluation, production
Contents
-----------------------------------------------------------------------------------------------------------------------
The documentation is organized in five parts:
......
Custom Layers and Utilities
-----------------------------------------------------------------------------------------------------------------------
This page lists all the custom layers used by the library, as well as the utility functions it provides for modeling.
Most of those are only useful if you are studying the code of the models in the library.
Pytorch custom modules
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_utils.Conv1D
...@@ -29,8 +29,8 @@ Most of those are only useful if you are studying the code of the models in the
:members: forward
PyTorch Helper Functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autofunction:: transformers.apply_chunking_to_forward
...@@ -42,8 +42,8 @@ Most of those are only useful if you are studying the code of the models in the
.. autofunction:: transformers.modeling_utils.prune_linear_layer
TensorFlow custom layers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_utils.TFConv1D
...@@ -54,8 +54,8 @@ Most of those are only useful if you are studying the code of the models in the
:members: call
TensorFlow loss functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_utils.TFCausalLanguageModelingLoss
:members:
...@@ -76,8 +76,8 @@ Most of those are only useful if you are studying the code of the models in the
:members:
TensorFlow Helper Functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autofunction:: transformers.modeling_tf_utils.cast_bool_to_primitive
......
Utilities for pipelines
-----------------------------------------------------------------------------------------------------------------------
This page lists all the utility functions the library provides for pipelines.
Most of those are only useful if you are studying the code of the models in the library.
Argument handling
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.pipelines.ArgumentHandler
.. autoclass:: transformers.pipelines.ZeroShotClassificationArgumentHandler
.. autoclass:: transformers.pipelines.QuestionAnsweringArgumentHandler
Data format
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.pipelines.PipelineDataFormat
:members:
.. autoclass:: transformers.pipelines.CsvPipelineDataFormat
:members:
.. autoclass:: transformers.pipelines.JsonPipelineDataFormat
:members:
.. autoclass:: transformers.pipelines.PipedPipelineDataFormat
:members:
Utilities
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autofunction:: transformers.pipelines.get_framework
.. autoclass:: transformers.pipelines.PipelineException
Utilities for Tokenizers
-----------------------------------------------------------------------------------------------------------------------
This page lists all the utility functions used by the tokenizers, mainly the class
:class:`~transformers.tokenization_utils_base.PreTrainedTokenizerBase` that implements the common methods between
:class:`~transformers.PreTrainedTokenizer` and :class:`~transformers.PreTrainedTokenizerFast` and the mixin
:class:`~transformers.tokenization_utils_base.SpecialTokensMixin`.
Most of those are only useful if you are studying the code of the tokenizers in the library.
PreTrainedTokenizerBase
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.tokenization_utils_base.PreTrainedTokenizerBase
:special-members: __call__
:members:
SpecialTokensMixin
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.tokenization_utils_base.SpecialTokensMixin
:members:
Enums and namedtuples
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.tokenization_utils_base.ExplicitEnum
.. autoclass:: transformers.tokenization_utils_base.PaddingStrategy
.. autoclass:: transformers.tokenization_utils_base.TensorType
.. autoclass:: transformers.tokenization_utils_base.TruncationStrategy
.. autoclass:: transformers.tokenization_utils_base.CharSpan
.. autoclass:: transformers.tokenization_utils_base.TokenSpan
Configuration
-----------------------------------------------------------------------------------------------------------------------
The base class :class:`~transformers.PretrainedConfig` implements the common methods for loading/saving a configuration
either from a local file or directory, or from a pretrained model configuration provided by the library (downloaded
...@@ -7,7 +7,7 @@ from HuggingFace's AWS S3 repository).
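As a quick illustration of those common methods, here is a hedged sketch using a BERT configuration; the checkpoint name and local path are arbitrary examples.

.. code-block:: python

    >>> from transformers import BertConfig

    >>> config = BertConfig.from_pretrained("bert-base-uncased")  # downloaded or read from the local cache
    >>> config.hidden_size
    768
    >>> config.save_pretrained("./my-bert-config")                # writes config.json to a local directory
    >>> reloaded = BertConfig.from_pretrained("./my-bert-config")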
PretrainedConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.PretrainedConfig
:members:
Logging
-----------------------------------------------------------------------------------------------------------------------
🤗 Transformers has a centralized logging system, so that you can set up the verbosity of the library easily.
Currently the default verbosity of the library is ``WARNING``.
To change the level of verbosity, just use one of the direct setters. For instance, here is how to change the verbosity
to the INFO level.
.. code-block:: python
import transformers
transformers.logging.set_verbosity_info()
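The generic getter and setter can also be used with the level constants listed further down; a small sketch:

.. code-block:: python

    import transformers

    transformers.logging.set_verbosity(transformers.logging.ERROR)  # only report errors
    current_level = transformers.logging.get_verbosity()            # returns the current level as an int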
You can also use the environment variable ``TRANSFORMERS_VERBOSITY`` to override the default verbosity. You can set it
to one of the following: ``debug``, ``info``, ``warning``, ``error``, ``critical``. For example:
.. code-block:: bash
...@@ -32,7 +34,7 @@ verbose to the most verbose), those levels (with their corresponding int values
- :obj:`transformers.logging.DEBUG` (int value, 10): report all information.
Base setters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autofunction:: transformers.logging.set_verbosity_error
...@@ -43,7 +45,7 @@ Base setters
.. autofunction:: transformers.logging.set_verbosity_debug
Other functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autofunction:: transformers.logging.get_verbosity
......
Models
-----------------------------------------------------------------------------------------------------------------------
The base classes :class:`~transformers.PreTrainedModel` and :class:`~transformers.TFPreTrainedModel` implement the
common methods for loading/saving a model either from a local file or directory, or from a pretrained model
...@@ -17,36 +17,36 @@ for text generation, :class:`~transformers.generation_utils.GenerationMixin` (fo
:class:`~transformers.generation_tf_utils.TFGenerationMixin` (for the TensorFlow models)
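For orientation, a minimal sketch of the shared loading/saving methods, using BERT as an arbitrary example checkpoint and an arbitrary local path:

.. code-block:: python

    >>> from transformers import BertModel

    >>> model = BertModel.from_pretrained("bert-base-cased")   # download from the hub or load from cache
    >>> model.save_pretrained("./my-bert")                      # writes config and weights to a local directory
    >>> reloaded = BertModel.from_pretrained("./my-bert")       # the same method reloads from that directory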
PreTrainedModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.PreTrainedModel
:members:
ModuleUtilsMixin
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_utils.ModuleUtilsMixin
:members:
TFPreTrainedModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TFPreTrainedModel
:members:
TFModelUtilsMixin
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_utils.TFModelUtilsMixin
:members:
Generative models
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.generation_utils.GenerationMixin
:members:
......
Optimization
-----------------------------------------------------------------------------------------------------------------------
The ``.optimization`` module provides:
...@@ -7,29 +7,29 @@ The ``.optimization`` module provides:
- several schedules in the form of schedule objects that inherit from ``_LRSchedule``:
- a gradient accumulation class to accumulate the gradients of multiple batches
AdamW (PyTorch)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AdamW
:members:
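A hedged usage sketch combining ``AdamW`` with one of the schedules from this module; ``model``, ``train_dataloader`` and the step counts are placeholders rather than values taken from this page.

.. code-block:: python

    from transformers import AdamW, get_linear_schedule_with_warmup

    optimizer = AdamW(model.parameters(), lr=5e-5, weight_decay=0.01)       # `model` is any nn.Module
    scheduler = get_linear_schedule_with_warmup(optimizer, num_warmup_steps=100, num_training_steps=1000)

    for batch in train_dataloader:       # placeholder dataloader
        outputs = model(**batch)
        loss = outputs[0]                # assumes the model returns the loss first
        loss.backward()
        optimizer.step()
        scheduler.step()                 # update the learning rate after each optimizer step
        optimizer.zero_grad()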
AdaFactor (PyTorch)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.Adafactor
AdamWeightDecay (TensorFlow)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AdamWeightDecay
.. autofunction:: transformers.create_optimizer
Schedules
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Learning Rate Schedules (Pytorch)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. autofunction:: transformers.get_constant_schedule
...@@ -62,16 +62,16 @@ Learning Rate Schedules (Pytorch)
:target: /imgs/warmup_linear_schedule.png
:alt:
Warmup (TensorFlow)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. autoclass:: transformers.WarmUp
:members:
Gradient Strategies
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
GradientAccumulator (TensorFlow)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. autoclass:: transformers.GradientAccumulator
Model outputs
-----------------------------------------------------------------------------------------------------------------------
PyTorch models have outputs that are instances of subclasses of :class:`~transformers.file_utils.ModelOutput`. Those
are data structures containing all the information returned by the model, but that can also be used as tuples or
...@@ -44,98 +44,217 @@ values. Here for instance, it has two keys that are ``loss`` and ``logits``.
We document here the generic model outputs that are used by more than one model type. Specific output types are
documented on their corresponding model page.
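To make that behaviour concrete, here is a hedged sketch with a BERT sequence classification checkpoint (an arbitrary choice); the output can be addressed by attribute or indexed like a tuple.

.. code-block:: python

    >>> import torch
    >>> from transformers import BertForSequenceClassification, BertTokenizer

    >>> tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    >>> model = BertForSequenceClassification.from_pretrained("bert-base-uncased", return_dict=True)
    >>> inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
    >>> outputs = model(**inputs, labels=torch.tensor([1]))
    >>> loss, logits = outputs.loss, outputs.logits   # attribute access on the output dataclass
    >>> loss, logits = outputs[0], outputs[1]         # the same values, accessed as a tuple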
ModelOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.file_utils.ModelOutput
:members:
BaseModelOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_outputs.BaseModelOutput
:members:
BaseModelOutputWithPooling
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_outputs.BaseModelOutputWithPooling
:members:
BaseModelOutputWithPast
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_outputs.BaseModelOutputWithPast
:members:
Seq2SeqModelOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_outputs.Seq2SeqModelOutput
:members:
CausalLMOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_outputs.CausalLMOutput
:members:
CausalLMOutputWithPast
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_outputs.CausalLMOutputWithPast
:members:
MaskedLMOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_outputs.MaskedLMOutput
:members:
Seq2SeqLMOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_outputs.Seq2SeqLMOutput
:members:
NextSentencePredictorOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_outputs.NextSentencePredictorOutput
:members:
SequenceClassifierOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_outputs.SequenceClassifierOutput
:members:
Seq2SeqSequenceClassifierOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_outputs.Seq2SeqSequenceClassifierOutput
:members:
MultipleChoiceModelOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_outputs.MultipleChoiceModelOutput
:members:
TokenClassifierOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_outputs.TokenClassifierOutput
:members:
QuestionAnsweringModelOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_outputs.QuestionAnsweringModelOutput
:members:
Seq2SeqQuestionAnsweringModelOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_outputs.Seq2SeqQuestionAnsweringModelOutput
:members:
TFBaseModelOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_outputs.TFBaseModelOutput
:members:
TFBaseModelOutputWithPooling
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_outputs.TFBaseModelOutputWithPooling
:members:
TFBaseModelOutputWithPast
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_outputs.TFBaseModelOutputWithPast
:members:
TFSeq2SeqModelOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_outputs.TFSeq2SeqModelOutput
:members:
TFCausalLMOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_outputs.TFCausalLMOutput
:members:
TFCausalLMOutputWithPast
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_outputs.TFCausalLMOutputWithPast
:members:
TFMaskedLMOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_outputs.TFMaskedLMOutput
:members:
TFSeq2SeqLMOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_outputs.TFSeq2SeqLMOutput
:members:
TFNextSentencePredictorOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_outputs.TFNextSentencePredictorOutput
:members:
TFSequenceClassifierOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_outputs.TFSequenceClassifierOutput
:members:
TFSeq2SeqSequenceClassifierOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_outputs.TFSeq2SeqSequenceClassifierOutput
:members:
TFMultipleChoiceModelOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_outputs.TFMultipleChoiceModelOutput
:members:
TFTokenClassifierOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_outputs.TFTokenClassifierOutput
:members:
TFQuestionAnsweringModelOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_outputs.TFQuestionAnsweringModelOutput
:members:
TFSeq2SeqQuestionAnsweringModelOutput
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_tf_outputs.TFSeq2SeqQuestionAnsweringModelOutput
:members:
Pipelines Pipelines
---------------------------------------------------- -----------------------------------------------------------------------------------------------------------------------
The pipelines are a great and easy way to use models for inference. These pipelines are objects that abstract most The pipelines are a great and easy way to use models for inference. These pipelines are objects that abstract most
of the complex code from the library, offering a simple API dedicated to several tasks, including Named Entity of the complex code from the library, offering a simple API dedicated to several tasks, including Named Entity
...@@ -24,7 +24,7 @@ There are two categories of pipeline abstractions to be aware about: ...@@ -24,7 +24,7 @@ There are two categories of pipeline abstractions to be aware about:
- :class:`~transformers.Text2TextGenerationPipeline` - :class:`~transformers.Text2TextGenerationPipeline`
The pipeline abstraction The pipeline abstraction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The `pipeline` abstraction is a wrapper around all the other available pipelines. It is instantiated as any The `pipeline` abstraction is a wrapper around all the other available pipelines. It is instantiated as any
other pipeline but requires an additional argument which is the `task`. other pipeline but requires an additional argument which is the `task`.
...@@ -33,10 +33,10 @@ other pipeline but requires an additional argument which is the `task`. ...@@ -33,10 +33,10 @@ other pipeline but requires an additional argument which is the `task`.
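For illustration, here is a minimal sketch of the `pipeline` abstraction applied to sentiment analysis (the default model for the task is downloaded on first use):

.. code-block:: python

    from transformers import pipeline

    # Instantiating the wrapper only requires the task name; a default model is selected for it.
    classifier = pipeline("sentiment-analysis")

    # The call returns a list of dictionaries, each with a label and a score.
    print(classifier("We are very happy to show you the 🤗 Transformers library."))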
The task specific pipelines The task specific pipelines
~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ConversationalPipeline ConversationalPipeline
========================================== =======================================================================================================================
.. autoclass:: transformers.Conversation .. autoclass:: transformers.Conversation
...@@ -45,76 +45,76 @@ ConversationalPipeline ...@@ -45,76 +45,76 @@ ConversationalPipeline
:members: :members:
FeatureExtractionPipeline FeatureExtractionPipeline
========================================== =======================================================================================================================
.. autoclass:: transformers.FeatureExtractionPipeline .. autoclass:: transformers.FeatureExtractionPipeline
:special-members: __call__ :special-members: __call__
:members: :members:
FillMaskPipeline FillMaskPipeline
========================================== =======================================================================================================================
.. autoclass:: transformers.FillMaskPipeline .. autoclass:: transformers.FillMaskPipeline
:special-members: __call__ :special-members: __call__
:members: :members:
NerPipeline NerPipeline
========================================== =======================================================================================================================
This class is an alias of the :class:`~transformers.TokenClassificationPipeline` defined below. Please refer to that This class is an alias of the :class:`~transformers.TokenClassificationPipeline` defined below. Please refer to that
pipeline for documentation and usage examples. pipeline for documentation and usage examples.
QuestionAnsweringPipeline QuestionAnsweringPipeline
========================================== =======================================================================================================================
.. autoclass:: transformers.QuestionAnsweringPipeline .. autoclass:: transformers.QuestionAnsweringPipeline
:special-members: __call__ :special-members: __call__
:members: :members:
SummarizationPipeline SummarizationPipeline
========================================== =======================================================================================================================
.. autoclass:: transformers.SummarizationPipeline .. autoclass:: transformers.SummarizationPipeline
:special-members: __call__ :special-members: __call__
:members: :members:
TextClassificationPipeline TextClassificationPipeline
========================================== =======================================================================================================================
.. autoclass:: transformers.TextClassificationPipeline .. autoclass:: transformers.TextClassificationPipeline
:special-members: __call__ :special-members: __call__
:members: :members:
TextGenerationPipeline TextGenerationPipeline
========================================== =======================================================================================================================
.. autoclass:: transformers.TextGenerationPipeline .. autoclass:: transformers.TextGenerationPipeline
:special-members: __call__ :special-members: __call__
:members: :members:
Text2TextGenerationPipeline Text2TextGenerationPipeline
========================================== =======================================================================================================================
.. autoclass:: transformers.Text2TextGenerationPipeline .. autoclass:: transformers.Text2TextGenerationPipeline
:special-members: __call__ :special-members: __call__
:members: :members:
TokenClassificationPipeline TokenClassificationPipeline
========================================== =======================================================================================================================
.. autoclass:: transformers.TokenClassificationPipeline .. autoclass:: transformers.TokenClassificationPipeline
:special-members: __call__ :special-members: __call__
:members: :members:
ZeroShotClassificationPipeline ZeroShotClassificationPipeline
========================================== =======================================================================================================================
.. autoclass:: transformers.ZeroShotClassificationPipeline .. autoclass:: transformers.ZeroShotClassificationPipeline
:special-members: __call__ :special-members: __call__
:members: :members:
Parent class: :obj:`Pipeline` Parent class: :obj:`Pipeline`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.Pipeline .. autoclass:: transformers.Pipeline
:members: :members:
Processors Processors
---------------------------------------------------- -----------------------------------------------------------------------------------------------------------------------
This library includes processors for several traditional tasks. These processors can be used to process a dataset into This library includes processors for several traditional tasks. These processors can be used to process a dataset into
examples that can be fed to a model. examples that can be fed to a model.
Processors Processors
~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All processors follow the same architecture which is that of the All processors follow the same architecture which is that of the
:class:`~transformers.data.processors.utils.DataProcessor`. The processor returns a list :class:`~transformers.data.processors.utils.DataProcessor`. The processor returns a list
...@@ -26,7 +26,7 @@ of :class:`~transformers.data.processors.utils.InputExample`. These ...@@ -26,7 +26,7 @@ of :class:`~transformers.data.processors.utils.InputExample`. These
GLUE GLUE
~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
`General Language Understanding Evaluation (GLUE) <https://gluebenchmark.com/>`__ is a benchmark that evaluates `General Language Understanding Evaluation (GLUE) <https://gluebenchmark.com/>`__ is a benchmark that evaluates
the performance of models across a diverse set of existing NLU tasks. It was released together with the paper the performance of models across a diverse set of existing NLU tasks. It was released together with the paper
...@@ -52,13 +52,13 @@ Additionally, the following method can be used to load values from a data file ...@@ -52,13 +52,13 @@ Additionally, the following method can be used to load values from a data file
.. automethod:: transformers.data.processors.glue.glue_convert_examples_to_features .. automethod:: transformers.data.processors.glue.glue_convert_examples_to_features
Example usage Example usage
^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
An example using these processors is given in the `run_glue.py <https://github.com/huggingface/pytorch-transformers/blob/master/examples/text-classification/run_glue.py>`__ script. An example using these processors is given in the `run_glue.py <https://github.com/huggingface/pytorch-transformers/blob/master/examples/text-classification/run_glue.py>`__ script.
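To give a rough idea of that workflow, here is a minimal sketch combining a GLUE processor with the conversion method above; ``path/to/MRPC`` is a placeholder for a local copy of the MRPC data:

.. code-block:: python

    from transformers import AutoTokenizer, glue_convert_examples_to_features
    from transformers.data.processors.glue import MrpcProcessor

    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

    # "path/to/MRPC" is a placeholder: point it at a local download of the MRPC data.
    processor = MrpcProcessor()
    examples = processor.get_train_examples("path/to/MRPC")

    # Turn the InputExample objects into features a model can consume.
    features = glue_convert_examples_to_features(examples, tokenizer, max_length=128, task="mrpc")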
XNLI XNLI
~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
`The Cross-Lingual NLI Corpus (XNLI) <https://www.nyu.edu/projects/bowman/xnli/>`__ is a benchmark that evaluates `The Cross-Lingual NLI Corpus (XNLI) <https://www.nyu.edu/projects/bowman/xnli/>`__ is a benchmark that evaluates
the quality of cross-lingual text representations. the quality of cross-lingual text representations.
...@@ -78,7 +78,7 @@ An example using these processors is given in the ...@@ -78,7 +78,7 @@ An example using these processors is given in the
SQuAD SQuAD
~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
`The Stanford Question Answering Dataset (SQuAD) <https://rajpurkar.github.io/SQuAD-explorer//>`__ is a benchmark that evaluates `The Stanford Question Answering Dataset (SQuAD) <https://rajpurkar.github.io/SQuAD-explorer//>`__ is a benchmark that evaluates
the performance of models on question answering. Two versions are available, v1.1 and v2.0. The first version (v1.1) was released together with the paper the performance of models on question answering. Two versions are available, v1.1 and v2.0. The first version (v1.1) was released together with the paper
...@@ -88,7 +88,7 @@ the paper `Know What You Don't Know: Unanswerable Questions for SQuAD <https://a ...@@ -88,7 +88,7 @@ the paper `Know What You Don't Know: Unanswerable Questions for SQuAD <https://a
This library hosts a processor for each of the two versions: This library hosts a processor for each of the two versions:
Processors Processors
^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Those processors are: Those processors are:
- :class:`~transformers.data.processors.utils.SquadV1Processor` - :class:`~transformers.data.processors.utils.SquadV1Processor`
...@@ -109,7 +109,7 @@ Examples are given below. ...@@ -109,7 +109,7 @@ Examples are given below.
Example usage Example usage
^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here is an example using the processors as well as the conversion method using data files: Here is an example using the processors as well as the conversion method using data files:
Example:: Example::
......
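As a minimal sketch of that workflow (assuming the SQuAD v2.0 JSON files live in a local ``data_dir``):

.. code-block:: python

    from transformers import AutoTokenizer, SquadV2Processor, squad_convert_examples_to_features

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    # "data_dir" is a placeholder for a directory containing the SQuAD v2.0 JSON files.
    processor = SquadV2Processor()
    examples = processor.get_train_examples("data_dir")

    # Convert the examples into model-ready features (the values below are common defaults).
    features = squad_convert_examples_to_features(
        examples=examples,
        tokenizer=tokenizer,
        max_seq_length=384,
        doc_stride=128,
        max_query_length=64,
        is_training=True,
    )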
Tokenizer Tokenizer
---------------------------------------------------- -----------------------------------------------------------------------------------------------------------------------
A tokenizer is in charge of preparing the inputs for a model. The library contains tokenizers for all the models. Most A tokenizer is in charge of preparing the inputs for a model. The library contains tokenizers for all the models. Most
of the tokenizers are available in two flavors: a full python implementation and a "Fast" implementation based on the of the tokenizers are available in two flavors: a full python implementation and a "Fast" implementation based on the
...@@ -36,24 +36,24 @@ alignment methods which can be used to map between the original string (characte ...@@ -36,24 +36,24 @@ alignment methods which can be used to map between the original string (characte
getting the index of the token comprising a given character or the span of characters corresponding to a given token). getting the index of the token comprising a given character or the span of characters corresponding to a given token).
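For instance, a minimal sketch of those alignment methods with a fast tokenizer (``bert-base-uncased`` is used purely for illustration):

.. code-block:: python

    from transformers import AutoTokenizer

    # Fast tokenizers return a BatchEncoding exposing the alignment helpers mentioned above.
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", use_fast=True)
    encoding = tokenizer("Hello, how are you?")

    print(encoding.tokens())           # the subword tokens
    print(encoding.char_to_token(7))   # index of the token covering character 7
    print(encoding.token_to_chars(1))  # span of characters covered by token 1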
``PreTrainedTokenizer`` PreTrainedTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.PreTrainedTokenizer .. autoclass:: transformers.PreTrainedTokenizer
:special-members: __call__ :special-members: __call__
:members: :members:
``PreTrainedTokenizerFast`` PreTrainedTokenizerFast
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.PreTrainedTokenizerFast .. autoclass:: transformers.PreTrainedTokenizerFast
:special-members: __call__ :special-members: __call__
:members: :members:
``BatchEncoding`` BatchEncoding
~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.BatchEncoding .. autoclass:: transformers.BatchEncoding
:members: :members:
Trainer Trainer
---------- -----------------------------------------------------------------------------------------------------------------------
The :class:`~transformers.Trainer` and :class:`~transformers.TFTrainer` classes provide an API for feature-complete The :class:`~transformers.Trainer` and :class:`~transformers.TFTrainer` classes provide an API for feature-complete
training in most standard use cases. It's used in most of the :doc:`example scripts <../examples>`. training in most standard use cases. It's used in most of the :doc:`example scripts <../examples>`.
Before instantiating your :class:`~transformers.Trainer`/:class:`~transformers.TFTrainer`, create a Before instantiating your :class:`~transformers.Trainer`/:class:`~transformers.TFTrainer`, create a
:class:`~transformers.TrainingArguments`/:class:`~transformers.TFTrainingArguments` to access all the points of :class:`~transformers.TrainingArguments`/:class:`~transformers.TFTrainingArguments` to access all the points of
customization during training. customization during training.
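As a rough sketch of that setup (``model``, ``train_dataset`` and ``eval_dataset`` are assumed to be defined elsewhere):

.. code-block:: python

    from transformers import Trainer, TrainingArguments

    training_args = TrainingArguments(
        output_dir="./results",              # where checkpoints and logs are written
        num_train_epochs=3,
        per_device_train_batch_size=8,
    )

    # model, train_dataset and eval_dataset are assumed to be defined elsewhere.
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
    )
    trainer.train()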
The API supports distributed training on multiple GPUs/TPUs, mixed precision through `NVIDIA Apex The API supports distributed training on multiple GPUs/TPUs, mixed precision through `NVIDIA Apex
<https://github.com/NVIDIA/apex>`__ for PyTorch and :obj:`tf.keras.mixed_precision` for TensorFlow. <https://github.com/NVIDIA/apex>`__ for PyTorch and :obj:`tf.keras.mixed_precision` for TensorFlow.
Both :class:`~transformers.Trainer` and :class:`~transformers.TFTrainer` contain the basic training loop supporting the Both :class:`~transformers.Trainer` and :class:`~transformers.TFTrainer` contain the basic training loop supporting the
previous features. To inject custom behavior you can subclass them and override the following methods: previous features. To inject custom behavior you can subclass them and override the following methods:
- **get_train_dataloader**/**get_train_tfdataset** -- Creates the training DataLoader (PyTorch) or TF Dataset. - **get_train_dataloader**/**get_train_tfdataset** -- Creates the training DataLoader (PyTorch) or TF Dataset.
- **get_eval_dataloader**/**get_eval_tfdataset** -- Creates the evaluation DataLoader (PyTorch) or TF Dataset. - **get_eval_dataloader**/**get_eval_tfdataset** -- Creates the evaluation DataLoader (PyTorch) or TF Dataset.
- **get_test_dataloader**/**get_test_tfdataset** -- Creates the test DataLoader (PyTorch) or TF Dataset. - **get_test_dataloader**/**get_test_tfdataset** -- Creates the test DataLoader (PyTorch) or TF Dataset.
- **log** -- Logs information on the various objects watching training. - **log** -- Logs information on the various objects watching training.
- **setup_wandb** -- Sets up wandb (see `here <https://docs.wandb.com/huggingface>`__ for more information). - **setup_wandb** -- Sets up wandb (see `here <https://docs.wandb.com/huggingface>`__ for more information).
- **create_optimizer_and_scheduler** -- Sets up the optimizer and learning rate scheduler if they were not passed at - **create_optimizer_and_scheduler** -- Sets up the optimizer and learning rate scheduler if they were not passed at
init. init.
- **compute_loss** -- Computes the loss on a batch of training inputs. - **compute_loss** -- Computes the loss on a batch of training inputs.
- **training_step** -- Performs a training step. - **training_step** -- Performs a training step.
- **prediction_step** -- Performs an evaluation/test step. - **prediction_step** -- Performs an evaluation/test step.
- **run_model** (TensorFlow only) -- Basic pass through the model. - **run_model** (TensorFlow only) -- Basic pass through the model.
- **evaluate** -- Runs an evaluation loop and returns metrics. - **evaluate** -- Runs an evaluation loop and returns metrics.
- **predict** -- Returns predictions (with metrics if labels are available) on a test set. - **predict** -- Returns predictions (with metrics if labels are available) on a test set.
Here is an example of how to customize :class:`~transformers.Trainer` using a custom loss function: Here is an example of how to customize :class:`~transformers.Trainer` using a custom loss function:
.. code-block:: python .. code-block:: python
from transformers import Trainer from transformers import Trainer
class MyTrainer(Trainer): class MyTrainer(Trainer):
def compute_loss(self, model, inputs): def compute_loss(self, model, inputs):
labels = inputs.pop("labels") labels = inputs.pop("labels")
outputs = model(**inputs) outputs = model(**inputs)
logits = outputs[0] logits = outputs[0]
return my_custom_loss(logits, labels) return my_custom_loss(logits, labels)
``Trainer`` Trainer
~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.Trainer .. autoclass:: transformers.Trainer
:members: :members:
``TFTrainer`` TFTrainer
~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TFTrainer .. autoclass:: transformers.TFTrainer
:members: :members:
``TrainingArguments`` TrainingArguments
~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TrainingArguments .. autoclass:: transformers.TrainingArguments
:members: :members:
``TFTrainingArguments`` TFTrainingArguments
~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TFTrainingArguments .. autoclass:: transformers.TFTrainingArguments
:members: :members:
Utilities Utilities
~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.EvalPrediction .. autoclass:: transformers.EvalPrediction
.. autofunction:: transformers.set_seed .. autofunction:: transformers.set_seed
.. autofunction:: transformers.torch_distributed_zero_first .. autofunction:: transformers.torch_distributed_zero_first
ALBERT ALBERT
---------------------------------------------------- -----------------------------------------------------------------------------------------------------------------------
Overview Overview
~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ALBERT model was proposed in `ALBERT: A Lite BERT for Self-supervised Learning of Language Representations <https://arxiv.org/abs/1909.11942>`_ The ALBERT model was proposed in `ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut. It presents <https://arxiv.org/abs/1909.11942>`__ by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma,
two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT: Radu Soricut. It presents two parameter-reduction techniques to lower memory consumption and increase the training
speed of BERT:
- Splitting the embedding matrix into two smaller matrices - Splitting the embedding matrix into two smaller matrices.
- Using repeating layers split among groups - Using repeating layers split among groups.
The abstract from the paper is the following: The abstract from the paper is the following:
...@@ -30,17 +31,17 @@ Tips: ...@@ -30,17 +31,17 @@ Tips:
similar to a BERT-like architecture with the same number of hidden layers as it has to iterate through the same similar to a BERT-like architecture with the same number of hidden layers as it has to iterate through the same
number of (repeating) layers. number of (repeating) layers.
The original code can be found `here <https://github.com/google-research/ALBERT>`_. The original code can be found `here <https://github.com/google-research/ALBERT>`__.
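As a quick usage sketch (the ``albert-base-v2`` checkpoint is used purely for illustration):

.. code-block:: python

    import torch
    from transformers import AlbertTokenizer, AlbertModel

    tokenizer = AlbertTokenizer.from_pretrained("albert-base-v2")
    model = AlbertModel.from_pretrained("albert-base-v2")

    inputs = tokenizer("ALBERT shares parameters across its layers.", return_tensors="pt")
    with torch.no_grad():
        last_hidden_state = model(**inputs)[0]

    print(last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)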
AlbertConfig AlbertConfig
~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AlbertConfig .. autoclass:: transformers.AlbertConfig
:members: :members:
AlbertTokenizer AlbertTokenizer
~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AlbertTokenizer .. autoclass:: transformers.AlbertTokenizer
:members: build_inputs_with_special_tokens, get_special_tokens_mask, :members: build_inputs_with_special_tokens, get_special_tokens_mask,
...@@ -48,7 +49,7 @@ AlbertTokenizer ...@@ -48,7 +49,7 @@ AlbertTokenizer
Albert specific outputs Albert specific outputs
~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.modeling_albert.AlbertForPreTrainingOutput .. autoclass:: transformers.modeling_albert.AlbertForPreTrainingOutput
:members: :members:
...@@ -58,98 +59,98 @@ Albert specific outputs ...@@ -58,98 +59,98 @@ Albert specific outputs
AlbertModel AlbertModel
~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AlbertModel .. autoclass:: transformers.AlbertModel
:members: :members: forward
AlbertForPreTraining AlbertForPreTraining
~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AlbertForPreTraining .. autoclass:: transformers.AlbertForPreTraining
:members: :members: forward
AlbertForMaskedLM AlbertForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AlbertForMaskedLM .. autoclass:: transformers.AlbertForMaskedLM
:members: :members: forward
AlbertForSequenceClassification AlbertForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AlbertForSequenceClassification .. autoclass:: transformers.AlbertForSequenceClassification
:members: :members: forward
AlbertForMultipleChoice AlbertForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AlbertForMultipleChoice .. autoclass:: transformers.AlbertForMultipleChoice
:members: :members:
AlbertForTokenClassification AlbertForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AlbertForTokenClassification .. autoclass:: transformers.AlbertForTokenClassification
:members: :members: forward
AlbertForQuestionAnswering AlbertForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AlbertForQuestionAnswering .. autoclass:: transformers.AlbertForQuestionAnswering
:members: :members: forward
TFAlbertModel TFAlbertModel
~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TFAlbertModel .. autoclass:: transformers.TFAlbertModel
:members: :members: call
TFAlbertForPreTraining TFAlbertForPreTraining
~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TFAlbertForPreTraining .. autoclass:: transformers.TFAlbertForPreTraining
:members: :members: call
TFAlbertForMaskedLM TFAlbertForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TFAlbertForMaskedLM .. autoclass:: transformers.TFAlbertForMaskedLM
:members: :members: call
TFAlbertForSequenceClassification TFAlbertForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TFAlbertForSequenceClassification .. autoclass:: transformers.TFAlbertForSequenceClassification
:members: :members: call
TFAlbertForMultipleChoice TFAlbertForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TFAlbertForMultipleChoice .. autoclass:: transformers.TFAlbertForMultipleChoice
:members: :members: call
TFAlbertForTokenClassification TFAlbertForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TFAlbertForTokenClassification .. autoclass:: transformers.TFAlbertForTokenClassification
:members: :members: call
TFAlbertForQuestionAnswering TFAlbertForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TFAlbertForQuestionAnswering .. autoclass:: transformers.TFAlbertForQuestionAnswering
:members: :members: call
AutoClasses AutoClasses
----------- -----------------------------------------------------------------------------------------------------------------------
In many cases, the architecture you want to use can be guessed from the name or the path of the pretrained model you In many cases, the architecture you want to use can be guessed from the name or the path of the pretrained model you
are supplying to the :obj:`from_pretrained()` method. are supplying to the :obj:`from_pretrained()` method.
...@@ -20,112 +20,112 @@ There is one class of :obj:`AutoModel` for each task, and for each backend (PyTo ...@@ -20,112 +20,112 @@ There is one class of :obj:`AutoModel` for each task, and for each backend (PyTo
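For instance, a minimal sketch of letting the auto classes infer the architecture from a checkpoint name (``bert-base-cased`` is used purely for illustration):

.. code-block:: python

    from transformers import AutoConfig, AutoModel, AutoTokenizer

    # The architecture (here BERT) is inferred from the checkpoint name.
    config = AutoConfig.from_pretrained("bert-base-cased")
    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    model = AutoModel.from_pretrained("bert-base-cased")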
AutoConfig AutoConfig
~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AutoConfig .. autoclass:: transformers.AutoConfig
:members: :members:
AutoTokenizer AutoTokenizer
~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AutoTokenizer .. autoclass:: transformers.AutoTokenizer
:members: :members:
AutoModel AutoModel
~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AutoModel .. autoclass:: transformers.AutoModel
:members: :members:
AutoModelForPreTraining AutoModelForPreTraining
~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AutoModelForPreTraining .. autoclass:: transformers.AutoModelForPreTraining
:members: :members:
AutoModelWithLMHead AutoModelWithLMHead
~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AutoModelWithLMHead .. autoclass:: transformers.AutoModelWithLMHead
:members: :members:
AutoModelForSequenceClassification AutoModelForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AutoModelForSequenceClassification .. autoclass:: transformers.AutoModelForSequenceClassification
:members: :members:
AutoModelForMultipleChoice AutoModelForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AutoModelForMultipleChoice .. autoclass:: transformers.AutoModelForMultipleChoice
:members: :members:
AutoModelForTokenClassification AutoModelForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AutoModelForTokenClassification .. autoclass:: transformers.AutoModelForTokenClassification
:members: :members:
AutoModelForQuestionAnswering AutoModelForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AutoModelForQuestionAnswering .. autoclass:: transformers.AutoModelForQuestionAnswering
:members: :members:
TFAutoModel TFAutoModel
~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TFAutoModel .. autoclass:: transformers.TFAutoModel
:members: :members:
TFAutoModelForPreTraining TFAutoModelForPreTraining
~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TFAutoModelForPreTraining .. autoclass:: transformers.TFAutoModelForPreTraining
:members: :members:
TFAutoModelWithLMHead TFAutoModelWithLMHead
~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TFAutoModelWithLMHead .. autoclass:: transformers.TFAutoModelWithLMHead
:members: :members:
TFAutoModelForSequenceClassification TFAutoModelForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TFAutoModelForSequenceClassification .. autoclass:: transformers.TFAutoModelForSequenceClassification
:members: :members:
TFAutoModelForMultipleChoice TFAutoModelForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TFAutoModelForMultipleChoice .. autoclass:: transformers.TFAutoModelForMultipleChoice
:members: :members:
TFAutoModelForTokenClassification TFAutoModelForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TFAutoModelForTokenClassification .. autoclass:: transformers.TFAutoModelForTokenClassification
:members: :members:
TFAutoModelForQuestionAnswering TFAutoModelForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.TFAutoModelForQuestionAnswering .. autoclass:: transformers.TFAutoModelForQuestionAnswering
:members: :members: