.. 
    Copyright 2020 The HuggingFace Team. All rights reserved.

    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
    the License. You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
    an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
    specific language governing permissions and limitations under the License.

RoBERTa
-----------------------------------------------------------------------------------------------------------------------

Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The RoBERTa model was proposed in `RoBERTa: A Robustly Optimized BERT Pretraining Approach
<https://arxiv.org/abs/1907.11692>`_ by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer
Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. It is based on Google's BERT model released in 2018.

It builds on BERT and modifies key hyperparameters, removing the next-sentence pretraining objective and training with
much larger mini-batches and learning rates.

The abstract from the paper is the following:

*Language model pretraining has led to significant performance gains but careful comparison between different
approaches is challenging. Training is computationally expensive, often done on private datasets of different sizes,
and, as we will show, hyperparameter choices have significant impact on the final results. We present a replication
study of BERT pretraining (Devlin et al., 2019) that carefully measures the impact of many key hyperparameters and
training data size. We find that BERT was significantly undertrained, and can match or exceed the performance of every
model published after it. Our best model achieves state-of-the-art results on GLUE, RACE and SQuAD. These results
highlight the importance of previously overlooked design choices, and raise questions about the source of recently
reported improvements. We release our models and code.*

Tips:

- This implementation is the same as :class:`~transformers.BertModel` with a tiny embeddings tweak as well as a setup
  for RoBERTa pretrained models.
- RoBERTa has the same architecture as BERT, but uses a byte-level BPE as a tokenizer (same as GPT-2) and uses a
  different pretraining scheme.
- RoBERTa doesn't have :obj:`token_type_ids`, so you don't need to indicate which token belongs to which segment.
  Just separate your segments with the separation token :obj:`tokenizer.sep_token` (or :obj:`</s>`).
- :doc:`CamemBERT <camembert>` is a wrapper around RoBERTa. Refer to this page for usage examples.
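
As a rough illustration of the tip about segments, the sketch below mimics in plain Python how one or two segments
are laid out with the separation token and why no segment ids are needed. It is not the library's implementation: the
helper names ``build_inputs`` and ``segment_ids`` are hypothetical, the special-token ids (``0`` for ``<s>``, ``2`` for
``</s>``) are assumed to match the ``roberta-base`` vocabulary, and the other token ids are made up.

```python
from typing import List, Optional

# Assumed special-token ids (believed to match the ``roberta-base`` vocabulary).
BOS_ID = 0  # <s>
SEP_ID = 2  # </s>, used both as separator and as end-of-sequence token


def build_inputs(ids_a: List[int], ids_b: Optional[List[int]] = None) -> List[int]:
    """Hypothetical helper: lay out one or two segments as
    ``<s> A </s>`` or ``<s> A </s></s> B </s>``."""
    if ids_b is None:
        return [BOS_ID] + ids_a + [SEP_ID]
    return [BOS_ID] + ids_a + [SEP_ID, SEP_ID] + ids_b + [SEP_ID]


def segment_ids(input_ids: List[int]) -> List[int]:
    """Hypothetical helper: every position gets segment id 0, because the
    model does not distinguish segments through ``token_type_ids``."""
    return [0] * len(input_ids)


first = [100, 101, 102]  # made-up token ids for the first segment
second = [200, 201]      # made-up token ids for the second segment

pair = build_inputs(first, second)
print(pair)               # [0, 100, 101, 102, 2, 2, 200, 201, 2]
print(segment_ids(pair))  # [0, 0, 0, 0, 0, 0, 0, 0, 0]
```

In the library itself, the tokenizer's ``build_inputs_with_special_tokens`` and
``create_token_type_ids_from_sequences`` methods play these roles.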

The original code can be found `here <https://github.com/pytorch/fairseq/tree/master/examples/roberta>`_.


RobertaConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaConfig
    :members:


RobertaTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaTokenizer
    :members: build_inputs_with_special_tokens, get_special_tokens_mask,
        create_token_type_ids_from_sequences, save_vocabulary


RobertaTokenizerFast
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaTokenizerFast
    :members: build_inputs_with_special_tokens


RobertaModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaModel
    :members: forward


RobertaForCausalLM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForCausalLM
    :members: forward


RobertaForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForMaskedLM
    :members: forward


RobertaForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForSequenceClassification
    :members: forward


RobertaForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForMultipleChoice
    :members: forward


RobertaForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForTokenClassification
    :members: forward


RobertaForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForQuestionAnswering
    :members: forward


TFRobertaModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaModel
    :members: call


TFRobertaForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaForMaskedLM
    :members: call


TFRobertaForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaForSequenceClassification
    :members: call


TFRobertaForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaForMultipleChoice
    :members: call


TFRobertaForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaForTokenClassification
    :members: call


TFRobertaForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaForQuestionAnswering
    :members: call


FlaxRobertaModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.FlaxRobertaModel
    :members: __call__