.. 
    Copyright 2020 The HuggingFace Team. All rights reserved.

    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
    the License. You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
    an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
    specific language governing permissions and limitations under the License.

RoBERTa
-----------------------------------------------------------------------------------------------------------------------

Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The RoBERTa model was proposed in `RoBERTa: A Robustly Optimized BERT Pretraining Approach
<https://arxiv.org/abs/1907.11692>`_ by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer
Levy, Mike Lewis, Luke Zettlemoyer and Veselin Stoyanov. It is based on Google's BERT model released in 2018.

It builds on BERT and modifies key hyperparameters, removing the next-sentence pretraining objective and training with
much larger mini-batches and learning rates.

The abstract from the paper is the following:

*Language model pretraining has led to significant performance gains but careful comparison between different
approaches is challenging. Training is computationally expensive, often done on private datasets of different sizes,
and, as we will show, hyperparameter choices have significant impact on the final results. We present a replication
study of BERT pretraining (Devlin et al., 2019) that carefully measures the impact of many key hyperparameters and
training data size. We find that BERT was significantly undertrained, and can match or exceed the performance of every
model published after it. Our best model achieves state-of-the-art results on GLUE, RACE and SQuAD. These results
highlight the importance of previously overlooked design choices, and raise questions about the source of recently
reported improvements. We release our models and code.*

Tips:

- This implementation is the same as :class:`~transformers.BertModel` with a tiny embeddings tweak as well as a setup
  for RoBERTa pretrained models.
- RoBERTa has the same architecture as BERT, but uses a byte-level BPE as a tokenizer (same as GPT-2) and uses a
  different pretraining scheme.
- RoBERTa doesn't have :obj:`token_type_ids`, so you don't need to indicate which token belongs to which segment. Just
  separate your segments with the separation token :obj:`tokenizer.sep_token` (or :obj:`</s>`).
- :doc:`CamemBERT <camembert>` is a wrapper around RoBERTa. Refer to this page for usage examples.
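
The segment handling described in the tips above can be sketched in plain Python. This is only an illustration of the
input layout, not the library's implementation: the helper names ``build_roberta_inputs`` and ``segment_ids`` are
hypothetical, the special-token ids ``0`` (``<s>``) and ``2`` (``</s>``) are assumed from the ``roberta-base``
vocabulary, and the token ids in the example are placeholders.

.. code-block:: python

    from typing import List, Optional

    # Assumed special-token ids, taken from the ``roberta-base`` vocabulary.
    BOS_ID = 0  # <s>
    EOS_ID = 2  # </s>, which also acts as the separator between segments


    def build_roberta_inputs(ids_a: List[int], ids_b: Optional[List[int]] = None) -> List[int]:
        """Hypothetical helper: wrap one or two tokenized segments in special tokens."""
        if ids_b is None:
            # Single segment: <s> A </s>
            return [BOS_ID] + ids_a + [EOS_ID]
        # Pair of segments: <s> A </s></s> B </s>
        return [BOS_ID] + ids_a + [EOS_ID, EOS_ID] + ids_b + [EOS_ID]


    def segment_ids(input_ids: List[int]) -> List[int]:
        """Hypothetical helper: segments are not distinguished, so every position gets 0."""
        return [0] * len(input_ids)


    question = [100, 101, 102]  # placeholder ids, not real vocabulary entries
    context = [200, 201]

    pair = build_roberta_inputs(question, context)
    print(pair)               # [0, 100, 101, 102, 2, 2, 200, 201, 2]
    print(segment_ids(pair))  # [0, 0, 0, 0, 0, 0, 0, 0, 0]

With a pretrained tokenizer, the same layout should be produced by passing both segments in a single call, for example
``tokenizer(question, context)``. That requires downloading a vocabulary, so it is not shown here.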

This model was contributed by `julien-c <https://huggingface.co/julien-c>`__. The original code can be found `here
<https://github.com/pytorch/fairseq/tree/master/examples/roberta>`_.


RobertaConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaConfig
    :members:


RobertaTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaTokenizer
    :members: build_inputs_with_special_tokens, get_special_tokens_mask,
        create_token_type_ids_from_sequences, save_vocabulary


RobertaTokenizerFast
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaTokenizerFast
    :members: build_inputs_with_special_tokens


RobertaModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaModel
    :members: forward


RobertaForCausalLM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForCausalLM
    :members: forward


RobertaForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForMaskedLM
    :members: forward


RobertaForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForSequenceClassification
    :members: forward


RobertaForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForMultipleChoice
    :members: forward


RobertaForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForTokenClassification
    :members: forward


RobertaForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForQuestionAnswering
    :members: forward


TFRobertaModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaModel
    :members: call


TFRobertaForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaForMaskedLM
    :members: call


TFRobertaForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaForSequenceClassification
    :members: call


TFRobertaForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaForMultipleChoice
    :members: call


TFRobertaForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaForTokenClassification
    :members: call


TFRobertaForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaForQuestionAnswering
    :members: call


FlaxRobertaModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.FlaxRobertaModel
    :members: __call__


FlaxRobertaForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.FlaxRobertaForMaskedLM
    :members: __call__


FlaxRobertaForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.FlaxRobertaForSequenceClassification
    :members: __call__


FlaxRobertaForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.FlaxRobertaForMultipleChoice
    :members: __call__


FlaxRobertaForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.FlaxRobertaForTokenClassification
    :members: __call__


FlaxRobertaForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.FlaxRobertaForQuestionAnswering
    :members: __call__