RoBERTa
-----------------------------------------------------------------------------------------------------------------------

Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The RoBERTa model was proposed in `RoBERTa: A Robustly Optimized BERT Pretraining Approach
<https://arxiv.org/abs/1907.11692>`_ by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer
Levy, Mike Lewis, Luke Zettlemoyer and Veselin Stoyanov. It is based on Google's BERT model released in 2018.

RoBERTa keeps BERT's architecture but modifies key pretraining hyperparameters: it removes the next-sentence
prediction objective and trains with much larger mini-batches and learning rates.

The abstract from the paper is the following:

*Language model pretraining has led to significant performance gains but careful comparison between different
approaches is challenging. Training is computationally expensive, often done on private datasets of different sizes,
and, as we will show, hyperparameter choices have significant impact on the final results. We present a replication
study of BERT pretraining (Devlin et al., 2019) that carefully measures the impact of many key hyperparameters and
training data size. We find that BERT was significantly undertrained, and can match or exceed the performance of
every model published after it. Our best model achieves state-of-the-art results on GLUE, RACE and SQuAD. These
results highlight the importance of previously overlooked design choices, and raise questions about the source
of recently reported improvements. We release our models and code.*

Tips:

- This implementation is the same as :class:`~transformers.BertModel` with a tiny embeddings tweak as well as a
  setup for RoBERTa pretrained models.
- RoBERTa has the same architecture as BERT, but uses a byte-level BPE as a tokenizer (same as GPT-2) and uses a
  different pretraining scheme.
- RoBERTa doesn't use :obj:`token_type_ids`, so you don't need to indicate which token belongs to which segment. Just
  separate your segments with the separation token :obj:`tokenizer.sep_token` (or :obj:`</s>`).
- :doc:`CamemBERT <camembert>` is a wrapper around RoBERTa. Refer to this page for usage examples.
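
As a rough illustration of the segment-handling tip above, the sketch below mimics in plain Python the layout that
:obj:`RobertaTokenizer.build_inputs_with_special_tokens` documents for a pair of segments
(``<s> A </s></s> B </s>``). The helper names and the hard-coded special-token ids are assumptions made for this
example only; real ids come from the tokenizer's vocabulary.

.. code-block:: python

    from typing import List, Optional

    # Illustrative ids only (assumption); look them up on the real tokenizer,
    # e.g. via tokenizer.cls_token_id and tokenizer.sep_token_id.
    CLS_ID = 0  # "<s>"
    SEP_ID = 2  # "</s>"


    def build_inputs(ids_a: List[int], ids_b: Optional[List[int]] = None) -> List[int]:
        """Hypothetical helper: add special tokens around one or two segments."""
        if ids_b is None:
            # Single segment: <s> A </s>
            return [CLS_ID] + ids_a + [SEP_ID]
        # Pair of segments: <s> A </s></s> B </s>
        return [CLS_ID] + ids_a + [SEP_ID, SEP_ID] + ids_b + [SEP_ID]


    def segment_ids(ids_a: List[int], ids_b: Optional[List[int]] = None) -> List[int]:
        """Hypothetical helper: no segment information is encoded, so every position is 0."""
        return [0] * len(build_inputs(ids_a, ids_b))


    pair = build_inputs([10, 11], [20])
    print(pair)                         # special tokens wrapped around both segments
    print(segment_ids([10, 11], [20]))  # all zeros, same length as `pair`

In practice the tokenizer does this for you when it is called with two texts; the sketch only shows why no
segment ids have to be supplied.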

The original code can be found `here <https://github.com/pytorch/fairseq/tree/master/examples/roberta>`_.


RobertaConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaConfig
    :members:


RobertaTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaTokenizer
    :members: build_inputs_with_special_tokens, get_special_tokens_mask,
        create_token_type_ids_from_sequences, save_vocabulary


RobertaTokenizerFast
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaTokenizerFast
    :members: build_inputs_with_special_tokens


RobertaModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaModel
    :members: forward


RobertaForCausalLM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForCausalLM
    :members: forward


RobertaForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForMaskedLM
    :members: forward


RobertaForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForSequenceClassification
    :members: forward


RobertaForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForMultipleChoice
    :members: forward


RobertaForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForTokenClassification
    :members: forward


RobertaForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.RobertaForQuestionAnswering
    :members: forward


TFRobertaModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaModel
    :members: call


TFRobertaForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaForMaskedLM
    :members: call


TFRobertaForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaForSequenceClassification
    :members: call


TFRobertaForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaForMultipleChoice
    :members: call


TFRobertaForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaForTokenClassification
    :members: call


TFRobertaForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaForQuestionAnswering
    :members: call