BERT
----------------------------------------------------

Overview
~~~~~~~~~~~~~~~~~~~~~

The BERT model was proposed in `BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding <https://arxiv.org/abs/1810.04805>`__
by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova. It is a bidirectional transformer
pretrained with a combination of a masked language modeling objective and a next sentence prediction
objective on a large corpus comprising the Toronto Book Corpus and Wikipedia.

The abstract from the paper is the following:

*We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations
from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional
representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result,
the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models
for a wide range of tasks, such as question answering and language inference, without substantial task-specific
architecture modifications.*

*BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural
language processing tasks, including pushing the GLUE score to 80.5% (7.7% point absolute improvement), MultiNLI
accuracy to 86.7% (4.6% absolute improvement), SQuAD v1.1 question answering Test F1 to 93.2 (1.5 point absolute
improvement) and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute improvement).*

Tips:

- BERT is a model with absolute position embeddings so it's usually advised to pad the inputs on
  the right rather than the left.
- BERT was trained with a masked language modeling (MLM) objective. It is therefore efficient at predicting masked
  tokens and at NLU in general, but is not optimal for text generation. Models trained with a causal language
  modeling (CLM) objective are better in that regard.
- Alongside MLM, BERT was trained with a next sentence prediction (NSP) objective, which uses the [CLS] token as an
  approximate representation of the whole sequence. You can use this token (the first token in a sequence built with
  special tokens) to get a sequence-level prediction rather than a token-level one. However, averaging over the
  sequence may yield better results than using the [CLS] token.
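
The padding and pooling tips above can be illustrated with a small, library-free sketch. It is not the
``transformers`` implementation: the helper names ``pad_right``, ``cls_pool`` and ``mean_pool`` are hypothetical, the
token ids and hidden-state values are toy data, and a padding id of 0 is an assumption. Padding on the right keeps
real tokens at positions 0 to n-1, and the attention mask lets the averaging step skip padded positions.

```python
from typing import List, Tuple

PAD_ID = 0  # assumption: the padding token has id 0


def pad_right(batch: List[List[int]], pad_id: int = PAD_ID) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad token id sequences to a common length and build the attention mask."""
    max_len = max(len(ids) for ids in batch)
    input_ids = [ids + [pad_id] * (max_len - len(ids)) for ids in batch]
    attention_mask = [[1] * len(ids) + [0] * (max_len - len(ids)) for ids in batch]
    return input_ids, attention_mask


def cls_pool(hidden: List[List[float]]) -> List[float]:
    """Use the vector at position 0 (the [CLS] token) as the sequence representation."""
    return list(hidden[0])


def mean_pool(hidden: List[List[float]], mask: List[int]) -> List[float]:
    """Average the token vectors, skipping padded positions (mask == 0)."""
    n_real = sum(mask)
    dim = len(hidden[0])
    return [sum(vec[d] for vec, m in zip(hidden, mask) if m) / n_real for d in range(dim)]


# Two toy sequences of token ids (illustrative values only).
input_ids, attention_mask = pad_right([[101, 7592, 102], [101, 7592, 2088, 999, 102]])

# Toy "hidden states" for the first sequence: one 2-d vector per position.
# The last two rows correspond to padding and must not affect the average.
hidden = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [9.0, 9.0], [9.0, 9.0]]

print(input_ids[0])                          # [101, 7592, 102, 0, 0]
print(attention_mask[0])                     # [1, 1, 1, 0, 0]
print(cls_pool(hidden))                      # [1.0, 0.0]
print(mean_pool(hidden, attention_mask[0]))  # [3.0, 2.0]
```

With a real model, the same idea would apply to the last hidden states and the attention mask returned by the
tokenizer. Which of the two pooling strategies works better depends on the task.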

The original code can be found `here <https://github.com/google-research/bert>`_.

BertConfig
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertConfig
    :members:


BertTokenizer
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertTokenizer
    :members: build_inputs_with_special_tokens, get_special_tokens_mask,
        create_token_type_ids_from_sequences, save_vocabulary


BertTokenizerFast
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertTokenizerFast
    :members:


BERT-specific outputs
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.modeling_bert.BertForPretrainingOutput
    :members:


BertModel
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertModel
    :members:


BertForPreTraining
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForPreTraining
    :members:


BertForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForMaskedLM
    :members:


BertForNextSentencePrediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForNextSentencePrediction
    :members:


BertForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForSequenceClassification
    :members:


BertForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForMultipleChoice
    :members:


BertForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForTokenClassification
    :members:


BertForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForQuestionAnswering
    :members:


TFBertModel
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertModel
    :members:


TFBertForPreTraining
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForPreTraining
    :members:


TFBertForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForMaskedLM
    :members:


TFBertForNextSentencePrediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForNextSentencePrediction
    :members:


TFBertForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForSequenceClassification
    :members:


TFBertForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForMultipleChoice
    :members:


TFBertForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForTokenClassification
    :members:


TFBertForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForQuestionAnswering
    :members: