BERT
----------------------------------------------------

Overview
~~~~~~~~~~~~~~~~~~~~~

The BERT model was proposed in `BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding <https://arxiv.org/abs/1810.04805>`__
by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova. It is a bidirectional transformer
pre-trained with a combination of a masked language modeling objective and a next sentence prediction
objective on a large corpus comprising the Toronto Book Corpus and Wikipedia.

The abstract from the paper is the following:

*We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations
from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional
representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result,
the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models
for a wide range of tasks, such as question answering and language inference, without substantial task-specific
architecture modifications.*

*BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural
language processing tasks, including pushing the GLUE score to 80.5% (7.7% point absolute improvement), MultiNLI
accuracy to 86.7% (4.6% absolute improvement), SQuAD v1.1 question answering Test F1 to 93.2 (1.5 point absolute
improvement) and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute improvement).*

Tips:

- BERT is a model with absolute position embeddings, so it is usually advised to pad the inputs on
  the right rather than the left.
- BERT was trained with a masked language modeling (MLM) objective. It is therefore efficient at predicting masked
  tokens and at NLU in general, but is not optimal for text generation. Models trained with a causal language
  modeling (CLM) objective are better in that regard.
- Alongside MLM, BERT was trained with a next sentence prediction (NSP) objective, using the [CLS] token as an
  approximate representation of the whole sequence. This token (the first token in a sequence built with special
  tokens) can be used to get a sequence-level prediction rather than a token-level prediction. However, averaging
  over the sequence may yield better results than using the [CLS] token.
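
The padding and pooling tips above can be illustrated without loading a model. The following is a minimal
pure-Python sketch, not library code: ``pad_right`` and ``masked_mean`` are hypothetical helpers, the token ids are
illustrative values, and the two-dimensional hidden states are made up. In practice the tokenizer handles padding
and the model returns the hidden states.

.. code-block:: python

    from typing import List, Tuple

    PAD_ID = 0  # assumption: the usual [PAD] id; check tokenizer.pad_token_id


    def pad_right(batch: List[List[int]], pad_id: int = PAD_ID) -> Tuple[List[List[int]], List[List[int]]]:
        """Right-pad every sequence to the longest one and build the attention mask."""
        max_len = max(len(seq) for seq in batch)
        input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in batch]
        attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in batch]
        return input_ids, attention_mask


    def masked_mean(hidden: List[List[float]], mask: List[int]) -> List[float]:
        """Average the token vectors of one sequence, skipping padded positions."""
        kept = [vec for vec, m in zip(hidden, mask) if m == 1]
        dim = len(hidden[0])
        return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


    # Two sequences of different lengths (illustrative ids).
    ids, mask = pad_right([[101, 7592, 102], [101, 7592, 2088, 999, 102]])
    # ids[0]  == [101, 7592, 102, 0, 0]   padding sits on the right
    # mask[0] == [1, 1, 1, 0, 0]

    # Made-up hidden states for the first sequence (5 positions, 2 dimensions).
    hidden = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [9.0, 9.0], [9.0, 9.0]]

    cls_vector = hidden[0]                    # [CLS]-based sequence representation
    pooled = masked_mean(hidden, mask[0])     # mean over real tokens: [3.0, 4.0]

Because padding is appended on the right, real tokens keep positions ``0..n-1``, which is what the absolute position
embeddings saw during pre-training. The mask keeps the padded vectors out of the average.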

The original code can be found `here <https://github.com/google-research/bert>`_.

BertConfig
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertConfig
    :members:
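
As a rough illustration of how the configuration class is typically used, the sketch below builds a default
configuration and a deliberately small one. The small sizes are arbitrary values chosen for this example, not
recommended settings.

.. code-block:: python

    from transformers import BertConfig

    # Default arguments give a BERT-base-sized architecture.
    default_config = BertConfig()
    print(default_config.hidden_size, default_config.num_hidden_layers)

    # A small configuration (arbitrary example sizes), handy for quick experiments.
    # hidden_size must be divisible by num_attention_heads.
    small_config = BertConfig(
        vocab_size=100,
        hidden_size=32,
        num_hidden_layers=2,
        num_attention_heads=2,
        intermediate_size=64,
    )

    # A configuration can be turned into a plain dictionary for inspection.
    as_dict = small_config.to_dict()

A model built from a configuration alone has randomly initialized weights. Pre-trained weights are only loaded
through ``from_pretrained``.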


BertTokenizer
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertTokenizer
    :members: build_inputs_with_special_tokens, get_special_tokens_mask,
        create_token_type_ids_from_sequences, save_vocabulary
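
The sketch below shows what three of the methods listed above are expected to return for a sentence pair. It
assumes the slow (pure-Python) tokenizer of the library version this page documents, and it writes a tiny
hypothetical vocabulary to a temporary file so that it does not depend on a downloaded checkpoint. Other versions
may behave differently.

.. code-block:: python

    import os
    import tempfile

    from transformers import BertTokenizer

    # Hypothetical toy vocabulary: one token per line, line index = token id.
    vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]", "hello", "world"]

    with tempfile.TemporaryDirectory() as tmp_dir:
        vocab_path = os.path.join(tmp_dir, "vocab.txt")
        with open(vocab_path, "w", encoding="utf-8") as f:
            f.write("\n".join(vocab) + "\n")
        # The vocabulary is read when the tokenizer is constructed.
        tokenizer = BertTokenizer(vocab_path)

    ids_a = tokenizer.convert_tokens_to_ids(["hello"])
    ids_b = tokenizer.convert_tokens_to_ids(["world"])

    # Layout for a single sequence: [CLS] A [SEP]
    single = tokenizer.build_inputs_with_special_tokens(ids_a)

    # Layout for a pair: [CLS] A [SEP] B [SEP]
    pair = tokenizer.build_inputs_with_special_tokens(ids_a, ids_b)

    # Segment ids: 0 for [CLS] A [SEP], 1 for B [SEP]
    type_ids = tokenizer.create_token_type_ids_from_sequences(ids_a, ids_b)

    # 1 marks a special token, 0 marks a regular token
    special_mask = tokenizer.get_special_tokens_mask(ids_a, ids_b)

In everyday use these methods are called internally when the tokenizer encodes text, so calling them by hand is
mostly useful for inspecting or customizing the layout.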


BertTokenizerFast
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertTokenizerFast
    :members:


BertModel
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertModel
    :members:
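
A minimal forward pass through the bare model might look like the sketch below. It assumes PyTorch is installed
and uses a small, randomly initialized configuration with made-up token ids, so the output values carry no
meaning. Only the shapes are of interest.

.. code-block:: python

    import torch
    from transformers import BertConfig, BertModel

    # Small random model (arbitrary example sizes); no checkpoint is downloaded.
    config = BertConfig(
        vocab_size=100,
        hidden_size=32,
        num_hidden_layers=2,
        num_attention_heads=2,
        intermediate_size=64,
    )
    model = BertModel(config)
    model.eval()  # disable dropout

    # One right-padded sequence of made-up ids; the mask hides the padded position.
    input_ids = torch.tensor([[2, 5, 6, 3, 0]])
    attention_mask = torch.tensor([[1, 1, 1, 1, 0]])

    with torch.no_grad():
        outputs = model(input_ids=input_ids, attention_mask=attention_mask)

    sequence_output = outputs[0]  # one vector per token: (batch, seq_len, hidden_size)
    pooled_output = outputs[1]    # pooled [CLS] representation: (batch, hidden_size)

Indexing the outputs by position keeps the sketch independent of whether the model returns a tuple or an output
object.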


BertForPreTraining
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForPreTraining
    :members:


BertForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForMaskedLM
    :members:
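
To show how the masked language modeling head is read, the sketch below runs a small, randomly initialized model
and picks the highest-scoring vocabulary id at a masked position. The configuration sizes, the token ids and
``MASK_ID`` are assumptions made for this example. With random weights the prediction itself is meaningless.

.. code-block:: python

    import torch
    from transformers import BertConfig, BertForMaskedLM

    config = BertConfig(
        vocab_size=100,
        hidden_size=32,
        num_hidden_layers=2,
        num_attention_heads=2,
        intermediate_size=64,
    )
    model = BertForMaskedLM(config)
    model.eval()

    MASK_ID = 4  # hypothetical id of [MASK] in a toy vocabulary
    input_ids = torch.tensor([[2, 5, MASK_ID, 6, 3]])
    mask_position = 2

    with torch.no_grad():
        # Without labels, the first output holds the prediction scores.
        logits = model(input_ids=input_ids)[0]  # (batch, seq_len, vocab_size)

    # Highest-scoring vocabulary id at the masked position.
    predicted_id = int(logits[0, mask_position].argmax(dim=-1))

With a pre-trained checkpoint, the same indexing gives the model's best guess for the masked token, which the
tokenizer can convert back to text.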


BertForNextSentencePrediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForNextSentencePrediction
    :members:


BertForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForSequenceClassification
    :members:
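
The sequence classification head can be sketched as follows, assuming PyTorch, a small randomly initialized
configuration and a made-up three-class task. It shows how the outputs shift when ``labels`` are passed: the loss
comes first and the logits second.

.. code-block:: python

    import torch
    from transformers import BertConfig, BertForSequenceClassification

    config = BertConfig(
        vocab_size=100,
        hidden_size=32,
        num_hidden_layers=2,
        num_attention_heads=2,
        intermediate_size=64,
        num_labels=3,  # made-up three-class task
    )
    model = BertForSequenceClassification(config)
    model.eval()

    input_ids = torch.tensor([[2, 5, 6, 3]])  # made-up token ids

    with torch.no_grad():
        # Inference: the first output holds the logits, one score per class.
        logits = model(input_ids=input_ids)[0]  # (batch, num_labels)

        # With labels, the loss is prepended to the outputs.
        labels = torch.tensor([1])
        outputs = model(input_ids=input_ids, labels=labels)
        loss, logits_with_labels = outputs[0], outputs[1]

    predicted_class = int(logits.argmax(dim=-1)[0])

During fine-tuning, the forward pass would run without ``torch.no_grad()`` and be followed by ``loss.backward()``.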


BertForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForMultipleChoice
    :members:


BertForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForTokenClassification
    :members:


BertForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForQuestionAnswering
    :members:
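
For extractive question answering, the head produces one start score and one end score per token. The sketch
below assumes PyTorch, a small randomly initialized configuration and made-up token ids, and reads an answer span
with a simple argmax. That decoding rule is a simplification chosen for this example.

.. code-block:: python

    import torch
    from transformers import BertConfig, BertForQuestionAnswering

    config = BertConfig(
        vocab_size=100,
        hidden_size=32,
        num_hidden_layers=2,
        num_attention_heads=2,
        intermediate_size=64,
    )
    model = BertForQuestionAnswering(config)
    model.eval()

    # Layout: [CLS] question [SEP] context [SEP]; segment 0 = question, 1 = context.
    input_ids = torch.tensor([[2, 5, 3, 6, 7, 3]])
    token_type_ids = torch.tensor([[0, 0, 0, 1, 1, 1]])

    with torch.no_grad():
        outputs = model(input_ids=input_ids, token_type_ids=token_type_ids)

    start_logits, end_logits = outputs[0], outputs[1]  # each (batch, seq_len)

    # Naive decoding: take the best start and the best end independently.
    start_index = int(start_logits[0].argmax())
    end_index = int(end_logits[0].argmax())

Independent argmax can give an end index before the start index, so practical decoding usually searches over
valid ``(start, end)`` pairs instead.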


TFBertModel
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertModel
    :members:


TFBertForPreTraining
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForPreTraining
    :members:


TFBertForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForMaskedLM
    :members:


TFBertForNextSentencePrediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForNextSentencePrediction
    :members:


TFBertForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForSequenceClassification
    :members:


TFBertForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForMultipleChoice
    :members:


TFBertForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForTokenClassification
    :members:


TFBertForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForQuestionAnswering
    :members: