BERT
----------------------------------------------------

Overview
~~~~~~~~~~~~~~~~~~~~~

The BERT model was proposed in `BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding <https://arxiv.org/abs/1810.04805>`__
by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova. It's a bidirectional transformer
pre-trained using a combination of masked language modeling and next sentence prediction objectives
on a large corpus comprising the Toronto Book Corpus and Wikipedia.
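
The masking step of the MLM objective can be sketched in plain Python (``mask_for_mlm`` is a hypothetical
helper for illustration; the actual procedure selects about 15% of WordPiece tokens, and replaces a selected
token with ``[MASK]`` only 80% of the time, using a random or unchanged token otherwise):

.. code-block:: python

    import random

    MASK = "[MASK]"

    def mask_for_mlm(tokens, mask_prob=0.15, rng=None):
        """Corrupt a token sequence for masked language modeling.

        Each position is independently selected with probability
        ``mask_prob``; selected tokens are replaced by [MASK] and their
        original values become the prediction targets (None elsewhere).
        """
        rng = rng or random.Random()
        corrupted, labels = [], []
        for token in tokens:
            if rng.random() < mask_prob:
                corrupted.append(MASK)   # model must recover the original
                labels.append(token)
            else:
                corrupted.append(token)  # position is not scored
                labels.append(None)
        return corrupted, labels

The model is then trained to predict the original token at every masked position, which is what forces it
to condition on both left and right context.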

The abstract from the paper is the following:

*We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations
from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional
representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result,
the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models
for a wide range of tasks, such as question answering and language inference, without substantial task-specific
architecture modifications.*

*BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural
language processing tasks, including pushing the GLUE score to 80.5% (7.7% point absolute improvement), MultiNLI
accuracy to 86.7% (4.6% absolute improvement), SQuAD v1.1 question answering Test F1 to 93.2 (1.5 point absolute
improvement) and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute improvement).*

Tips:

- BERT is a model with absolute position embeddings so it's usually advised to pad the inputs on
  the right rather than the left.
- BERT was trained with a masked language modeling (MLM) objective. It is therefore efficient at predicting masked
  tokens and at NLU in general, but is not optimal for text generation. Models trained with a causal language
  modeling (CLM) objective are better in that regard.
- Alongside MLM, BERT was trained with a next sentence prediction (NSP) objective, using the [CLS] token as an
  approximate summary of the sequence. The user may use this token (the first token in a sequence built with special
  tokens) to get a sequence prediction rather than a token prediction. However, averaging over the sequence may yield
  better results than using the [CLS] token.
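
The right-padding advice above can be sketched as follows (``pad_right`` is a hypothetical helper shown for
illustration; in practice the tokenizer builds the padded ids and attention mask for you):

.. code-block:: python

    def pad_right(batch_ids, pad_id=0):
        """Pad token-id sequences on the right, as advised for BERT's
        absolute position embeddings, and build the matching attention
        mask (1 = real token, 0 = padding)."""
        max_len = max(len(seq) for seq in batch_ids)
        padded, mask = [], []
        for seq in batch_ids:
            n_pad = max_len - len(seq)
            padded.append(list(seq) + [pad_id] * n_pad)
            mask.append([1] * len(seq) + [0] * n_pad)
        return padded, mask

Because position embeddings are absolute, padding on the right keeps every real token at the same position
it would occupy in an unpadded sequence.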


BertConfig
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertConfig
    :members:


BertTokenizer
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertTokenizer
    :members:


BertModel
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertModel
    :members:


BertForPreTraining
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForPreTraining
    :members:


BertForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForMaskedLM
    :members:


BertForNextSentencePrediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForNextSentencePrediction
    :members:


BertForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForSequenceClassification
    :members:


BertForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForMultipleChoice
    :members:


BertForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForTokenClassification
    :members:


BertForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForQuestionAnswering
    :members:


TFBertModel
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertModel
    :members:


TFBertForPreTraining
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForPreTraining
    :members:


TFBertForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForMaskedLM
    :members:


TFBertForNextSentencePrediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForNextSentencePrediction
    :members:


TFBertForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForSequenceClassification
    :members:


TFBertForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForMultipleChoice
    :members:


TFBertForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForTokenClassification
    :members:


TFBertForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForQuestionAnswering
    :members: