BERT
----------------------------------------------------

Overview
~~~~~~~~~~~~~~~~~~~~~

The BERT model was proposed in `BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding <https://arxiv.org/abs/1810.04805>`__
by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova. It is a bidirectional transformer
pre-trained with a combination of masked language modeling and next sentence prediction objectives
on a large corpus comprising the Toronto Book Corpus and Wikipedia.

The abstract from the paper is the following:

*We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations
from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional
representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result,
the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models
for a wide range of tasks, such as question answering and language inference, without substantial task-specific
architecture modifications.*

*BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural
language processing tasks, including pushing the GLUE score to 80.5% (7.7% point absolute improvement), MultiNLI
accuracy to 86.7% (4.6% absolute improvement), SQuAD v1.1 question answering Test F1 to 93.2 (1.5 point absolute
improvement) and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute improvement).*

Tips:

- BERT is a model with absolute position embeddings, so it is usually advised to pad the inputs on
  the right rather than on the left.
- BERT was trained with a masked language modeling (MLM) objective. It is therefore efficient at predicting masked
  tokens and at NLU in general, but is not optimal for text generation. Models trained with a causal language
  modeling (CLM) objective are better in that regard.
- Alongside MLM, BERT was trained with a next sentence prediction (NSP) objective that uses the [CLS] token as an
  approximate summary of the sequence. You can use this token (the first token in a sequence built with special
  tokens) to get a sequence-level prediction rather than a token-level prediction. However, averaging over the
  sequence may yield better results than using the [CLS] token alone, as sketched in the example below.
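
The snippet below is a minimal sketch (not part of the original reference) illustrating the last two tips with PyTorch
and the ``bert-base-uncased`` checkpoint: it extracts the hidden state of the [CLS] token and compares it with a simple
average over all token states as a sequence-level representation. It assumes a version of the library in which
tokenizers are directly callable and can return PyTorch tensors via ``return_tensors="pt"``.

.. code-block:: python

    import torch
    from transformers import BertTokenizer, BertModel

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer("BERT is a bidirectional transformer.", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    last_hidden_state = outputs[0]                        # (batch_size, sequence_length, hidden_size)
    cls_representation = last_hidden_state[:, 0]          # hidden state of the [CLS] token (first position)
    mean_representation = last_hidden_state.mean(dim=1)   # average over all tokens, often a stronger sequence summary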

BertConfig
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertConfig
    :members:


BertTokenizer
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertTokenizer
    :members: build_inputs_with_special_tokens, get_special_tokens_mask,
        create_token_type_ids_from_sequences, save_vocabulary

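As a quick illustration (a sketch, not part of the reference above), encoding a sentence pair shows how the tokenizer
adds the [CLS] and [SEP] special tokens and builds ``token_type_ids`` to distinguish the two segments. It assumes a
version of the library where tokenizers are directly callable (otherwise ``encode_plus`` can be used instead).

.. code-block:: python

    from transformers import BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

    # Encoding a pair of sentences adds [CLS]/[SEP] and segment ids.
    encoding = tokenizer("How old are you?", "I am six years old.")
    print(tokenizer.convert_ids_to_tokens(encoding["input_ids"]))
    print(encoding["token_type_ids"])  # 0 for the first segment, 1 for the second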

BertModel
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertModel
    :members:

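A minimal sketch of a forward pass (illustrative, assuming callable tokenizers and PyTorch tensors): the first element
of the output is the last hidden state for every token, the second is the pooled output derived from the [CLS] token.

.. code-block:: python

    import torch
    from transformers import BertTokenizer, BertModel

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    last_hidden_state = outputs[0]  # (batch_size, sequence_length, hidden_size)
    pooler_output = outputs[1]      # (batch_size, hidden_size), computed from the [CLS] token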

BertForPreTraining
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForPreTraining
    :members:


BertForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForMaskedLM
    :members:

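For illustration, here is a hedged sketch (not from the original reference) of filling in a masked token: the first
element of the output contains prediction scores over the vocabulary for every position, and the highest-scoring token
at the [MASK] position is decoded back to text.

.. code-block:: python

    import torch
    from transformers import BertTokenizer, BertForMaskedLM

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForMaskedLM.from_pretrained("bert-base-uncased")

    inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs)[0]  # (batch_size, sequence_length, vocab_size)

    # Find the position(s) of [MASK] and take the highest-scoring token there.
    mask_positions = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    predicted_ids = logits[0, mask_positions].argmax(dim=-1)
    print(tokenizer.decode(predicted_ids))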

BertForNextSentencePrediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForNextSentencePrediction
    :members:

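A sketch of scoring whether a second sentence follows a first one (illustrative, assuming callable tokenizers): the
model returns a pair of logits per example, where index 0 corresponds to "the second sentence is a continuation" and
index 1 to "it is not".

.. code-block:: python

    import torch
    from transformers import BertTokenizer, BertForNextSentencePrediction

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased")

    first = "The sky is blue."
    second = "Pizza is my favourite food."
    inputs = tokenizer(first, second, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs)[0]  # (batch_size, 2)

    # Index 0: "second sentence follows the first", index 1: "it does not".
    print(torch.softmax(logits, dim=-1))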

BertForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForSequenceClassification
    :members:

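A hedged sketch of fine-tuning-style usage (the label and ``num_labels`` value here are hypothetical): passing
``labels`` makes the model return the classification loss first, followed by the logits. Note that the classification
head on top of a generic ``bert-base-uncased`` checkpoint is randomly initialized until fine-tuned.

.. code-block:: python

    import torch
    from transformers import BertTokenizer, BertForSequenceClassification

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

    inputs = tokenizer("This movie was great!", return_tensors="pt")
    labels = torch.tensor([1])  # hypothetical "positive" label

    outputs = model(**inputs, labels=labels)
    loss, logits = outputs[0], outputs[1]
    loss.backward()  # in a training loop, an optimizer step would follow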

BertForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForMultipleChoice
    :members:


BertForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForTokenClassification
    :members:

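A sketch for token-level tasks such as NER (illustrative only; the untrained head below outputs arbitrary labels until
fine-tuned, and the ``num_labels`` value is hypothetical): the model returns one score per label for every token.

.. code-block:: python

    import torch
    from transformers import BertTokenizer, BertForTokenClassification

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForTokenClassification.from_pretrained("bert-base-uncased", num_labels=9)  # hypothetical label count

    inputs = tokenizer("Hugging Face is based in New York City", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs)[0]  # (batch_size, sequence_length, num_labels)

    predicted_label_ids = logits.argmax(dim=-1)  # one label id per token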

BertForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForQuestionAnswering
    :members:

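A sketch of extractive question answering (illustrative only; a checkpoint fine-tuned on SQuAD would be needed for
meaningful answers, since the generic checkpoint below has a randomly initialized QA head): the model returns start
and end logits over the tokens, and the answer span is recovered by decoding the tokens between the two argmax
positions.

.. code-block:: python

    import torch
    from transformers import BertTokenizer, BertForQuestionAnswering

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForQuestionAnswering.from_pretrained("bert-base-uncased")

    question = "Who proposed BERT?"
    context = "BERT was proposed by researchers at Google AI Language."
    inputs = tokenizer(question, context, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    start_logits, end_logits = outputs[0], outputs[1]

    start = start_logits.argmax()
    end = end_logits.argmax()
    answer_ids = inputs["input_ids"][0][start : end + 1]
    print(tokenizer.decode(answer_ids))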

TFBertModel
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertModel
    :members:

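The TensorFlow classes mirror the PyTorch ones. A minimal sketch (again assuming callable tokenizers): only
``return_tensors`` changes, and no gradient-disabling context is needed for a plain forward pass.

.. code-block:: python

    from transformers import BertTokenizer, TFBertModel

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = TFBertModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer("Hello, my dog is cute", return_tensors="tf")
    outputs = model(inputs)  # the TF models accept a dict of input tensors

    last_hidden_state = outputs[0]  # (batch_size, sequence_length, hidden_size)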

TFBertForPreTraining
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForPreTraining
    :members:


TFBertForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForMaskedLM
    :members:


TFBertForNextSentencePrediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForNextSentencePrediction
    :members:


TFBertForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForSequenceClassification
    :members:


TFBertForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForMultipleChoice
    :members:


TFBertForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForTokenClassification
    :members:


TFBertForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForQuestionAnswering
    :members: