.. 
    Copyright 2020 The HuggingFace Team. All rights reserved.

    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
    the License. You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
    an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
    specific language governing permissions and limitations under the License.

DistilBERT
-----------------------------------------------------------------------------------------------------------------------

Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The DistilBERT model was proposed in the blog post `Smaller, faster, cheaper, lighter: Introducing DistilBERT, a
distilled version of BERT <https://medium.com/huggingface/distilbert-8cf3380435b5>`__, and the paper `DistilBERT, a
distilled version of BERT: smaller, faster, cheaper and lighter <https://arxiv.org/abs/1910.01108>`__. DistilBERT is a
small, fast, cheap and light Transformer model trained by distilling BERT base. It has 40% fewer parameters than
`bert-base-uncased` and runs 60% faster while preserving over 95% of BERT's performance as measured on the GLUE
language understanding benchmark.
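
The size difference can be checked with a back-of-the-envelope count. The sketch below is plain Python and does not
use the library. The layer counts, hidden sizes and vocabulary size are assumptions taken from the commonly published
base configurations (12 layers with token-type embeddings and a pooler for BERT base, 6 layers without either for
DistilBERT); they are not stated on this page, and the helper names are hypothetical.

```python
# Rough parameter count for a BERT-style encoder, in plain Python.
# All sizes below are assumptions based on the commonly published base
# configurations; the helper names are hypothetical.


def encoder_layer_params(hidden: int, ffn: int) -> int:
    """Parameters (weights and biases) of one Transformer encoder layer."""
    attention = 4 * (hidden * hidden + hidden)  # query, key, value, output projections
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    layer_norms = 2 * (2 * hidden)  # two LayerNorms, scale and bias each
    return attention + feed_forward + layer_norms


def model_params(layers, hidden=768, ffn=3072, vocab=30522, positions=512,
                 token_types=0, pooler=False):
    """Total parameters of the embeddings, the encoder stack and an optional pooler."""
    embeddings = vocab * hidden + positions * hidden + token_types * hidden
    embeddings += 2 * hidden  # embedding LayerNorm
    total = embeddings + layers * encoder_layer_params(hidden, ffn)
    if pooler:
        total += hidden * hidden + hidden  # dense pooler on the first token
    return total


bert_base = model_params(layers=12, token_types=2, pooler=True)
distilbert = model_params(layers=6)
reduction = 1 - distilbert / bert_base

print(f"BERT base:  {bert_base:,}")
print(f"DistilBERT: {distilbert:,}")
print(f"Reduction:  {reduction:.1%}")
```

Under these assumptions the counts come out to 109,482,240 and 66,362,880 parameters, a reduction of roughly 39%,
which is consistent with the 40% figure quoted above. Note that the shared word embeddings account for a large part
of the remaining parameters, so halving the number of layers does not halve the model.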

The abstract from the paper is the following:

*As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP),
operating these large models in on-the-edge and/or under constrained computational training or inference budgets
remains challenging. In this work, we propose a method to pre-train a smaller general-purpose language representation
model, called DistilBERT, which can then be fine-tuned with good performances on a wide range of tasks like its larger
counterparts. While most prior work investigated the use of distillation for building task-specific models, we leverage
knowledge distillation during the pretraining phase and show that it is possible to reduce the size of a BERT model by
40%, while retaining 97% of its language understanding capabilities and being 60% faster. To leverage the inductive
biases learned by larger models during pretraining, we introduce a triple loss combining language modeling,
distillation and cosine-distance losses. Our smaller, faster and lighter model is cheaper to pre-train and we
demonstrate its capabilities for on-device computations in a proof-of-concept experiment and a comparative on-device
study.*
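
The triple loss mentioned in the abstract can be sketched for a single masked position as follows. This is a minimal
illustration in plain Python, not the training code: the function names, the equal loss weights and the temperature
of 2.0 are assumptions chosen for readability, and the multiplication of the distillation term by the squared
temperature follows a common knowledge-distillation convention rather than anything stated on this page.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mlm_loss(student_logits, target_index):
    """Language modeling term: cross-entropy against the true masked token."""
    return -math.log(softmax(student_logits)[target_index])


def distillation_loss(student_logits, teacher_logits, temperature):
    """Distillation term: KL(teacher || student) on temperature-softened outputs."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2  # assumed scaling convention


def cosine_loss(student_hidden, teacher_hidden):
    """Cosine-distance term: 1 - cos(student hidden state, teacher hidden state)."""
    dot = sum(s * t for s, t in zip(student_hidden, teacher_hidden))
    norm_s = math.sqrt(sum(s * s for s in student_hidden))
    norm_t = math.sqrt(sum(t * t for t in teacher_hidden))
    return 1.0 - dot / (norm_s * norm_t)


def triple_loss(student_logits, teacher_logits, student_hidden, teacher_hidden,
                target_index, alpha_mlm=1.0, alpha_distill=1.0, alpha_cos=1.0,
                temperature=2.0):
    """Weighted sum of the three terms; the weights are illustrative only."""
    return (alpha_mlm * mlm_loss(student_logits, target_index)
            + alpha_distill * distillation_loss(student_logits, teacher_logits, temperature)
            + alpha_cos * cosine_loss(student_hidden, teacher_hidden))


student_logits = [2.0, 0.5, -1.0]
teacher_logits = [0.1, 3.0, 0.2]
student_hidden = [0.3, -0.2, 0.9]
teacher_hidden = [-0.3, 0.2, 0.5]

print(triple_loss(student_logits, teacher_logits, student_hidden, teacher_hidden, target_index=0))
```

When the student reproduces the teacher's logits and hidden states exactly, the distillation and cosine terms vanish
and only the language modeling term remains.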

Tips:

- DistilBERT doesn't have :obj:`token_type_ids`, so you don't need to indicate which token belongs to which segment.
  Just separate your segments with the separation token :obj:`tokenizer.sep_token` (or :obj:`[SEP]`).
- DistilBERT doesn't have an option to select the input positions (the :obj:`position_ids` input). This could be added
  if necessary; just let us know if you need this option.
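
The input layout described in the first tip can be pictured with a toy example. The sketch below is plain Python and
is not the library tokenizer: :obj:`build_inputs` and the tiny vocabulary are hypothetical, and the segments are
assumed to be already split into tokens. It only shows that two segments are joined with :obj:`[SEP]` and that the
resulting inputs carry neither :obj:`token_type_ids` nor :obj:`position_ids`.

```python
# Toy illustration of the input layout; this is NOT the library tokenizer.
CLS, SEP = "[CLS]", "[SEP]"


def build_inputs(segments, vocab):
    """Join pre-tokenized segments with [SEP] and map every token to an id."""
    tokens = [CLS]
    for segment in segments:
        tokens.extend(segment)
        tokens.append(SEP)  # the separator is the only segment boundary marker
    input_ids = [vocab[token] for token in tokens]
    return {
        "tokens": tokens,
        "input_ids": input_ids,
        "attention_mask": [1] * len(input_ids),
    }


# Hypothetical vocabulary, just large enough for the example below.
vocab = {token: index for index, token in
         enumerate([CLS, SEP, "who", "wrote", "it", "?", "ada", "."])}

question = ["who", "wrote", "it", "?"]
context = ["ada", "wrote", "it", "."]
encoded = build_inputs([question, context], vocab)

print(" ".join(encoded["tokens"]))
print(sorted(encoded))
```

With the real tokenizer, passing a pair of texts (for example :obj:`tokenizer(text_a, text_b)`) is expected to
produce the same ``[CLS] A [SEP] B [SEP]`` layout, so the separator rarely has to be inserted by hand.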

This model was contributed by `victorsanh <https://huggingface.co/victorsanh>`__. The JAX version of this model was
contributed by `kamalkraj <https://huggingface.co/kamalkraj>`__. The original code can be found :prefix_link:`here
<examples/research-projects/distillation>`.


DistilBertConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DistilBertConfig
    :members:


DistilBertTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DistilBertTokenizer
    :members:


DistilBertTokenizerFast
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DistilBertTokenizerFast
    :members:


DistilBertModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DistilBertModel
    :members: forward


DistilBertForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DistilBertForMaskedLM
    :members: forward


DistilBertForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DistilBertForSequenceClassification
    :members: forward


DistilBertForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DistilBertForMultipleChoice
    :members: forward


DistilBertForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DistilBertForTokenClassification
    :members: forward


DistilBertForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.DistilBertForQuestionAnswering
    :members: forward

TFDistilBertModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFDistilBertModel
    :members: call


TFDistilBertForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFDistilBertForMaskedLM
    :members: call


TFDistilBertForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFDistilBertForSequenceClassification
    :members: call



TFDistilBertForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFDistilBertForMultipleChoice
    :members: call



TFDistilBertForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFDistilBertForTokenClassification
    :members: call


TFDistilBertForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFDistilBertForQuestionAnswering
    :members: call


FlaxDistilBertModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.FlaxDistilBertModel
    :members: __call__


FlaxDistilBertForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.FlaxDistilBertForMaskedLM
    :members: __call__


FlaxDistilBertForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.FlaxDistilBertForSequenceClassification
    :members: __call__


FlaxDistilBertForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.FlaxDistilBertForMultipleChoice
    :members: __call__


FlaxDistilBertForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.FlaxDistilBertForTokenClassification
    :members: __call__


FlaxDistilBertForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.FlaxDistilBertForQuestionAnswering
    :members: __call__