.. 
    Copyright 2020 The HuggingFace Team. All rights reserved.

    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
    the License. You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
    an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
    specific language governing permissions and limitations under the License.

XLNet
-----------------------------------------------------------------------------------------------------------------------

Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The XLNet model was proposed in `XLNet: Generalized Autoregressive Pretraining for Language Understanding
<https://arxiv.org/abs/1906.08237>`_ by Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov,
and Quoc V. Le. XLNet is an extension of the Transformer-XL model, pre-trained with an autoregressive method that
learns bidirectional contexts by maximizing the expected likelihood over all permutations of the input sequence
factorization order.
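
As a sketch of what this objective looks like, using the notation of the XLNet paper (the symbols below are taken from
the paper and are not defined elsewhere on this page): for a sequence :math:`\mathbf{x}` of length :math:`T`, let
:math:`\mathcal{Z}_T` denote the set of all permutations of the index sequence :math:`[1, \ldots, T]`. The model
parameters :math:`\theta` are then trained to maximize

.. math::

    \max_{\theta} \; \mathbb{E}_{\mathbf{z} \sim \mathcal{Z}_T}
    \left[ \sum_{t=1}^{T} \log p_{\theta}\left(x_{z_t} \mid \mathbf{x}_{\mathbf{z}_{<t}}\right) \right]

where :math:`z_t` is the :math:`t`-th element of the sampled order :math:`\mathbf{z}` and
:math:`\mathbf{x}_{\mathbf{z}_{<t}}` are the tokens that precede it in that order. Only the factorization order is
permuted; the tokens keep their original positions in the input.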

The abstract from the paper is the following:

*With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves
better performance than pretraining approaches based on autoregressive language modeling. However, relying on
corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a
pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive
pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all
permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive
formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into
pretraining. Empirically, under comparable experiment settings, XLNet outperforms BERT on 20 tasks, often by a large
margin, including question answering, natural language inference, sentiment analysis, and document ranking.*

Tips:

- The specific attention pattern can be controlled at training and test time using the :obj:`perm_mask` input.
- Because a fully autoregressive model is difficult to train over many factorization orders, XLNet is pretrained using
  only a subset of the output tokens as targets, which are selected with the :obj:`target_mapping` input.
- To use XLNet for sequential decoding (i.e. not in a fully bidirectional setting), use the :obj:`perm_mask` and
  :obj:`target_mapping` inputs to control the attention span and outputs (see the examples in
  `examples/text-generation/run_generation.py`).
- XLNet is one of the few models that has no sequence length limit.
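
The tips on :obj:`perm_mask` and :obj:`target_mapping` can be illustrated with a minimal sketch. It uses NumPy only to
show the shapes and the meaning of the entries; the helper names ``last_token_prediction_masks`` and
``perm_mask_from_order`` are hypothetical and not part of the library. The sketch assumes the documented conventions
that ``perm_mask[k, i, j] = 1`` means token ``i`` does not attend to token ``j`` in batch item ``k``, and that
``target_mapping[k, i, j] = 1`` means the ``i``-th prediction in batch item ``k`` is made on token ``j``.

.. code-block:: python

    import numpy as np


    def last_token_prediction_masks(seq_len, batch_size=1):
        """Hypothetical helper: masks for predicting only the last token (sequential decoding)."""
        # perm_mask[k, i, j] = 1 -> token i does NOT attend to token j in batch item k.
        perm_mask = np.zeros((batch_size, seq_len, seq_len), dtype=np.float32)
        perm_mask[:, :, -1] = 1.0  # no token is allowed to see the last token

        # target_mapping[k, i, j] = 1 -> the i-th prediction is made on token j.
        target_mapping = np.zeros((batch_size, 1, seq_len), dtype=np.float32)
        target_mapping[:, 0, -1] = 1.0  # a single prediction, on the last position
        return perm_mask, target_mapping


    def perm_mask_from_order(order):
        """Hypothetical helper: (seq_len, seq_len) mask for one factorization order.

        order[t] is the position predicted at step t. In this sketch a position may
        attend only to positions that come strictly earlier in the order.
        """
        order = np.asarray(order)
        rank = np.empty_like(order)
        rank[order] = np.arange(len(order))  # rank[p] = step at which position p is predicted
        # mask[i, j] = 1 when j is predicted at the same step as i or later.
        return (rank[None, :] >= rank[:, None]).astype(np.float32)


    perm_mask, target_mapping = last_token_prediction_masks(seq_len=5)
    print(perm_mask.shape, target_mapping.shape)
    print(perm_mask_from_order([2, 0, 1]))

To use such masks with a model, the arrays would be converted to tensors of the framework in use and passed as the
:obj:`perm_mask` and :obj:`target_mapping` arguments of the forward call. Marking the diagonal as masked in
``perm_mask_from_order`` is a choice made for this sketch.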

The original code can be found `here <https://github.com/zihangdai/xlnet/>`__.


XLNetConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetConfig
    :members:


XLNetTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetTokenizer
    :members: build_inputs_with_special_tokens, get_special_tokens_mask,
        create_token_type_ids_from_sequences, save_vocabulary


XLNet specific outputs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.models.xlnet.modeling_xlnet.XLNetModelOutput
    :members:

.. autoclass:: transformers.models.xlnet.modeling_xlnet.XLNetLMHeadModelOutput
    :members:

.. autoclass:: transformers.models.xlnet.modeling_xlnet.XLNetForSequenceClassificationOutput
    :members:

.. autoclass:: transformers.models.xlnet.modeling_xlnet.XLNetForMultipleChoiceOutput
    :members:

.. autoclass:: transformers.models.xlnet.modeling_xlnet.XLNetForTokenClassificationOutput
    :members:

.. autoclass:: transformers.models.xlnet.modeling_xlnet.XLNetForQuestionAnsweringSimpleOutput
    :members:

.. autoclass:: transformers.models.xlnet.modeling_xlnet.XLNetForQuestionAnsweringOutput
    :members:

.. autoclass:: transformers.models.xlnet.modeling_tf_xlnet.TFXLNetModelOutput
    :members:

.. autoclass:: transformers.models.xlnet.modeling_tf_xlnet.TFXLNetLMHeadModelOutput
    :members:

.. autoclass:: transformers.models.xlnet.modeling_tf_xlnet.TFXLNetForSequenceClassificationOutput
    :members:

.. autoclass:: transformers.models.xlnet.modeling_tf_xlnet.TFXLNetForMultipleChoiceOutput
    :members:

.. autoclass:: transformers.models.xlnet.modeling_tf_xlnet.TFXLNetForTokenClassificationOutput
    :members:

.. autoclass:: transformers.models.xlnet.modeling_tf_xlnet.TFXLNetForQuestionAnsweringSimpleOutput
    :members:


XLNetModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetModel
    :members: forward


XLNetLMHeadModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetLMHeadModel
    :members: forward


XLNetForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetForSequenceClassification
    :members: forward


XLNetForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetForMultipleChoice
    :members: forward


XLNetForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetForTokenClassification
    :members: forward


XLNetForQuestionAnsweringSimple
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetForQuestionAnsweringSimple
    :members: forward


XLNetForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetForQuestionAnswering
    :members: forward


TFXLNetModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetModel
    :members: call


TFXLNetLMHeadModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetLMHeadModel
    :members: call


TFXLNetForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetForSequenceClassification
    :members: call


TFXLNetForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetForMultipleChoice
    :members: call


TFXLNetForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetForTokenClassification
    :members: call


TFXLNetForQuestionAnsweringSimple
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetForQuestionAnsweringSimple
    :members: call