XLNet
-----------------------------------------------------------------------------------------------------------------------

Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The XLNet model was proposed in `XLNet: Generalized Autoregressive Pretraining for Language Understanding
<https://arxiv.org/abs/1906.08237>`_ by Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov,
and Quoc V. Le. XLNet is an extension of the Transformer-XL model, pre-trained with an autoregressive method that
learns bidirectional contexts by maximizing the expected likelihood over all permutations of the input sequence's
factorization order.
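
To make the permutation objective more concrete, here is a minimal sketch in plain Python. It is not part of the
library, and the helper name ``contexts_for_order`` is hypothetical. It enumerates every factorization order of a
three-token sequence and lists which tokens each position may condition on. Enumeration is only practical for toy
inputs; the paper samples a factorization order rather than visiting all of them.

```python
from itertools import permutations


def contexts_for_order(tokens, order):
    """Map each position to the tokens it may condition on under `order`.

    `order` is a permutation of positions. The token at order[t] is
    predicted from the tokens at order[:t], wherever they sit in the text.
    (Hypothetical helper, for illustration only.)
    """
    return {
        pos: [tokens[p] for p in sorted(order[:t])]
        for t, pos in enumerate(order)
    }


tokens = ["New", "York", "is"]
orders = list(permutations(range(len(tokens))))

for order in orders:
    print(order, contexts_for_order(tokens, order))

# Collect every context the middle token ("York") sees across all orders.
seen = {tuple(contexts_for_order(tokens, order)[1]) for order in orders}
print(seen)
```

Across the six orders, the middle token is predicted from no context, from its left neighbour only, from its right
neighbour only, and from both, which is how an autoregressive objective ends up covering bidirectional context.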

The abstract from the paper is the following:

*With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves
better performance than pretraining approaches based on autoregressive language modeling. However, relying on
corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a
pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive
pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over
all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive
formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model,
into pretraining. Empirically, under comparable experiment settings, XLNet outperforms BERT on 20 tasks, often by
a large margin, including question answering, natural language inference, sentiment analysis, and document ranking.*

Tips:

- The specific attention pattern can be controlled at training and test time using the :obj:`perm_mask` input.
- Because training a fully autoregressive model over many factorization orders is difficult, XLNet is pretrained
  using only a subset of the output tokens as targets, which are selected with the :obj:`target_mapping` input.
- To use XLNet for sequential decoding (i.e. not in a fully bidirectional setting), use the :obj:`perm_mask` and
  :obj:`target_mapping` inputs to control the attention span and outputs (see the examples in
  `examples/text-generation/run_generation.py`).
- XLNet is one of the few models that has no sequence length limit.
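
The roles of :obj:`perm_mask` and :obj:`target_mapping` in the tips above can be sketched without any framework.
The helpers below are hypothetical and build plain nested lists. The sketch assumes the convention that
``perm_mask[i][j] = 1.0`` means position ``i`` does not attend to position ``j``, and that each row of
``target_mapping`` is one-hot at a position to predict. The model itself expects float tensors with an extra leading
batch dimension, which this sketch omits.

```python
def build_perm_mask(order):
    """Build a mask from a factorization order (hypothetical helper).

    perm_mask[i][j] == 1.0 means position i may NOT attend to position j.
    A position only sees positions that come strictly earlier in `order`,
    so it is also masked from itself.
    """
    n = len(order)
    rank = {pos: t for t, pos in enumerate(order)}
    return [
        [0.0 if rank[j] < rank[i] else 1.0 for j in range(n)]
        for i in range(n)
    ]


def build_target_mapping(targets, seq_len):
    """Row k is one-hot at the position of the k-th token to predict."""
    return [
        [1.0 if j == pos else 0.0 for j in range(seq_len)]
        for pos in targets
    ]


def next_token_inputs(seq_len):
    """Inputs for sequential decoding of the last position (assumed setup).

    Every position is blocked from seeing the last slot, and the single
    prediction target is that last slot.
    """
    last = seq_len - 1
    perm_mask = [
        [1.0 if j == last else 0.0 for j in range(seq_len)]
        for _ in range(seq_len)
    ]
    return perm_mask, build_target_mapping([last], seq_len)


# Order (2, 0, 1): position 2 is predicted first, then 0, then 1.
print(build_perm_mask([2, 0, 1]))
print(next_token_inputs(3))
```

For decoding, the last position would typically hold a placeholder token whose prediction is read from the single
row selected by ``target_mapping``.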

The original code can be found `here <https://github.com/zihangdai/xlnet/>`__.


XLNetConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetConfig
    :members:


XLNetTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetTokenizer
    :members: build_inputs_with_special_tokens, get_special_tokens_mask,
        create_token_type_ids_from_sequences, save_vocabulary


XLNet specific outputs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.modeling_xlnet.XLNetModelOutput
    :members:

.. autoclass:: transformers.modeling_xlnet.XLNetLMHeadModelOutput
    :members:

.. autoclass:: transformers.modeling_xlnet.XLNetForSequenceClassificationOutput
    :members:

.. autoclass:: transformers.modeling_xlnet.XLNetForMultipleChoiceOutput
    :members:

.. autoclass:: transformers.modeling_xlnet.XLNetForTokenClassificationOutput
    :members:

.. autoclass:: transformers.modeling_xlnet.XLNetForQuestionAnsweringSimpleOutput
    :members:

.. autoclass:: transformers.modeling_xlnet.XLNetForQuestionAnsweringOutput
    :members:

.. autoclass:: transformers.modeling_tf_xlnet.TFXLNetModelOutput
    :members:

.. autoclass:: transformers.modeling_tf_xlnet.TFXLNetLMHeadModelOutput
    :members:

.. autoclass:: transformers.modeling_tf_xlnet.TFXLNetForSequenceClassificationOutput
    :members:

.. autoclass:: transformers.modeling_tf_xlnet.TFXLNetForMultipleChoiceOutput
    :members:

.. autoclass:: transformers.modeling_tf_xlnet.TFXLNetForTokenClassificationOutput
    :members:

.. autoclass:: transformers.modeling_tf_xlnet.TFXLNetForQuestionAnsweringSimpleOutput
    :members:


XLNetModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetModel
    :members: forward


XLNetLMHeadModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetLMHeadModel
    :members: forward


XLNetForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetForSequenceClassification
    :members: forward


XLNetForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetForMultipleChoice
    :members: forward


XLNetForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetForTokenClassification
    :members: forward


XLNetForQuestionAnsweringSimple
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetForQuestionAnsweringSimple
    :members: forward


XLNetForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetForQuestionAnswering
    :members: forward


TFXLNetModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetModel
    :members: call


TFXLNetLMHeadModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetLMHeadModel
    :members: call


TFXLNetForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetForSequenceClassification
    :members: call


TFXLNetForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetForMultipleChoice
    :members: call


TFXLNetForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetForTokenClassification
    :members: call


TFXLNetForQuestionAnsweringSimple
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetForQuestionAnsweringSimple
    :members: call