ALBERT
-----------------------------------------------------------------------------------------------------------------------

Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ALBERT model was proposed in `ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
<https://arxiv.org/abs/1909.11942>`__ by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma,
and Radu Soricut. It presents two parameter-reduction techniques to lower memory consumption and increase the training
speed of BERT:

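- Splitting the embedding matrix into two smaller matrices (factorized embedding parameterization).
- Sharing parameters across repeating layers, which are split among groups (cross-layer parameter sharing).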
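
As a rough illustration of the first technique, the sketch below compares the parameter count of one
vocabulary-by-hidden embedding matrix with that of two smaller matrices. The sizes ``vocab_size``, ``embedding_size``
and ``hidden_size`` and both helper functions are assumptions made for this example, not values read from a
checkpoint; position and token-type embeddings are ignored.

```python
# Illustrative sizes only (assumed for this sketch, not read from a real checkpoint).
vocab_size = 30_000     # V: number of tokens in the vocabulary
embedding_size = 128    # E: size of the small embedding space
hidden_size = 768       # H: size of the encoder hidden states


def full_embedding_params(v: int, h: int) -> int:
    """Parameters of a single V x H embedding matrix."""
    return v * h


def factorized_embedding_params(v: int, e: int, h: int) -> int:
    """Parameters of a V x E embedding matrix followed by an E x H projection."""
    return v * e + e * h


full = full_embedding_params(vocab_size, hidden_size)
factorized = factorized_embedding_params(vocab_size, embedding_size, hidden_size)

print(f"single matrix:  {full:,} parameters")
print(f"two matrices:   {factorized:,} parameters")
print(f"reduction:      {full / factorized:.2f}x")
```

The saving grows with the gap between ``hidden_size`` and ``embedding_size``: the vocabulary only multiplies the
small dimension, while the projection to the hidden size is paid once.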

The abstract from the paper is the following:

*Increasing model size when pretraining natural language representations often results in improved performance on
downstream tasks. However, at some point further model increases become harder due to GPU/TPU memory limitations,
longer training times, and unexpected model degradation. To address these problems, we present two parameter-reduction
techniques to lower memory consumption and increase the training speed of BERT. Comprehensive empirical evidence shows
that our proposed methods lead to models that scale much better compared to the original BERT. We also use a
self-supervised loss that focuses on modeling inter-sentence coherence, and show it consistently helps downstream tasks
with multi-sentence inputs. As a result, our best model establishes new state-of-the-art results on the GLUE, RACE, and
SQuAD benchmarks while having fewer parameters compared to BERT-large.*

Tips:

- ALBERT is a model with absolute position embeddings, so it's usually advised to pad the inputs on the right rather
  than the left.
- ALBERT uses repeating layers, which results in a small memory footprint. However, the computational cost remains
  similar to a BERT-like architecture with the same number of hidden layers, because the model still has to iterate
  through the same number of (repeating) layers.
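
The right-padding advice in the first tip can be sketched in plain Python as follows. The helper ``pad_right``, the
token ids and the default ``pad_id=0`` are assumptions made for this illustration; in practice the tokenizer performs
the padding and supplies the actual pad token id.

```python
from typing import List, Tuple


def pad_right(token_ids: List[int], max_length: int, pad_id: int = 0) -> Tuple[List[int], List[int]]:
    """Pad a sequence on the right and build the matching attention mask.

    Real tokens keep positions 0..n-1, which is what a model with absolute
    position embeddings saw for them during pretraining.
    """
    if len(token_ids) > max_length:
        raise ValueError("sequence is longer than max_length")
    n_pad = max_length - len(token_ids)
    input_ids = token_ids + [pad_id] * n_pad
    attention_mask = [1] * len(token_ids) + [0] * n_pad  # 1 = real token, 0 = padding
    return input_ids, attention_mask


# Hypothetical token ids, used only to show the layout.
ids, mask = pad_right([2, 10975, 3], max_length=5)
print(ids)   # [2, 10975, 3, 0, 0]
print(mask)  # [1, 1, 1, 0, 0]
```

Padding on the left would instead shift every real token to a later position index, so the same sentence would be
embedded differently depending on how much padding its batch required.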

The original code can be found `here <https://github.com/google-research/ALBERT>`__.

AlbertConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.AlbertConfig
    :members:


AlbertTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.AlbertTokenizer
    :members: build_inputs_with_special_tokens, get_special_tokens_mask,
        create_token_type_ids_from_sequences, save_vocabulary


Albert-specific outputs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.modeling_albert.AlbertForPreTrainingOutput
    :members:

.. autoclass:: transformers.modeling_tf_albert.TFAlbertForPreTrainingOutput
    :members:


AlbertModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.AlbertModel
    :members: forward


AlbertForPreTraining
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.AlbertForPreTraining
    :members: forward


AlbertForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.AlbertForMaskedLM
    :members: forward


AlbertForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.AlbertForSequenceClassification
    :members: forward


AlbertForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.AlbertForMultipleChoice
    :members:


AlbertForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.AlbertForTokenClassification
    :members: forward


AlbertForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.AlbertForQuestionAnswering
    :members: forward


TFAlbertModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFAlbertModel
    :members: call


TFAlbertForPreTraining
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFAlbertForPreTraining
    :members: call


TFAlbertForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFAlbertForMaskedLM
    :members: call


TFAlbertForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFAlbertForSequenceClassification
    :members: call


TFAlbertForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFAlbertForMultipleChoice
    :members: call


TFAlbertForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFAlbertForTokenClassification
    :members: call


TFAlbertForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFAlbertForQuestionAnswering
    :members: call