OpenAI GPT2
-----------------------------------------------------------------------------------------------------------------------

Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The OpenAI GPT-2 model was proposed in `Language Models are Unsupervised Multitask Learners
<https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf>`_
by Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei and Ilya Sutskever. It is a causal (unidirectional)
transformer pretrained with a language modeling objective on a very large corpus of ~40 GB of text data.

The abstract from the paper is the following:

*GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset
of 8 million web pages. GPT-2 is trained with a simple objective: predict the next word, given all of the previous
words within some text. The diversity of the dataset causes this simple goal to contain naturally occurring
demonstrations of many tasks across diverse domains. GPT-2 is a direct scale-up of GPT, with more than 10X
the parameters and trained on more than 10X the amount of data.*
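
The next-word objective described above can be illustrated with a small, library-free sketch. It is not the training
code used for GPT-2: the function name ``next_token_loss`` and the toy three-entry vocabulary are assumptions made
for illustration only. The point it shows is the one-position shift, where the scores produced at position ``t`` are
compared against the token at position ``t + 1``.

.. code-block:: python

    import math


    def next_token_loss(logits, token_ids):
        """Average cross-entropy of predicting token t+1 from the scores at position t.

        logits: one list of scores per position, with one score per vocabulary entry.
        token_ids: the integer token ids of the sequence, same length as logits.
        """
        if len(token_ids) < 2:
            raise ValueError("need at least two tokens to form a prediction target")
        total = 0.0
        # Position t predicts token t+1, so the last position has no target.
        for scores, target in zip(logits[:-1], token_ids[1:]):
            # Numerically stable log of the softmax normalizer.
            peak = max(scores)
            log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
            total += log_norm - scores[target]
        return total / (len(token_ids) - 1)


    # Hypothetical toy input: uniform scores over a vocabulary of size 3.
    uniform = [[0.0, 0.0, 0.0]] * 4
    loss = next_token_loss(uniform, [0, 2, 1, 1])
    print(round(loss, 4))  # 1.0986, i.e. log(3)

With uniform scores the model is guessing among three tokens, so the loss equals ``log(3)``. A model that puts most
of its score on the correct next token drives the loss toward zero.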

Tips:

- GPT-2 is a model with absolute position embeddings so it's usually advised to pad the inputs on
  the right rather than the left.
- GPT-2 was trained with a causal language modeling (CLM) objective and is therefore powerful at predicting the next
  token in a sequence. Leveraging this feature allows GPT-2 to generate syntactically coherent text, as
  can be observed in the `run_generation.py` example script.
- The PyTorch models can take `past` as input: the key/value attention pairs computed for the preceding tokens.
  Passing `past` prevents the model from recomputing those values during text generation.
  See `reusing the past in generative models <../quickstart.html#using-the-past>`__ for more information on the usage
  of this argument.
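
The idea behind reusing `past` can be shown with a toy, single-head causal attention written in plain Python. This is
a sketch under simplifying assumptions, not the library implementation: the helper names ``attend``, ``full_pass`` and
``cached_step`` are hypothetical, and the queries, keys and values are given directly rather than projected from
hidden states. It checks that processing one position at a time with cached key/value pairs gives the same outputs as
recomputing attention over the whole prefix.

.. code-block:: python

    import math


    def attend(query, keys, values):
        """Scaled dot-product attention of one query over the given keys/values."""
        scale = math.sqrt(len(query))
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        norm = sum(weights)
        dim = len(values[0])
        return [sum(w * v[i] for w, v in zip(weights, values)) / norm for i in range(dim)]


    def full_pass(queries, keys, values):
        """Causal attention from scratch: position t only sees positions 0..t."""
        return [attend(q, keys[: t + 1], values[: t + 1]) for t, q in enumerate(queries)]


    def cached_step(query, key, value, past):
        """Process one new position, reusing the cached (keys, values) pair in `past`."""
        past_keys, past_values = past if past is not None else ([], [])
        keys = past_keys + [key]
        values = past_values + [value]
        return attend(query, keys, values), (keys, values)


    # Hypothetical toy data: three positions, two dimensions each.
    queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    keys = [[0.5, 0.1], [0.2, 0.9], [0.7, 0.3]]
    values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

    past = None
    outputs = []
    for q, k, v in zip(queries, keys, values):
        out, past = cached_step(q, k, v, past)
        outputs.append(out)

    reference = full_pass(queries, keys, values)
    matches = all(
        math.isclose(a, b) for out, ref in zip(outputs, reference) for a, b in zip(out, ref)
    )
    print(matches)  # True

Each call to ``cached_step`` only needs the new position's query, key and value plus the cache, which is why passing
`past` avoids repeating work for tokens that were already processed.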

`Write With Transformer <https://transformer.huggingface.co/doc/gpt2-large>`__ is a webapp created and hosted by
Hugging Face showcasing the generative capabilities of several models. GPT-2 is one of them and is available in five
different sizes: small, medium, large, xl and a distilled version of the small checkpoint: `distilgpt2`.

The original code can be found `here <https://openai.com/blog/better-language-models/>`__.


GPT2Config
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.GPT2Config
    :members:


GPT2Tokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.GPT2Tokenizer
    :members: save_vocabulary


GPT2TokenizerFast
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.GPT2TokenizerFast
    :members:


GPT2 specific outputs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.modeling_gpt2.GPT2DoubleHeadsModelOutput
    :members:

.. autoclass:: transformers.modeling_tf_gpt2.TFGPT2DoubleHeadsModelOutput
    :members:


GPT2Model
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.GPT2Model
    :members: forward


GPT2LMHeadModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.GPT2LMHeadModel
    :members: forward


GPT2DoubleHeadsModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.GPT2DoubleHeadsModel
    :members: forward


GPT2ForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.GPT2ForSequenceClassification
    :members: forward


TFGPT2Model
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFGPT2Model
    :members: call


TFGPT2LMHeadModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFGPT2LMHeadModel
    :members: call


TFGPT2DoubleHeadsModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFGPT2DoubleHeadsModel
    :members: call