"examples/seq2seq/bertabs/utils_summarization.py" did not exist on "693606a75c54d9731b748797f21961d0a5322896"
converting_tensorflow_models.rst 8.07 KB
Newer Older
Sylvain Gugger's avatar
Sylvain Gugger committed
1
2
3
4
5
6
7
8
9
10
11
12
.. 
    Copyright 2020 The HuggingFace Team. All rights reserved.

    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
    the License. You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
    an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
    specific language governing permissions and limitations under the License.

Converting TensorFlow Checkpoints
=======================================================================================================================

A command-line interface is provided to convert original BERT/GPT/GPT-2/Transformer-XL/XLNet/XLM checkpoints to models
that can be loaded using the ``from_pretrained`` methods of the library.
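
For instance, once a checkpoint has been converted with one of the commands below, the resulting folder can be loaded
back in Python. A minimal sketch, assuming the converted weights, the configuration file (renamed to ``config.json``)
and the vocabulary file all sit in a hypothetical ``/path/to/converted`` folder:

.. code-block:: python

   from transformers import BertModel, BertTokenizer

   # /path/to/converted is a placeholder: it should contain the config.json and
   # pytorch_model.bin produced by the conversion, plus the original vocab.txt.
   model = BertModel.from_pretrained("/path/to/converted")
   tokenizer = BertTokenizer.from_pretrained("/path/to/converted")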

.. note::
    Since version 2.3.0 the conversion script is part of the transformers CLI (**transformers-cli**), available in any
    transformers >= 2.3.0 installation.

    The documentation below reflects the **transformers-cli convert** command format.

BERT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can convert any TensorFlow checkpoint for BERT (in particular `the pre-trained models released by Google
<https://github.com/google-research/bert#pre-trained-models>`_\ ) into a PyTorch save file by using the
:prefix_link:`convert_bert_original_tf_checkpoint_to_pytorch.py
<src/transformers/convert_bert_original_tf_checkpoint_to_pytorch.py>` script.

This CLI takes as input a TensorFlow checkpoint (three files starting with ``bert_model.ckpt``\ ) and the associated
configuration file (\ ``bert_config.json``\ ), creates a PyTorch model for this configuration, loads the weights from
the TensorFlow checkpoint into the PyTorch model and saves the resulting model in a standard PyTorch save file that
can be imported using ``torch.load()`` (see examples in `run_bert_extract_features.py
<https://github.com/huggingface/pytorch-pretrained-BERT/tree/master/examples/run_bert_extract_features.py>`_\ ,
`run_bert_classifier.py
<https://github.com/huggingface/pytorch-pretrained-BERT/tree/master/examples/run_bert_classifier.py>`_ and
`run_bert_squad.py <https://github.com/huggingface/pytorch-pretrained-BERT/tree/master/examples/run_bert_squad.py>`_\
).

You only need to run this conversion script **once** to get a PyTorch model. You can then disregard the TensorFlow
checkpoint (the three files starting with ``bert_model.ckpt``\ ) but be sure to keep the configuration file (\
``bert_config.json``\ ) and the vocabulary file (\ ``vocab.txt``\ ) as these are needed for the PyTorch model too.

To run this specific conversion script you will need to have TensorFlow and PyTorch installed (\ ``pip install
tensorflow``\ ). The rest of the repository only requires PyTorch.

Here is an example of the conversion process for a pre-trained ``BERT-Base Uncased`` model:

.. code-block:: shell

   export BERT_BASE_DIR=/path/to/bert/uncased_L-12_H-768_A-12

   transformers-cli convert --model_type bert \
     --tf_checkpoint $BERT_BASE_DIR/bert_model.ckpt \
     --config $BERT_BASE_DIR/bert_config.json \
     --pytorch_dump_output $BERT_BASE_DIR/pytorch_model.bin
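
Once converted, you can check that the resulting file can be read back with ``torch.load()``. A minimal sketch,
reusing the paths from the example above and assuming the checkpoint was converted with the BERT script (which saves
the state dict of a ``BertForPreTraining`` model):

.. code-block:: python

   import torch

   from transformers import BertConfig, BertForPreTraining

   # Rebuild the model from the original configuration file, then load the
   # converted weights into it (paths match $BERT_BASE_DIR in the example above).
   config = BertConfig.from_json_file("/path/to/bert/uncased_L-12_H-768_A-12/bert_config.json")
   model = BertForPreTraining(config)
   state_dict = torch.load("/path/to/bert/uncased_L-12_H-768_A-12/pytorch_model.bin")
   model.load_state_dict(state_dict)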

You can download Google's pre-trained models for the conversion `here
<https://github.com/google-research/bert#pre-trained-models>`__.

ALBERT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Convert TensorFlow model checkpoints of ALBERT to PyTorch using the
:prefix_link:`convert_albert_original_tf_checkpoint_to_pytorch.py
<src/transformers/convert_albert_original_tf_checkpoint_to_pytorch.py>` script.

The CLI takes as input a TensorFlow checkpoint (three files starting with ``model.ckpt-best``\ ) and the accompanying
configuration file (\ ``albert_config.json``\ ), then creates and saves a PyTorch model. To run this conversion you
will need to have TensorFlow and PyTorch installed.

Here is an example of the conversion process for the pre-trained ``ALBERT Base`` model:

.. code-block:: shell

   export ALBERT_BASE_DIR=/path/to/albert/albert_base

   transformers-cli convert --model_type albert \
     --tf_checkpoint $ALBERT_BASE_DIR/model.ckpt-best \
     --config $ALBERT_BASE_DIR/albert_config.json \
     --pytorch_dump_output $ALBERT_BASE_DIR/pytorch_model.bin

You can download Google's pre-trained models for the conversion `here
<https://github.com/google-research/albert#pre-trained-models>`__.

OpenAI GPT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained OpenAI GPT model, assuming that your NumPy checkpoint
was saved in the same format as the OpenAI pretrained model (see `here
<https://github.com/openai/finetune-transformer-lm>`__\ ):

.. code-block:: shell

   export OPENAI_GPT_CHECKPOINT_FOLDER_PATH=/path/to/openai/pretrained/numpy/weights

   transformers-cli convert --model_type gpt \
     --tf_checkpoint $OPENAI_GPT_CHECKPOINT_FOLDER_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--config OPENAI_GPT_CONFIG] \
     [--finetuning_task_name OPENAI_GPT_FINETUNED_TASK]


OpenAI GPT-2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained OpenAI GPT-2 model (see `here
<https://github.com/openai/gpt-2>`__\ ):

.. code-block:: shell

   export OPENAI_GPT2_CHECKPOINT_PATH=/path/to/gpt2/pretrained/weights

   transformers-cli convert --model_type gpt2 \
     --tf_checkpoint $OPENAI_GPT2_CHECKPOINT_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--config OPENAI_GPT2_CONFIG] \
     [--finetuning_task_name OPENAI_GPT2_FINETUNED_TASK]

Transformer-XL
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained Transformer-XL model (see `here
<https://github.com/kimiyoung/transformer-xl/tree/master/tf#obtain-and-evaluate-pretrained-sota-models>`__\ ):

.. code-block:: shell

   export TRANSFO_XL_CHECKPOINT_FOLDER_PATH=/path/to/transfo/xl/checkpoint

   transformers-cli convert --model_type transfo_xl \
     --tf_checkpoint $TRANSFO_XL_CHECKPOINT_FOLDER_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--config TRANSFO_XL_CONFIG] \
     [--finetuning_task_name TRANSFO_XL_FINETUNED_TASK]


XLNet
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained XLNet model:

.. code-block:: shell

   export XLNET_CHECKPOINT_PATH=/path/to/xlnet/checkpoint
   export XLNET_CONFIG_PATH=/path/to/xlnet/config

   transformers-cli convert --model_type xlnet \
     --tf_checkpoint $XLNET_CHECKPOINT_PATH \
     --config $XLNET_CONFIG_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--finetuning_task_name XLNET_FINETUNED_TASK]


XLM
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained XLM model:

.. code-block:: shell

   export XLM_CHECKPOINT_PATH=/path/to/xlm/checkpoint

   transformers-cli convert --model_type xlm \
     --tf_checkpoint $XLM_CHECKPOINT_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--config XLM_CONFIG] \
     [--finetuning_task_name XLM_FINETUNED_TASK]


T5
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained T5 model:

.. code-block:: shell

   export T5=/path/to/t5/checkpoint_folder

   transformers-cli convert --model_type t5 \
     --tf_checkpoint $T5/t5_model.ckpt \
     --config $T5/t5_config.json \
     --pytorch_dump_output $T5/pytorch_model.bin