".github/vscode:/vscode.git/clone" did not exist on "ea4c284a48fc80f978cef63dc002be6909ed152c"
converting_tensorflow_models.rst 7.72 KB
Newer Older
Sylvain Gugger's avatar
Sylvain Gugger committed
1
2
3
4
5
6
7
8
9
10
11
12
.. 
    Copyright 2020 The HuggingFace Team. All rights reserved.

    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
    the License. You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
    an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
    specific language governing permissions and limitations under the License.

Converting TensorFlow Checkpoints
=======================================================================================================================

A command-line interface is provided to convert original Bert/GPT/GPT-2/Transformer-XL/XLNet/XLM checkpoints into
models that can be loaded using the ``from_pretrained`` methods of the library.
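
Once a checkpoint has been converted, it can be loaded like any other pretrained model. A minimal sketch (the
``/path/to/converted`` directory is a placeholder, assumed to contain the converted weights together with matching
``config.json`` and vocabulary files):

.. code-block:: python

   from transformers import AutoModel, AutoTokenizer

   # "/path/to/converted" is a placeholder for a directory that holds the
   # converted weights plus the matching configuration and vocabulary files.
   model = AutoModel.from_pretrained("/path/to/converted")
   tokenizer = AutoTokenizer.from_pretrained("/path/to/converted")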

.. note::
    Since version 2.3.0 the conversion script is part of the transformers CLI (**transformers-cli**), available in any
    transformers >= 2.3.0 installation.

    The documentation below reflects the **transformers-cli convert** command format.

BERT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can convert any TensorFlow checkpoint for BERT (in particular `the pre-trained models released by Google
<https://github.com/google-research/bert#pre-trained-models>`_\ ) into a PyTorch save file by using the
`convert_bert_original_tf_checkpoint_to_pytorch.py
<https://github.com/huggingface/transformers/blob/master/src/transformers/convert_bert_original_tf_checkpoint_to_pytorch.py>`_
script.

This CLI takes as input a TensorFlow checkpoint (three files starting with ``bert_model.ckpt``\ ) and the associated
configuration file (\ ``bert_config.json``\ ), and creates a PyTorch model for this configuration, loads the weights
from the TensorFlow checkpoint into the PyTorch model and saves the resulting model in a standard PyTorch save file that
can be imported using ``torch.load()`` (see examples in `run_bert_extract_features.py
<https://github.com/huggingface/pytorch-pretrained-BERT/tree/master/examples/run_bert_extract_features.py>`_\ ,
`run_bert_classifier.py
<https://github.com/huggingface/pytorch-pretrained-BERT/tree/master/examples/run_bert_classifier.py>`_ and
`run_bert_squad.py <https://github.com/huggingface/pytorch-pretrained-BERT/tree/master/examples/run_bert_squad.py>`_\
).

You only need to run this conversion script **once** to get a PyTorch model. You can then disregard the TensorFlow
checkpoint (the three files starting with ``bert_model.ckpt``\ ) but be sure to keep the configuration file (\
``bert_config.json``\ ) and the vocabulary file (\ ``vocab.txt``\ ) as these are needed for the PyTorch model too.

To run this specific conversion script you will need to have TensorFlow and PyTorch installed (\ ``pip install
tensorflow``\ ). The rest of the repository only requires PyTorch.

Here is an example of the conversion process for a pre-trained ``BERT-Base Uncased`` model:

.. code-block:: shell

   export BERT_BASE_DIR=/path/to/bert/uncased_L-12_H-768_A-12

   transformers-cli convert --model_type bert \
     --tf_checkpoint $BERT_BASE_DIR/bert_model.ckpt \
     --config $BERT_BASE_DIR/bert_config.json \
     --pytorch_dump_output $BERT_BASE_DIR/pytorch_model.bin
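
The resulting ``pytorch_model.bin`` can then be reloaded in Python. A minimal sketch, assuming the dump matches
``BertForPreTraining`` (the class the conversion script instantiates); paths follow the ``$BERT_BASE_DIR`` layout
above:

.. code-block:: python

   import torch
   from transformers import BertConfig, BertForPreTraining

   # Paths follow the $BERT_BASE_DIR layout used in the command above.
   config = BertConfig.from_json_file("/path/to/bert/uncased_L-12_H-768_A-12/bert_config.json")
   model = BertForPreTraining(config)
   model.load_state_dict(torch.load("/path/to/bert/uncased_L-12_H-768_A-12/pytorch_model.bin"))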

You can download Google's pre-trained models for the conversion `here
<https://github.com/google-research/bert#pre-trained-models>`__.

ALBERT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Convert TensorFlow model checkpoints of ALBERT to PyTorch using the
`convert_albert_original_tf_checkpoint_to_pytorch.py
<https://github.com/huggingface/transformers/blob/master/src/transformers/convert_albert_original_tf_checkpoint_to_pytorch.py>`_
script.

The CLI takes as input a TensorFlow checkpoint (three files starting with ``model.ckpt-best``\ ) and the accompanying
configuration file (\ ``albert_config.json``\ ), then creates and saves a PyTorch model. To run this conversion you
will need to have TensorFlow and PyTorch installed.

Here is an example of the conversion process for the pre-trained ``ALBERT Base`` model:

.. code-block:: shell

   export ALBERT_BASE_DIR=/path/to/albert/albert_base

   transformers-cli convert --model_type albert \
     --tf_checkpoint $ALBERT_BASE_DIR/model.ckpt-best \
     --config $ALBERT_BASE_DIR/albert_config.json \
     --pytorch_dump_output $ALBERT_BASE_DIR/pytorch_model.bin
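
As with BERT, the dump can be reloaded and, if you prefer, re-saved in the standard ``save_pretrained`` layout so that
``from_pretrained`` can later load the directory directly. A minimal sketch, assuming the dump matches
``AlbertForPreTraining`` (the output directory name is an assumption):

.. code-block:: python

   import torch
   from transformers import AlbertConfig, AlbertForPreTraining

   # Paths follow the $ALBERT_BASE_DIR layout used in the command above.
   config = AlbertConfig.from_json_file("/path/to/albert/albert_base/albert_config.json")
   model = AlbertForPreTraining(config)
   model.load_state_dict(torch.load("/path/to/albert/albert_base/pytorch_model.bin"))

   # Re-save config and weights in the standard layout; "converted" is a
   # hypothetical directory name.
   model.save_pretrained("/path/to/albert/albert_base/converted")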

You can download Google's pre-trained models for the conversion `here
<https://github.com/google-research/albert#pre-trained-models>`__.

OpenAI GPT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained OpenAI GPT model, assuming that your NumPy checkpoint
is saved in the same format as the OpenAI pretrained model (see `here
<https://github.com/openai/finetune-transformer-lm>`__\ ):

.. code-block:: shell

   export OPENAI_GPT_CHECKPOINT_FOLDER_PATH=/path/to/openai/pretrained/numpy/weights

   transformers-cli convert --model_type gpt \
     --tf_checkpoint $OPENAI_GPT_CHECKPOINT_FOLDER_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--config OPENAI_GPT_CONFIG] \
     [--finetuning_task_name OPENAI_GPT_FINETUNED_TASK]


OpenAI GPT-2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained OpenAI GPT-2 model (see `here
<https://github.com/openai/gpt-2>`__\ ):

.. code-block:: shell

   export OPENAI_GPT2_CHECKPOINT_PATH=/path/to/gpt2/pretrained/weights

   transformers-cli convert --model_type gpt2 \
     --tf_checkpoint $OPENAI_GPT2_CHECKPOINT_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--config OPENAI_GPT2_CONFIG] \
     [--finetuning_task_name OPENAI_GPT2_FINETUNED_TASK]

Transformer-XL
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained Transformer-XL model (see `here
<https://github.com/kimiyoung/transformer-xl/tree/master/tf#obtain-and-evaluate-pretrained-sota-models>`__\ ):

.. code-block:: shell

   export TRANSFO_XL_CHECKPOINT_FOLDER_PATH=/path/to/transfo/xl/checkpoint

   transformers-cli convert --model_type transfo_xl \
     --tf_checkpoint $TRANSFO_XL_CHECKPOINT_FOLDER_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--config TRANSFO_XL_CONFIG] \
     [--finetuning_task_name TRANSFO_XL_FINETUNED_TASK]


XLNet
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained XLNet model:

.. code-block:: shell

   export XLNET_CHECKPOINT_PATH=/path/to/xlnet/checkpoint
   export XLNET_CONFIG_PATH=/path/to/xlnet/config

   transformers-cli convert --model_type xlnet \
     --tf_checkpoint $XLNET_CHECKPOINT_PATH \
     --config $XLNET_CONFIG_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--finetuning_task_name XLNET_FINETUNED_TASK]


XLM
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained XLM model:

.. code-block:: shell

   export XLM_CHECKPOINT_PATH=/path/to/xlm/checkpoint

   transformers-cli convert --model_type xlm \
     --tf_checkpoint $XLM_CHECKPOINT_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--config XLM_CONFIG] \
     [--finetuning_task_name XLM_FINETUNED_TASK]