".github/vscode:/vscode.git/clone" did not exist on "e7bee85df8ab3e7bf4a2bd5bd78592c811e94a34"
converting_tensorflow_models.rst 7.12 KB
Newer Older
thomwolf's avatar
thomwolf committed
1
Converting TensorFlow Checkpoints
=======================================================================================================================

A command-line interface is provided to convert original BERT/GPT/GPT-2/Transformer-XL/XLNet/XLM checkpoints to models that can be loaded using the ``from_pretrained`` methods of the library.

.. note::
    Since version 2.3.0 the conversion script is part of the transformers CLI (**transformers-cli**),
    available in any transformers >= 2.3.0 installation.

    The documentation below reflects the **transformers-cli convert** command format.
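
Once converted, a checkpoint can be loaded back with ``from_pretrained``. Here is a minimal sketch, assuming a BERT
checkpoint converted as in the next section, with ``bert_config.json`` renamed to ``config.json`` in the dump
directory (the file name ``from_pretrained`` looks for); the path is illustrative:

.. code-block:: python

   from transformers import BertModel

   # The dump directory should contain config.json (renamed from
   # bert_config.json), pytorch_model.bin and vocab.txt.
   model = BertModel.from_pretrained("/path/to/bert/uncased_L-12_H-768_A-12")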

BERT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can convert any TensorFlow checkpoint for BERT (in particular `the pre-trained models released by Google <https://github.com/google-research/bert#pre-trained-models>`_\ ) to a PyTorch save file by using the `convert_bert_original_tf_checkpoint_to_pytorch.py <https://github.com/huggingface/transformers/blob/master/src/transformers/convert_bert_original_tf_checkpoint_to_pytorch.py>`_ script.

This CLI takes as input a TensorFlow checkpoint (three files starting with ``bert_model.ckpt``\ ) and the associated configuration file (\ ``bert_config.json``\ ), creates a PyTorch model for this configuration, loads the weights from the TensorFlow checkpoint into the PyTorch model and saves the resulting model in a standard PyTorch save file that can be imported using ``torch.load()`` (see examples in `run_bert_extract_features.py <https://github.com/huggingface/pytorch-pretrained-BERT/tree/master/examples/run_bert_extract_features.py>`_\ , `run_bert_classifier.py <https://github.com/huggingface/pytorch-pretrained-BERT/tree/master/examples/run_bert_classifier.py>`_ and `run_bert_squad.py <https://github.com/huggingface/pytorch-pretrained-BERT/tree/master/examples/run_bert_squad.py>`_\ ).

You only need to run this conversion script **once** to get a PyTorch model. You can then disregard the TensorFlow checkpoint (the three files starting with ``bert_model.ckpt``\ ) but be sure to keep the configuration file (\ ``bert_config.json``\ ) and the vocabulary file (\ ``vocab.txt``\ ) as these are needed for the PyTorch model too.

To run this specific conversion script you will need to have TensorFlow and PyTorch installed (\ ``pip install tensorflow``\ ). The rest of the repository only requires PyTorch.

Here is an example of the conversion process for a pre-trained ``BERT-Base Uncased`` model:

.. code-block:: shell

   export BERT_BASE_DIR=/path/to/bert/uncased_L-12_H-768_A-12

   transformers-cli convert --model_type bert \
     --tf_checkpoint $BERT_BASE_DIR/bert_model.ckpt \
     --config $BERT_BASE_DIR/bert_config.json \
     --pytorch_dump_output $BERT_BASE_DIR/pytorch_model.bin

You can download Google's pre-trained models for the conversion `here <https://github.com/google-research/bert#pre-trained-models>`__.
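
If you want to work with the save file directly rather than through ``from_pretrained``, here is a minimal loading
sketch. It assumes the checkpoint was produced by the script above, which saves the state dict of a
``BertForPreTraining`` model; the paths reuse the example's:

.. code-block:: python

   import torch

   from transformers import BertConfig, BertForPreTraining

   # Rebuild the model from the original configuration file, then load the
   # converted weights into it.
   config = BertConfig.from_json_file("/path/to/bert/uncased_L-12_H-768_A-12/bert_config.json")
   model = BertForPreTraining(config)
   model.load_state_dict(torch.load("/path/to/bert/uncased_L-12_H-768_A-12/pytorch_model.bin"))
   model.eval()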

ALBERT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Convert TensorFlow model checkpoints of ALBERT to PyTorch using the `convert_albert_original_tf_checkpoint_to_pytorch.py <https://github.com/huggingface/transformers/blob/master/src/transformers/convert_albert_original_tf_checkpoint_to_pytorch.py>`_ script.

The CLI takes as input a TensorFlow checkpoint (three files starting with ``model.ckpt-best``\ ) and the accompanying configuration file (\ ``albert_config.json``\ ), then creates and saves a PyTorch model. To run this conversion you will need to have TensorFlow and PyTorch installed.

Here is an example of the conversion process for the pre-trained ``ALBERT Base`` model:

.. code-block:: shell

   export ALBERT_BASE_DIR=/path/to/albert/albert_base

   transformers-cli convert --model_type albert \
     --tf_checkpoint $ALBERT_BASE_DIR/model.ckpt-best \
     --config $ALBERT_BASE_DIR/albert_config.json \
     --pytorch_dump_output $ALBERT_BASE_DIR/pytorch_model.bin

You can download Google's pre-trained models for the conversion `here <https://github.com/google-research/albert#pre-trained-models>`__.
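
To sanity-check this conversion (or any of the following ones), you can inspect the resulting save file with plain
PyTorch: it is simply a state dict mapping parameter names to tensors. A small sketch, reusing the path from the
ALBERT example above:

.. code-block:: python

   import torch

   # Load the converted checkpoint on CPU and print a few parameter names and
   # shapes to verify that the conversion produced weights.
   state_dict = torch.load("/path/to/albert/albert_base/pytorch_model.bin", map_location="cpu")
   for name, tensor in list(state_dict.items())[:5]:
       print(name, tuple(tensor.shape))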

OpenAI GPT
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained OpenAI GPT model, assuming that your NumPy checkpoint is saved in the same format as the OpenAI pretrained model (see `here <https://github.com/openai/finetune-transformer-lm>`__\ ):

.. code-block:: shell

   export OPENAI_GPT_CHECKPOINT_FOLDER_PATH=/path/to/openai/pretrained/numpy/weights

   transformers-cli convert --model_type gpt \
     --tf_checkpoint $OPENAI_GPT_CHECKPOINT_FOLDER_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--config OPENAI_GPT_CONFIG] \
     [--finetuning_task_name OPENAI_GPT_FINETUNED_TASK]

OpenAI GPT-2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained OpenAI GPT-2 model (see `here <https://github.com/openai/gpt-2>`__\ ):

.. code-block:: shell

   export OPENAI_GPT2_CHECKPOINT_PATH=/path/to/gpt2/pretrained/weights

   transformers-cli convert --model_type gpt2 \
     --tf_checkpoint $OPENAI_GPT2_CHECKPOINT_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--config OPENAI_GPT2_CONFIG] \
     [--finetuning_task_name OPENAI_GPT2_FINETUNED_TASK]

Transformer-XL
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained Transformer-XL model (see `here <https://github.com/kimiyoung/transformer-xl/tree/master/tf#obtain-and-evaluate-pretrained-sota-models>`__\ ):

.. code-block:: shell

   export TRANSFO_XL_CHECKPOINT_FOLDER_PATH=/path/to/transfo/xl/checkpoint

   transformers-cli convert --model_type transfo_xl \
     --tf_checkpoint $TRANSFO_XL_CHECKPOINT_FOLDER_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--config TRANSFO_XL_CONFIG] \
     [--finetuning_task_name TRANSFO_XL_FINETUNED_TASK]


XLNet
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained XLNet model:

.. code-block:: shell

   export XLNET_CHECKPOINT_PATH=/path/to/xlnet/checkpoint
   export XLNET_CONFIG_PATH=/path/to/xlnet/config

   transformers-cli convert --model_type xlnet \
     --tf_checkpoint $XLNET_CHECKPOINT_PATH \
     --config $XLNET_CONFIG_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--finetuning_task_name XLNET_FINETUNED_TASK]


XLM
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained XLM model:

.. code-block:: shell

   export XLM_CHECKPOINT_PATH=/path/to/xlm/checkpoint

   transformers-cli convert --model_type xlm \
     --tf_checkpoint $XLM_CHECKPOINT_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--config XLM_CONFIG] \
     [--finetuning_task_name XLM_FINETUNED_TASK]