<!---
Copyright 2020 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Installation

🤗 Transformers is tested on Python 3.6+, and PyTorch 1.1.0+ or TensorFlow 2.0+.

You should install 🤗 Transformers in a [virtual environment](https://docs.python.org/3/library/venv.html). If you're
unfamiliar with Python virtual environments, check out the [user guide](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/). Create a virtual environment with the version of Python you're going
to use and activate it.
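
If you are unsure how, a minimal sketch using the built-in `venv` module (the activation command differs per shell and OS) looks like this:

```bash
# create a virtual environment in the .env folder and activate it
python -m venv .env
source .env/bin/activate
```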

Now, if you want to use 🤗 Transformers, you can install it with pip. If you'd like to play with the examples, you
must install it from source.

## Installation with pip

First you need to install at least one of TensorFlow 2.0, PyTorch, or Flax.
Please refer to [TensorFlow installation page](https://www.tensorflow.org/install/pip#tensorflow-2.0-rc-is-available),
[PyTorch installation page](https://pytorch.org/get-started/locally/#start-locally) and/or
[Flax installation page](https://github.com/google/flax#quick-install)
regarding the specific install command for your platform.
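
As an illustration only (the right command depends on your OS, Python version, and whether you need GPU support, so check the pages above), a CPU-only pip install of a backend could look like:

```bash
# install one backend; take the exact command from the official installation page for your platform
pip install torch
# or
pip install tensorflow
```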

When TensorFlow 2.0 and/or PyTorch has been installed, 🤗 Transformers can be installed using pip as follows:

```bash
pip install transformers
```

Alternatively, for CPU support only, you can install 🤗 Transformers and PyTorch in one line with:

```bash
pip install transformers[torch]
```

or 馃 Transformers and TensorFlow 2.0 in one line with:

```bash
pip install transformers[tf-cpu]
```

or 馃 Transformers and Flax in one line with:

```bash
pip install transformers[flax]
```

To check 馃 Transformers is properly installed, run the following command:

```bash
python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('we love you'))"
```

It should download a pretrained model and then print something like:

```bash
[{'label': 'POSITIVE', 'score': 0.9998704791069031}]
```

(Note that TensorFlow will print additional output before that last statement.)

## Installing from source

Here is how to quickly install `transformers` from source:

```bash
pip install git+https://github.com/huggingface/transformers
```

Note that this will install not the latest released version, but the bleeding edge `master` version, which you may want to use in case a bug has been fixed since the last official release and a new release hasn't yet been rolled out.

While we strive to keep `master` operational at all times, if you notice some issues, they usually get fixed within a few hours or a day. You're more than welcome to help us detect any problems by opening an [Issue](https://github.com/huggingface/transformers/issues); this way, things will get fixed even sooner.

Again, you can run:

```bash
python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('I hate you'))"
```

to check 馃 Transformers is properly installed.

## Editable install

If you want to constantly use the bleeding edge `master` version of the source code, or if you want to contribute to the library and need to test the changes in the code you're making, you will need an editable install. This is done by cloning the repository and installing with the following commands:

```bash
git clone https://github.com/huggingface/transformers.git
cd transformers
pip install -e .
```

This command performs a magical link between the folder you cloned the repository to and your Python library paths: Python will look inside this folder in addition to the normal library-wide paths. So if normally your Python packages get installed into:
```
~/anaconda3/envs/main/lib/python3.7/site-packages/
```
now this editable install will reside where you cloned the folder to, e.g. `~/transformers/`, and Python will search it too.

Do note that you have to keep that `transformers` folder around and not delete it to continue using the `transformers` library.
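
To confirm that Python is picking up the editable install rather than a previously installed copy, a quick sanity check (just an illustration) is to print where the package is imported from; the printed path should be inside the folder you cloned:

```bash
python -c "import transformers; print(transformers.__file__)"
```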

Now, let's get to the real benefit of this installation approach. Say you saw that a new feature was just committed to `master`. If you have already performed all the steps above, then to update your transformers installation to include all the latest commits, all you need to do is `cd` into the cloned repository folder and update the clone to the latest version:

```bash
cd ~/transformers/
git pull
```

There is nothing else to do. Your Python environment will find the bleeding edge version of `transformers` on the next run.
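
If you want to double-check that the update was picked up, you can print the installed version; on `master` this is typically a development version string (the exact value will differ on your machine):

```bash
python -c "import transformers; print(transformers.__version__)"
```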


## With conda

Since Transformers version v4.0.0, we now have a conda channel: `huggingface`.

馃 Transformers can be installed using conda as follows:

```bash
conda install -c huggingface transformers
```

Follow the installation pages of TensorFlow, PyTorch or Flax to see how to install them with conda.
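
For illustration only (take the exact command from those pages, as it depends on your setup), a CPU-only PyTorch install with conda might look like:

```bash
# example command; confirm the packages and channel on the official PyTorch installation page
conda install pytorch cpuonly -c pytorch
```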

## Caching models

This library provides pretrained models that will be downloaded and cached locally. Unless you specify a location with
`cache_dir=...` when you use methods like `from_pretrained`, these models will automatically be downloaded to the
folder given by the shell environment variable `TRANSFORMERS_CACHE`. The default value for it will be the Hugging
Face cache home followed by `/transformers/`. This is (by order of priority):

  * shell environment variable `HF_HOME`
  * shell environment variable `XDG_CACHE_HOME` + `/huggingface/`
  * default: `~/.cache/huggingface/`

So if you don't have any specific environment variable set, the cache directory will be at
`~/.cache/huggingface/transformers/`.
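
If you prefer to redirect the cache for a single call rather than through an environment variable, you can pass `cache_dir` directly to `from_pretrained`. A minimal sketch (the checkpoint name and path are just illustrative):

```python
from transformers import AutoModel, AutoTokenizer

# "./my_model_cache" is an example path; any writable directory works
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", cache_dir="./my_model_cache")
model = AutoModel.from_pretrained("bert-base-uncased", cache_dir="./my_model_cache")
```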

**Note:** If you have set a shell environment variable for one of the predecessors of this library
(`PYTORCH_TRANSFORMERS_CACHE` or `PYTORCH_PRETRAINED_BERT_CACHE`), those will be used if there is no shell
environment variable for `TRANSFORMERS_CACHE`.

### Note on model downloads (Continuous Integration or large-scale deployments)

If you expect to be downloading large volumes of models (more than 10,000) from huggingface.co (for instance through
your CI setup, or a large-scale production deployment), please cache the model files on your end. It will be way
faster and cheaper. Feel free to contact us privately; we'd love to help with this.

### Offline mode

It's possible to run 🤗 Transformers in a firewalled or a no-network environment.

Setting the environment variable `TRANSFORMERS_OFFLINE=1` will tell 🤗 Transformers to use local files only and not try to look anything up online.

Most likely you will want to couple this with `HF_DATASETS_OFFLINE=1`, which does the same for 🤗 Datasets if you're using the latter.
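
If you prefer to control this from code instead, `from_pretrained` also accepts a `local_files_only` argument that restricts a single call to already-cached files. A minimal sketch (`t5-small` is just an example checkpoint):

```python
from transformers import AutoModel

# load from the local cache only; this raises an error if the files were never downloaded
model = AutoModel.from_pretrained("t5-small", local_files_only=True)
```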

Here is an example of how this can be used on a filesystem that is shared between a normally networked instance and an instance that is firewalled from the external world.

On the instance with normal network access, run your program, which will download and cache models (and optionally datasets if you use 🤗 Datasets). For example:

```bash
python examples/seq2seq/run_translation.py --model_name_or_path t5-small --dataset_name wmt16 --dataset_config ro-en ...
```

Then, using the same filesystem, you can run the same program on the firewalled instance:

```bash
HF_DATASETS_OFFLINE=1 TRANSFORMERS_OFFLINE=1 \
python examples/seq2seq/run_translation.py --model_name_or_path t5-small --dataset_name wmt16 --dataset_config ro-en ...
```
and it should succeed without hanging while waiting to time out.



## Do you want to run a Transformer model on a mobile device?

You should check out our [swift-coreml-transformers](https://github.com/huggingface/swift-coreml-transformers) repo.

It contains a set of tools to convert PyTorch or TensorFlow 2.0 trained Transformer models (currently `GPT-2`,
`DistilGPT-2`, `BERT`, and `DistilBERT`) to CoreML models that run on iOS devices.

At some point in the future, you'll be able to seamlessly move from pretraining or fine-tuning models in PyTorch or
TensorFlow 2.0 to productizing them in CoreML, or prototype a model or an app in CoreML then research its
hyperparameters or architecture from PyTorch or TensorFlow 2.0. Super exciting!