# Examples

Version 2.9 of 🤗 Transformers introduced a new [`Trainer`](https://github.com/huggingface/transformers/blob/master/src/transformers/trainer.py) class for PyTorch, and its equivalent [`TFTrainer`](https://github.com/huggingface/transformers/blob/master/src/transformers/trainer_tf.py) for TF 2.
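
To get a feel for the API outside of the example scripts, here is a minimal, illustrative sketch of a `Trainer` fine-tuning run (this is not one of the official examples; the checkpoint, dataset, and hyperparameters are arbitrary):

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Illustrative sketch only: fine-tune a small checkpoint on GLUE SST-2.
model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Tokenize the raw text column; the Trainer consumes the resulting datasets directly.
dataset = load_dataset("glue", "sst2")
encoded = dataset.map(
    lambda batch: tokenizer(batch["sentence"], truncation=True, padding="max_length"),
    batched=True,
)

training_args = TrainingArguments(
    output_dir="./sst2-output",        # checkpoints and the final model go here
    num_train_epochs=1,
    per_device_train_batch_size=16,
    logging_steps=100,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
)

trainer.train()
trainer.evaluate()
```
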
Running the examples requires PyTorch 1.3.1+ or TensorFlow 2.2+.

Here is the list of all our examples:
- **grouped by task** (all official examples work for multiple models),
- with information on whether they are **built on top of `Trainer`/`TFTrainer`** (if not, they still work, they might
  just lack some features),
- whether or not they leverage the [🤗 Datasets](https://github.com/huggingface/datasets) library,
- links to **Colab notebooks** to walk through the scripts and run them easily,
- links to **Cloud deployments** to be able to deploy large-scale trainings in the Cloud with little to no setup.


## Important note

To make sure you can successfully run the latest versions of the example scripts, you have to **install the library from source** and install some example-specific requirements.
Execute the following steps in a new virtual environment:

```bash
git clone https://github.com/huggingface/transformers
cd transformers
pip install .
pip install -r ./examples/requirements.txt
```

Alternatively, you can check out the version of the examples that matches your installed version of 🤗 Transformers (for instance, for v3.4.0):
```bash
git checkout tags/v3.4.0
```

## The Big Table of Tasks

| Task | Example datasets | Trainer support | TFTrainer support | 🤗 Datasets | Colab |
|---|---|:---:|:---:|:---:|:---:|
| [**`language-modeling`**](https://github.com/huggingface/transformers/tree/master/examples/language-modeling)       | Raw text        | ✅ | -  | ✅ | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/blog/blob/master/notebooks/01_how_to_train.ipynb)
| [**`text-classification`**](https://github.com/huggingface/transformers/tree/master/examples/text-classification)   | GLUE, XNLI      | ✅ | ✅ | ✅ | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/huggingface/notebooks/blob/master/examples/text_classification.ipynb)
| [**`token-classification`**](https://github.com/huggingface/transformers/tree/master/examples/token-classification) | CoNLL NER       | ✅ | ✅ | - | -
| [**`multiple-choice`**](https://github.com/huggingface/transformers/tree/master/examples/multiple-choice)           | SWAG, RACE, ARC | ✅ | ✅ | - | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ViktorAlm/notebooks/blob/master/MPC_GPU_Demo_for_TF_and_PT.ipynb)
| [**`question-answering`**](https://github.com/huggingface/transformers/tree/master/examples/question-answering)     | SQuAD           | ✅ | ✅ | - | -
| [**`text-generation`**](https://github.com/huggingface/transformers/tree/master/examples/text-generation)           | -               | n/a | n/a | - | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/blog/blob/master/notebooks/02_how_to_generate.ipynb)
| [**`distillation`**](https://github.com/huggingface/transformers/tree/master/examples/distillation)                 | All             | - | -  | - | -
| [**`summarization`**](https://github.com/huggingface/transformers/tree/master/examples/seq2seq)                     | CNN/Daily Mail  | ✅  | - | - | -
| [**`translation`**](https://github.com/huggingface/transformers/tree/master/examples/seq2seq)                       | WMT             | ✅  | - | - | -
| [**`bertology`**](https://github.com/huggingface/transformers/tree/master/examples/bertology)                       | -               | - | - | - | -
| [**`adversarial`**](https://github.com/huggingface/transformers/tree/master/examples/adversarial)                   | HANS            | ✅ | - | - | -


<br>

## One-click Deploy to Cloud (wip)

**Coming soon!**

## Running on TPUs

When using TensorFlow, TPUs are supported out of the box as a `tf.distribute.Strategy`.
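
`TFTrainer` builds that strategy for you; if you write your own TF loop instead, a minimal sketch of connecting to a TPU (assuming a TPU runtime is available; the checkpoint name is arbitrary) looks roughly like this:

```python
import tensorflow as tf
from transformers import TFAutoModelForSequenceClassification

# Sketch: resolve and initialize the TPU, then create the model under the strategy
# scope so its variables are placed on the TPU cores.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.experimental.TPUStrategy(resolver)

with strategy.scope():
    model = TFAutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
```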

When using PyTorch, we support TPUs thanks to `pytorch/xla`. For more context and information on how to set up your TPU environment, refer to Google's documentation and to the
very detailed [pytorch/xla README](https://github.com/pytorch/xla/blob/master/README.md).

In this repo, we provide a very simple launcher script named [xla_spawn.py](https://github.com/huggingface/transformers/tree/master/examples/xla_spawn.py) that lets you run our example scripts on multiple TPU cores without any boilerplate.
Just pass a `--num_cores` flag to this script, then your regular training script with its arguments (this is similar to the `torch.distributed.launch` helper for `torch.distributed`).
Note that this approach does not work for examples that use `pytorch-lightning`.

For example, for `run_glue`:

```bash
python examples/xla_spawn.py --num_cores 8 \
	examples/text-classification/run_glue.py \
	--model_name_or_path bert-base-cased \
	--task_name mnli \
	--data_dir ./data/glue_data/MNLI \
	--output_dir ./models/tpu \
	--overwrite_output_dir \
	--do_train \
	--do_eval \
	--num_train_epochs 1 \
	--save_steps 20000
```

Feedback and more use cases and benchmarks involving TPUs are welcome; please share with the community.

## Logging & Experiment tracking

You can easily log and monitor your training runs. The following integrations are currently supported:

* [TensorBoard](https://www.tensorflow.org/tensorboard)
* [Weights & Biases](https://docs.wandb.com/library/integrations/huggingface)
* [Comet ML](https://www.comet.ml/docs/python-sdk/huggingface/)
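
For TensorBoard specifically, `Trainer` writes event files while training; as a rough sketch (assuming `tensorboard` is installed; the directory names are arbitrary), `logging_dir` in `TrainingArguments` controls where they end up:

```python
from transformers import TrainingArguments

# Sketch: TensorBoard event files are written to `logging_dir` during training.
training_args = TrainingArguments(
    output_dir="./results",
    logging_dir="./logs",    # inspect with `tensorboard --logdir ./logs`
    logging_steps=50,
)
```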

### Weights & Biases

To use Weights & Biases, install the `wandb` package with:

```bash
pip install wandb
```

Then log in from the command line:

```bash
wandb login
```

If you are in Jupyter or Colab, you should log in with:

```python
import wandb
wandb.login()
```

Whenever you use the `Trainer` or `TFTrainer` classes, your losses, evaluation metrics, model topology and gradients (for `Trainer` only) will automatically be logged.
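
To group runs under a specific W&B project instead of the default, one option (a sketch; the project name is arbitrary) is to set the standard `WANDB_PROJECT` environment variable before training starts:

```python
import os

# Sketch: wandb reads this environment variable when the run is initialized,
# so set it before Trainer/TFTrainer starts training.
os.environ["WANDB_PROJECT"] = "my-transformers-experiments"
```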

When using 🤗 Transformers with PyTorch Lightning, runs can be tracked through `WandbLogger`. Refer to related [documentation & examples](https://docs.wandb.com/library/integrations/lightning).

### Comet.ml

To use `comet_ml`, install the Python package with:

```bash
pip install comet_ml
```

or if in a Conda environment:

```bash
conda install -c comet_ml -c anaconda -c conda-forge comet_ml
```