<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Pipelines for inference

The [`pipeline`] makes it simple to use any model from the [Hub](https://huggingface.co/models) for inference on any language, computer vision, speech, and multimodal tasks. Even if you don't have experience with a specific modality or aren't familiar with the underlying code behind the models, you can still use them for inference with the [`pipeline`]! This tutorial will teach you to:

* Use a [`pipeline`] for inference.
* Use a specific tokenizer or model.
* Use a [`pipeline`] for audio, vision, and multimodal tasks.

<Tip>

Take a look at the [`pipeline`] documentation for a complete list of supported tasks and available parameters.

</Tip>
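
One commonly used parameter is `device`, which controls where inference runs. A minimal sketch, assuming PyTorch is installed and a GPU is available at index 0:

```py
>>> from transformers import pipeline

>>> # device=-1 (the default) runs on CPU; device=0 places the model on the first GPU
>>> generator = pipeline(task="text-generation", device=0)  # doctest: +SKIP
```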

## Pipeline usage

While each task has an associated [`pipeline`], it is simpler to use the general [`pipeline`] abstraction which contains all the task-specific pipelines. The [`pipeline`] automatically loads a default model and a preprocessing class capable of inference for your task.

1. Start by creating a [`pipeline`] and specify an inference task:

```py
>>> from transformers import pipeline

>>> generator = pipeline(task="text-generation")
```

2. Pass your input text to the [`pipeline`]:

```py
>>> generator(
...     "Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone"
... )  # doctest: +SKIP
[{'generated_text': 'Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone, Seven for the Iron-priests at the door to the east, and thirteen for the Lord Kings at the end of the mountain'}]
```

If you have more than one input, pass your input as a list:

```py
>>> generator(
...     [
...         "Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone",
...         "Nine for Mortal Men, doomed to die, One for the Dark Lord on his dark throne",
...     ]
... )  # doctest: +SKIP
```

Any additional parameters for your task can also be passed to the [`pipeline`]. The `text-generation` task has a [`~generation_utils.GenerationMixin.generate`] method with several parameters for controlling the output. For example, if you want to generate more than one output, set the `num_return_sequences` parameter:

```py
>>> generator(
...     "Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone",
...     num_return_sequences=2,
... )  # doctest: +SKIP
```
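
Other parameters accepted by [`~generation_utils.GenerationMixin.generate`] can be passed the same way. As a sketch, assuming default settings otherwise, `max_length` caps the total length of the generated sequence and `do_sample` switches from greedy decoding to sampling:

```py
>>> generator(
...     "Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone",
...     max_length=50,
...     do_sample=True,
... )  # doctest: +SKIP
```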

### Choose a model and tokenizer

The [`pipeline`] accepts any model from the [Hub](https://huggingface.co/models). There are tags on the Hub that allow you to filter for a model you'd like to use for your task. Once you've picked an appropriate model, load it with the corresponding `AutoModelFor` and [`AutoTokenizer`] classes. For example, load the [`AutoModelForCausalLM`] class for a causal language modeling task:

```py
>>> from transformers import AutoTokenizer, AutoModelForCausalLM

>>> tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
>>> model = AutoModelForCausalLM.from_pretrained("distilgpt2")
```

Create a [`pipeline`] for your task, and specify the model and tokenizer you've loaded:

```py
>>> from transformers import pipeline

>>> generator = pipeline(task="text-generation", model=model, tokenizer=tokenizer)
```

Pass your input text to the [`pipeline`] to generate some text:

```py
>>> generator(
...     "Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone"
... )  # doctest: +SKIP
[{'generated_text': 'Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone, Seven for the Dragon-lords (for them to rule in a world ruled by their rulers, and all who live within the realm'}]
```
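
Loading the model and tokenizer yourself is only necessary when you want to customize them. If the defaults are fine, you can pass the checkpoint name as a string and let the [`pipeline`] load both for you:

```py
>>> from transformers import pipeline

>>> # equivalent to loading distilgpt2 and its tokenizer manually
>>> generator = pipeline(task="text-generation", model="distilgpt2")
```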

## Audio pipeline

The [`pipeline`] also supports audio tasks like audio classification and automatic speech recognition.

For example, let's classify the emotion in this audio clip:

```py
>>> from datasets import load_dataset
>>> import torch

>>> torch.manual_seed(42)  # doctest: +IGNORE_RESULT
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_demo", "clean", split="validation")
>>> audio_file = ds[0]["audio"]["path"]
```

Find an [audio classification](https://huggingface.co/models?pipeline_tag=audio-classification) model on the Model Hub for emotion recognition and load it in the [`pipeline`]:

```py
>>> from transformers import pipeline

>>> audio_classifier = pipeline(
...     task="audio-classification", model="ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition"
... )
```

Pass the audio file to the [`pipeline`]:

```py
>>> preds = audio_classifier(audio_file)
>>> preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
>>> preds
[{'score': 0.1315, 'label': 'calm'}, {'score': 0.1307, 'label': 'neutral'}, {'score': 0.1274, 'label': 'sad'}, {'score': 0.1261, 'label': 'fearful'}, {'score': 0.1242, 'label': 'happy'}]
```
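
The input doesn't have to be a file path; the pipeline also accepts a raw waveform. A minimal sketch, assuming the clip's sampling rate matches the 16kHz rate this model expects:

```py
>>> # the datasets Audio feature also exposes the decoded waveform as a NumPy array
>>> preds = audio_classifier(ds[0]["audio"]["array"])  # doctest: +SKIP
```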

## Vision pipeline

Using a [`pipeline`] for vision tasks is practically identical to the text workflow above.

Specify your task and pass your image to the classifier. The image can be a URL or a local path. For example, what species of cat is shown below?

![pipeline-cat-chonk](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg)

```py
>>> from transformers import pipeline

>>> vision_classifier = pipeline(task="image-classification")
>>> preds = vision_classifier(
...     images="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
... )
>>> preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
>>> preds
[{'score': 0.4335, 'label': 'lynx, catamount'}, {'score': 0.0348, 'label': 'cougar, puma, catamount, mountain lion, painter, panther, Felis concolor'}, {'score': 0.0324, 'label': 'snow leopard, ounce, Panthera uncia'}, {'score': 0.0239, 'label': 'Egyptian cat'}, {'score': 0.0229, 'label': 'tiger cat'}]
```
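
By default, the image classification pipeline returns the five highest-scoring labels. To return a different number, set the `top_k` parameter:

```py
>>> preds = vision_classifier(
...     images="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg",
...     top_k=2,
... )  # doctest: +SKIP
```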

## Multimodal pipeline

The [`pipeline`] supports more than one modality. For example, a visual question answering (VQA) task combines text and image. Feel free to use any image you like and a question you want to ask about it. The image can be a URL or a local path.

For example, if you use the same image from the vision pipeline above:

```py
>>> image = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
>>> question = "Where is the cat?"
```

Create a pipeline for `vqa` and pass it the image and question:

```py
>>> from transformers import pipeline

>>> vqa = pipeline(task="vqa")
>>> preds = vqa(image=image, question=question)
>>> preds = [{"score": round(pred["score"], 4), "answer": pred["answer"]} for pred in preds]
>>> preds
[{'score': 0.911, 'answer': 'snow'}, {'score': 0.8786, 'answer': 'in snow'}, {'score': 0.6714, 'answer': 'outside'}, {'score': 0.0293, 'answer': 'on ground'}, {'score': 0.0272, 'answer': 'ground'}]
```
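
Note that `vqa` is shorthand; the full task name works just as well:

```py
>>> from transformers import pipeline

>>> # equivalent to task="vqa"
>>> vqa = pipeline(task="visual-question-answering")
```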