.. 
    Copyright 2020 The HuggingFace Team. All rights reserved.

    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
    the License. You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
    an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
    specific language governing permissions and limitations under the License.

Pipelines
-----------------------------------------------------------------------------------------------------------------------

Pipelines are a simple way to use models for inference. They are objects that abstract most of the complex code from
the library, offering a simple API dedicated to several tasks, including Named Entity Recognition, Masked Language
Modeling, Sentiment Analysis, Feature Extraction and Question Answering. See the
:doc:`task summary <../task_summary>` for examples of use.

There are two categories of pipeline abstractions to be aware of:

- The :func:`~transformers.pipeline` function, which is the most powerful object and encapsulates all other pipelines.
- The other task-specific pipelines:

    - :class:`~transformers.AudioClassificationPipeline`
    - :class:`~transformers.AutomaticSpeechRecognitionPipeline`
    - :class:`~transformers.ConversationalPipeline`
    - :class:`~transformers.FeatureExtractionPipeline`
    - :class:`~transformers.FillMaskPipeline`
    - :class:`~transformers.ImageClassificationPipeline`
    - :class:`~transformers.ImageSegmentationPipeline`
    - :class:`~transformers.ObjectDetectionPipeline`
    - :class:`~transformers.QuestionAnsweringPipeline`
    - :class:`~transformers.SummarizationPipeline`
    - :class:`~transformers.TableQuestionAnsweringPipeline`
    - :class:`~transformers.TextClassificationPipeline`
    - :class:`~transformers.TextGenerationPipeline`
    - :class:`~transformers.Text2TextGenerationPipeline`
    - :class:`~transformers.TokenClassificationPipeline`
    - :class:`~transformers.TranslationPipeline`
    - :class:`~transformers.ZeroShotClassificationPipeline`

The pipeline abstraction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The `pipeline` abstraction is a wrapper around all the other available pipelines. It is instantiated like any other
pipeline but requires an additional argument, the `task`.

Simple call on one item:

.. code-block::

    >>> from transformers import pipeline
    >>> pipe = pipeline("text-classification")
    >>> pipe("This restaurant is awesome")
    [{'label': 'POSITIVE', 'score': 0.9998743534088135}]

To call a pipeline on many items, you can pass them as a `list`.

.. code-block::

    >>> pipe = pipeline("text-classification")
    >>> pipe(["This restaurant is awesome", "This restaurant is aweful"])
    [{'label': 'POSITIVE', 'score': 0.9998743534088135},
     {'label': 'NEGATIVE', 'score': 0.9996669292449951}]


To iterate over full datasets, it is recommended to use a :obj:`dataset` directly. This means you don't need to
allocate the whole dataset at once, nor do you need to do batching yourself. This should work just as fast as custom
loops on GPU. If it doesn't, don't hesitate to open an issue.

.. code-block::

    import datasets
    from transformers import pipeline
    from transformers.pipelines.base import KeyDataset
    import tqdm

    pipe = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h", device=0)
    dataset = datasets.load_dataset("superb", name="asr", split="test")

    # KeyDataset (only `pt`) will simply return the item in the dict returned by the dataset item
    # as we're not interested in the `target` part of the dataset.
    for out in tqdm.tqdm(pipe(KeyDataset(dataset, "file"))):
        print(out)
        # {"text": "NUMBER TEN FRESH NELLY IS WAITING ON YOU GOOD NIGHT HUSBAND"}
        # {"text": ....}
        # ....
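
For intuition, the wrapper used above can be approximated in a few lines. This is a minimal sketch, not the library's
implementation: :obj:`KeyOnly` is a hypothetical stand-in for :obj:`KeyDataset`, and the in-memory :obj:`rows` list
stands in for a real dataset.

```python
class KeyOnly:
    """Hypothetical stand-in for KeyDataset: exposes a single field of each dict item."""

    def __init__(self, dataset, key):
        self.dataset = dataset
        self.key = key

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, i):
        # Look up the item lazily and keep only the requested field.
        return self.dataset[i][self.key]


# Toy data standing in for a real dataset (an assumption for illustration).
rows = [{"file": "a.wav", "target": "A"}, {"file": "b.wav", "target": "B"}]
files = KeyOnly(rows, "file")
print(len(files), files[0], files[1])  # 2 a.wav b.wav
```

Because only ``__len__`` and ``__getitem__`` are defined, items are read one at a time and nothing is copied up front.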


.. autofunction:: transformers.pipeline

Pipeline batching
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All pipelines (except `zero-shot-classification` and `question-answering` currently) can use batching. This will work
whenever the pipeline uses its streaming ability (so when passing lists or :obj:`Dataset`).

.. code-block::

    from transformers import pipeline
    from transformers.pipelines.base import KeyDataset
    import datasets

    dataset = datasets.load_dataset("imdb", name="plain_text", split="unsupervised")
    pipe = pipeline("text-classification", device=0)
    for out in pipe(KeyDataset(dataset, "text"), batch_size=8, truncation="only_first"):
        print(out)
        # [{'label': 'POSITIVE', 'score': 0.9998743534088135}]
        # Exactly the same output as before, but the contents are passed
        # as batches to the model
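
Conceptually, streaming with :obj:`batch_size` groups consecutive items before each forward pass. The grouping can be
sketched as follows; this assumes nothing about the library's internals, and :obj:`batched` is a hypothetical helper
written for illustration only.

```python
from itertools import islice


def batched(iterable, batch_size):
    """Yield lists of up to `batch_size` consecutive items from any iterable."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# The last batch is simply shorter when the stream does not divide evenly.
print(list(batched(["a", "b", "c", "d", "e"], 2)))  # [['a', 'b'], ['c', 'd'], ['e']]
```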


.. warning::

    However, batching is not automatically a win for performance. It can be either a 10x speedup or a 5x slowdown,
    depending on hardware, data and the actual model being used.

    Example where it's mostly a speedup:


.. code-block::

    from transformers import pipeline                                                   
    from torch.utils.data import Dataset                                                
    import tqdm                                                                         


    pipe = pipeline("text-classification", device=0)                                    


    class MyDataset(Dataset):                                                           
        def __len__(self):                                                              
            return 5000                                                                 

        def __getitem__(self, i):                                                       
            return "This is a test"                                                     


    dataset = MyDataset()   

    for batch_size in [1, 8, 64, 256]:
        print("-" * 30)                                                                     
        print(f"Streaming batch_size={batch_size}")    
        for out in tqdm.tqdm(pipe(dataset, batch_size=batch_size), total=len(dataset)):              
            pass


.. code-block::

    # On GTX 970
    ------------------------------
    Streaming no batching
    100%|██████████████████████████████████████████████████████████████████████| 5000/5000 [00:26<00:00, 187.52it/s]
    ------------------------------
    Streaming batch_size=8
    100%|██████████████████████████████████████████████████████████████████████| 5000/5000 [00:04<00:00, 1205.95it/s]
    ------------------------------
    Streaming batch_size=64
    100%|██████████████████████████████████████████████████████████████████████| 5000/5000 [00:02<00:00, 2478.24it/s]
    ------------------------------
    Streaming batch_size=256
    100%|██████████████████████████████████████████████████████████████████████| 5000/5000 [00:01<00:00, 2554.43it/s]
    (diminishing returns, saturated the GPU)


Example where it's mostly a slowdown:

.. code-block::

    class MyDataset(Dataset):                                                           
        def __len__(self):                                                              
            return 5000                                                                 

        def __getitem__(self, i):                                                       
            if i % 64 == 0:                                                          
                n = 100                                                              
            else:                                                                    
                n = 1                                                                
            return "This is a test" * n

This dataset contains an occasional very long sentence compared to the others. In that case, the **whole** batch will
need to be 400 tokens long, so the whole batch will be [64, 400] instead of [64, 4], leading to the high slowdown.
Even worse, on bigger batches, the program simply crashes.
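
The cost of padding can be estimated with a little arithmetic. The sketch below is not a benchmark. It assumes, as in
the text above, 4 tokens for a short item and 400 for a long one, and counts how many token positions the model has to
process when every batch is padded to its longest item. :obj:`padded_tokens` is a hypothetical helper.

```python
def padded_tokens(lengths, batch_size):
    """Total token positions processed when each batch is padded to its longest item."""
    total = 0
    for start in range(0, len(lengths), batch_size):
        batch = lengths[start:start + batch_size]
        total += max(batch) * len(batch)
    return total


# Assumed token counts mirroring the dataset above: every 64th item is long.
lengths = [400 if i % 64 == 0 else 4 for i in range(5000)]

for batch_size in [1, 8, 64]:
    print(batch_size, padded_tokens(lengths, batch_size))
# 1 51284
# 8 270272
# 64 2000000
```

With :obj:`batch_size=64`, every batch contains exactly one long item, so all 5000 items are padded to 400 tokens and
the model processes roughly 39 times more positions than without batching.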


.. code-block::

    ------------------------------
    Streaming no batching
    100%|██████████████████████████████████████████████████████████████████████| 1000/1000 [00:05<00:00, 183.69it/s]
    ------------------------------
    Streaming batch_size=8
    100%|██████████████████████████████████████████████████████████████████████| 1000/1000 [00:03<00:00, 265.74it/s]
    ------------------------------
    Streaming batch_size=64
    100%|██████████████████████████████████████████████████████████████████████| 1000/1000 [00:26<00:00, 37.80it/s]
    ------------------------------
    Streaming batch_size=256
      0%|                                                                                 | 0/1000 [00:00<?, ?it/s]
    Traceback (most recent call last):
      File "/home/nicolas/src/transformers/test.py", line 42, in <module>
        for out in tqdm.tqdm(pipe(dataset, batch_size=256), total=len(dataset)):
    ....
        q = q / math.sqrt(dim_per_head)  # (bs, n_heads, q_length, dim_per_head)
    RuntimeError: CUDA out of memory. Tried to allocate 376.00 MiB (GPU 0; 3.95 GiB total capacity; 1.72 GiB already allocated; 354.88 MiB free; 2.46 GiB reserved in total by PyTorch)


There are no good (general) solutions for this problem, and your mileage may vary depending on your use case. For
users, a rule of thumb is:

- **Measure performance on your load, with your hardware. Measure, measure, and keep measuring. Real numbers are the
  only way to go.**
- If you are latency constrained (live product doing inference), don't batch.
- If you are using CPU, don't batch.
- If you are optimizing for throughput (you want to run your model on a bunch of static data), on GPU, then:

      - If you have no clue about the size of the sequence_length ("natural" data), by default don't batch. Measure,
        then tentatively add batching, and add OOM checks to recover when it fails (and it will at some point if you
        don't control the sequence_length).
      - If your sequence_length is super regular, then batching is more likely to be VERY interesting. Measure and
        push it until you get OOMs.
      - The larger the GPU, the more likely batching is going to be interesting.

- As soon as you enable batching, make sure you can handle OOMs nicely.
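
Handling OOMs nicely can be sketched as a retry loop that halves the batch size after a failure. This is a minimal
sketch under stated assumptions, not a library feature: :obj:`run_with_oom_fallback` and :obj:`run_batch` are
hypothetical names, and the check relies on the error being a :obj:`RuntimeError` whose message contains "out of
memory", as in the traceback shown earlier. A fake model stands in for the pipeline so the example runs anywhere.

```python
def run_with_oom_fallback(run_batch, items, batch_size, min_batch_size=1):
    """Process `items` in chunks, halving the batch size after an out-of-memory error."""
    results = []
    start = 0
    while start < len(items):
        chunk = items[start:start + batch_size]
        try:
            results.extend(run_batch(chunk))
        except RuntimeError as err:
            # Assumption: OOM surfaces as a RuntimeError mentioning "out of memory".
            if "out of memory" not in str(err) or batch_size <= min_batch_size:
                raise
            batch_size = max(min_batch_size, batch_size // 2)
            continue  # retry from the same position with a smaller batch
        start += len(chunk)
    return results


def fake_model(chunk):
    # Stand-in for a model call: pretend more than 2 items exhausts memory.
    if len(chunk) > 2:
        raise RuntimeError("CUDA out of memory.")
    return [len(text) for text in chunk]


print(run_with_oom_fallback(fake_model, ["a", "bb", "ccc", "dddd", "eeeee"], batch_size=4))
# [1, 2, 3, 4, 5]
```

With a real pipeline, :obj:`run_batch` could be something like ``lambda chunk: pipe(chunk, batch_size=len(chunk))``.
That call shape is an assumption based on the list-calling examples above, so measure it on your own workload.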



Implementing a pipeline
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:doc:`Implementing a new pipeline <../add_new_pipeline>`

The task specific pipelines
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

AudioClassificationPipeline
=======================================================================================================================

.. autoclass:: transformers.AudioClassificationPipeline
    :special-members: __call__
    :members:

AutomaticSpeechRecognitionPipeline
=======================================================================================================================

.. autoclass:: transformers.AutomaticSpeechRecognitionPipeline
    :special-members: __call__
    :members:

ConversationalPipeline
=======================================================================================================================

.. autoclass:: transformers.Conversation

.. autoclass:: transformers.ConversationalPipeline
    :special-members: __call__
    :members:

FeatureExtractionPipeline
=======================================================================================================================

.. autoclass:: transformers.FeatureExtractionPipeline
    :special-members: __call__
    :members:

FillMaskPipeline
=======================================================================================================================

.. autoclass:: transformers.FillMaskPipeline
    :special-members: __call__
    :members:

ImageClassificationPipeline
=======================================================================================================================

.. autoclass:: transformers.ImageClassificationPipeline
    :special-members: __call__
    :members:

ImageSegmentationPipeline
=======================================================================================================================

.. autoclass:: transformers.ImageSegmentationPipeline
    :special-members: __call__
    :members:

NerPipeline
=======================================================================================================================

.. autoclass:: transformers.NerPipeline

See :class:`~transformers.TokenClassificationPipeline` for all details.

ObjectDetectionPipeline
=======================================================================================================================

.. autoclass:: transformers.ObjectDetectionPipeline
    :special-members: __call__
    :members:

QuestionAnsweringPipeline
=======================================================================================================================

.. autoclass:: transformers.QuestionAnsweringPipeline
    :special-members: __call__
    :members:

SummarizationPipeline
=======================================================================================================================

.. autoclass:: transformers.SummarizationPipeline
    :special-members: __call__
    :members:

TableQuestionAnsweringPipeline
=======================================================================================================================

.. autoclass:: transformers.TableQuestionAnsweringPipeline
    :special-members: __call__


TextClassificationPipeline
=======================================================================================================================

.. autoclass:: transformers.TextClassificationPipeline
    :special-members: __call__
    :members:

TextGenerationPipeline
=======================================================================================================================

.. autoclass:: transformers.TextGenerationPipeline
    :special-members: __call__
    :members:

Text2TextGenerationPipeline
=======================================================================================================================

.. autoclass:: transformers.Text2TextGenerationPipeline
    :special-members: __call__
    :members:

TokenClassificationPipeline
=======================================================================================================================

.. autoclass:: transformers.TokenClassificationPipeline
    :special-members: __call__
    :members:

TranslationPipeline
=======================================================================================================================

.. autoclass:: transformers.TranslationPipeline
    :special-members: __call__
    :members:

ZeroShotClassificationPipeline
=======================================================================================================================

.. autoclass:: transformers.ZeroShotClassificationPipeline
    :special-members: __call__
    :members:

Parent class: :obj:`Pipeline`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.Pipeline
    :members: