.. _transforms:

torchvision.transforms
======================

.. currentmodule:: torchvision.transforms

Transforms are common image transformations. They can be chained together using :class:`Compose`.
Most transform classes have a function equivalent: :ref:`functional
transforms <functional_transforms>` give fine-grained control over the
transformations.
This is useful if you have to build a more complex transformation pipeline
(e.g. in the case of segmentation tasks).
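
As a minimal sketch of such chaining (the input tensor and the sizes used here are illustrative assumptions, not part of the original text), two transforms can be composed and applied to a tensor image:

.. code:: python

    import torch
    from torchvision import transforms

    # Illustrative uint8 tensor image of shape (C, H, W) = (3, 32, 32).
    img = torch.randint(0, 256, (3, 32, 32), dtype=torch.uint8)

    # Compose applies the transforms in order: crop first, then convert the dtype.
    pipeline = transforms.Compose([
        transforms.CenterCrop(10),
        transforms.ConvertImageDtype(torch.float32),
    ])

    out = pipeline(img)
    print(out.shape, out.dtype)

The same two steps could also be written as separate calls; ``Compose`` only saves the bookkeeping.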

Most transformations accept both `PIL <https://pillow.readthedocs.io>`_
images and tensor images, although some transformations are :ref:`PIL-only
<transforms_pil_only>` and some are :ref:`tensor-only
<transforms_tensor_only>`. The :ref:`conversion_transforms` may be used to
convert to and from PIL images.

The transformations that accept tensor images also accept batches of tensor
images. A tensor image is a tensor of shape ``(C, H, W)``, where ``C`` is the
number of channels and ``H`` and ``W`` are the image height and width. A batch
of tensor images is a tensor of shape ``(B, C, H, W)``, where ``B`` is the
number of images in the batch.
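
The shape conventions above can be illustrated with a short sketch. The random inputs and the crop size are assumptions chosen only for the example:

.. code:: python

    import torch
    from torchvision import transforms

    crop = transforms.CenterCrop(4)

    image = torch.rand(3, 8, 8)      # single tensor image: (C, H, W)
    batch = torch.rand(5, 3, 8, 8)   # batch of tensor images: (B, C, H, W)

    # The same transform object handles both layouts; only the last two
    # dimensions (H, W) are cropped.
    cropped_image = crop(image)
    cropped_batch = crop(batch)
    print(cropped_image.shape, cropped_batch.shape)

Leading dimensions are left untouched, so the batch keeps its size ``B``.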

The expected range of the values of a tensor image is implicitly defined by
the tensor dtype. Tensor images with a float dtype are expected to have
values in ``[0, 1)``. Tensor images with an integer dtype are expected to
have values in ``[0, MAX_DTYPE]`` where ``MAX_DTYPE`` is the largest value
that can be represented in that dtype.
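
A small sketch of how the dtype determines the value range, assuming a tiny hand-written ``uint8`` image and using :class:`ConvertImageDtype` to move between the two conventions:

.. code:: python

    import torch
    from torchvision import transforms

    # For uint8, MAX_DTYPE is 255.
    max_uint8 = torch.iinfo(torch.uint8).max

    # Illustrative 1-channel, 1x3 image covering the integer range.
    uint8_img = torch.tensor([[[0, 128, max_uint8]]], dtype=torch.uint8)

    # Converting to a float dtype rescales the values by MAX_DTYPE.
    to_float = transforms.ConvertImageDtype(torch.float32)
    float_img = to_float(uint8_img)
    print(float_img.min().item(), float_img.max().item())

Transforms that depend on the value range interpret pixel values according to these dtype conventions.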

Randomized transformations will apply the same transformation to all the
images of a given batch, but they will produce different transformations
across calls. For reproducible transformations across calls, you may use
:ref:`functional transforms <functional_transforms>`.
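
The batch behaviour described above can be checked with a sketch like the following. The batch is built from identical copies of one illustrative image (an assumption made so the crops are easy to compare):

.. code:: python

    import torch
    from torchvision import transforms

    # One illustrative image, repeated to form a batch of identical images.
    image = torch.arange(64, dtype=torch.uint8).reshape(1, 8, 8)
    batch = image.repeat(4, 1, 1, 1)   # shape (B, C, H, W) = (4, 1, 8, 8)

    random_crop = transforms.RandomCrop(4)
    out = random_crop(batch)

    # The crop location is sampled once per call, so every image in the
    # batch receives the same crop.
    same_crop = all(torch.equal(out[0], out[i]) for i in range(out.shape[0]))
    print(out.shape, same_crop)

A second call to ``random_crop(batch)`` may sample a different location.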

The following examples illustrate the use of the available transforms:

    * :ref:`sphx_glr_auto_examples_plot_transforms.py`

        .. figure:: ../source/auto_examples/images/sphx_glr_plot_transforms_001.png
            :align: center
            :scale: 65%

    * :ref:`sphx_glr_auto_examples_plot_scripted_tensor_transforms.py`

        .. figure:: ../source/auto_examples/images/sphx_glr_plot_scripted_tensor_transforms_001.png
            :align: center
            :scale: 30%

.. warning::

    Since v0.8.0, all random transformations use the torch default random generator to sample random parameters.
    This is a backward-compatibility-breaking change, and users should set the random state as follows:

    .. code:: python

        # Previous versions
        # import random
        # random.seed(12)

        # Now
        import torch
        torch.manual_seed(17)

    Keep in mind that seeding the torch random generator and the Python random generator with the same value
    will not produce the same results.
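
    As a minimal sketch of the seeding described above (the input tensor and crop size are illustrative
    assumptions), re-seeding the torch generator before each call makes a random transform repeat its result:

    .. code:: python

        import torch
        from torchvision import transforms

        # Illustrative uint8 tensor image of shape (3, 8, 8).
        img = torch.arange(3 * 8 * 8, dtype=torch.uint8).reshape(3, 8, 8)
        crop = transforms.RandomCrop(4)

        torch.manual_seed(17)
        first = crop(img)

        # Resetting the seed replays the same random parameters.
        torch.manual_seed(17)
        second = crop(img)

        print(torch.equal(first, second))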


Scriptable transforms
---------------------

In order to script the transformations, please use ``torch.nn.Sequential`` instead of :class:`Compose`.

.. code:: python

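    import torch
    from torchvision import transforms

    # Use a separate name so the ``transforms`` module is not shadowed.
    pipeline = torch.nn.Sequential(
        transforms.CenterCrop(10),
        transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
    )
    scripted_transforms = torch.jit.script(pipeline)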

Make sure to use only scriptable transformations, i.e. ones that work with ``torch.Tensor`` and do not require
``lambda`` functions or ``PIL.Image``.

Custom transformations that are to be used with ``torch.jit.script`` should be derived from ``torch.nn.Module``.
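
The following is a minimal sketch of such a custom transformation. ``AddConstant`` is a hypothetical class invented for this example and is not part of torchvision. It only illustrates the pattern of subclassing ``torch.nn.Module`` and implementing ``forward`` with type annotations:

.. code:: python

    import torch
    from torch import Tensor, nn


    class AddConstant(nn.Module):
        """Hypothetical custom transform: adds a fixed value to a tensor image."""

        def __init__(self, value: float):
            super().__init__()
            self.value = value

        def forward(self, img: Tensor) -> Tensor:
            # Only tensor operations are used, so the module can be scripted.
            return img + self.value


    scripted = torch.jit.script(AddConstant(0.5))
    out = scripted(torch.zeros(3, 4, 4))
    print(out.shape)

Like the built-in transforms, a module written this way can also be placed inside ``torch.nn.Sequential``.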


Compositions of transforms
--------------------------

.. autoclass:: Compose


Transforms on PIL Image and torch.\*Tensor
------------------------------------------

.. autoclass:: CenterCrop
    :members:

.. autoclass:: ColorJitter
    :members:

.. autoclass:: FiveCrop
    :members:

.. autoclass:: Grayscale
    :members:

.. autoclass:: Pad
    :members:

.. autoclass:: RandomAffine
    :members:

.. autoclass:: RandomApply

.. autoclass:: RandomCrop
    :members:

.. autoclass:: RandomGrayscale
    :members:

.. autoclass:: RandomHorizontalFlip
    :members:

.. autoclass:: RandomPerspective
    :members:

.. autoclass:: RandomResizedCrop
    :members:

.. autoclass:: RandomRotation
    :members:

.. autoclass:: RandomSizedCrop
    :members:

.. autoclass:: RandomVerticalFlip
    :members:

.. autoclass:: Resize
    :members:

.. autoclass:: Scale
    :members:

.. autoclass:: TenCrop
    :members:

.. autoclass:: GaussianBlur
    :members:

.. autoclass:: RandomInvert
    :members:

.. autoclass:: RandomPosterize
    :members:

.. autoclass:: RandomSolarize
    :members:

.. autoclass:: RandomAdjustSharpness
    :members:

.. autoclass:: RandomAutocontrast
    :members:

.. autoclass:: RandomEqualize
    :members:

.. _transforms_pil_only:

Transforms on PIL Image only
----------------------------

.. autoclass:: RandomChoice

.. autoclass:: RandomOrder

.. _transforms_tensor_only:

Transforms on torch.\*Tensor only
---------------------------------

.. autoclass:: LinearTransformation
    :members:

.. autoclass:: Normalize
    :members:

.. autoclass:: RandomErasing
    :members:

.. autoclass:: ConvertImageDtype

.. _conversion_transforms:

Conversion Transforms
---------------------

.. autoclass:: ToPILImage
    :members:

.. autoclass:: ToTensor
    :members:

.. autoclass:: PILToTensor
    :members:


Generic Transforms
------------------

.. autoclass:: Lambda
    :members:


Automatic Augmentation Transforms
---------------------------------

`AutoAugment <https://arxiv.org/pdf/1805.09501.pdf>`_ is a common Data Augmentation technique that can improve the accuracy of Image Classification models.
Though the data augmentation policies are directly linked to their trained dataset, empirical studies show that
ImageNet policies provide significant improvements when applied to other datasets.
In TorchVision we implemented 3 policies learned on the following datasets: ImageNet, CIFAR10 and SVHN.
The new transform can be used standalone or mixed-and-matched with existing transforms:
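
A hedged sketch of the mixed use, assuming an illustrative random ``uint8`` tensor image as input and the ImageNet policy (both chosen only for this example):

.. code:: python

    import torch
    from torchvision import transforms

    # Illustrative uint8 tensor image of shape (3, 32, 32).
    img = torch.randint(0, 256, (3, 32, 32), dtype=torch.uint8)

    # AutoAugment is combined with an ordinary transform through Compose.
    augment = transforms.Compose([
        transforms.AutoAugment(policy=transforms.AutoAugmentPolicy.IMAGENET),
        transforms.ConvertImageDtype(torch.float32),
    ])

    out = augment(img)
    print(out.shape, out.dtype)

The augmentation is sampled at random on every call, so repeated calls generally produce different images.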

.. autoclass:: AutoAugmentPolicy
    :members:

.. autoclass:: AutoAugment
    :members:

`RandAugment <https://arxiv.org/abs/1909.13719>`_ is a simple high-performing Data Augmentation technique which improves the accuracy of Image Classification models.

.. autoclass:: RandAugment
    :members:

`TrivialAugmentWide <https://arxiv.org/abs/2103.10158>`_ is a dataset-independent data-augmentation technique which improves the accuracy of Image Classification models.

.. autoclass:: TrivialAugmentWide
    :members:

.. _functional_transforms:

Functional Transforms
---------------------

Functional transforms give you fine-grained control of the transformation pipeline.
As opposed to the transformations above, functional transforms don't contain a random number
generator for their parameters.
That means you have to specify/generate all parameters, but the functional transform will give you
reproducible results across calls.
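
One way to generate the parameters yourself, sketched under the assumption of an illustrative image and mask of matching size, is to sample them once with the ``get_params`` static method of a random transform and pass them to the functional counterpart:

.. code:: python

    import torch
    import torchvision.transforms.functional as TF
    from torchvision import transforms

    # Illustrative image and segmentation mask with the same spatial size.
    image = torch.rand(3, 8, 8)
    segmentation = torch.randint(0, 2, (1, 8, 8), dtype=torch.uint8)

    # Sample the crop parameters once ...
    top, left, height, width = transforms.RandomCrop.get_params(
        image, output_size=(4, 4)
    )

    # ... then apply exactly the same crop to both tensors.
    image_crop = TF.crop(image, top, left, height, width)
    segmentation_crop = TF.crop(segmentation, top, left, height, width)
    print(image_crop.shape, segmentation_crop.shape)

Because the parameters are explicit values, calling ``TF.crop`` again with them reproduces the same crop.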

Example:
you can apply a functional transform with the same parameters to multiple images like this:

.. code:: python

    import torchvision.transforms.functional as TF
    import random

    def my_segmentation_transforms(image, segmentation):
        # Draw the angle once and apply it to both inputs so they stay aligned.
        if random.random() > 0.5:
            angle = random.randint(-30, 30)
            image = TF.rotate(image, angle)
            segmentation = TF.rotate(segmentation, angle)
        # more transforms ...
        return image, segmentation


Example:
you can use a functional transform to build transform classes with custom behavior:

.. code:: python

    import torchvision.transforms.functional as TF
    import random

    class MyRotationTransform:
        """Rotate by one of the given angles."""

        def __init__(self, angles):
            self.angles = angles

        def __call__(self, x):
            angle = random.choice(self.angles)
            return TF.rotate(x, angle)

    rotation_transform = MyRotationTransform(angles=[-30, -15, 0, 15, 30])


.. automodule:: torchvision.transforms.functional
    :members: