.. _transforms:

Transforming and augmenting images
==================================

.. currentmodule:: torchvision.transforms

Transforms are common image transformations available in the
``torchvision.transforms`` module. They can be chained together using
:class:`Compose`.
Most transform classes have a function equivalent: :ref:`functional
transforms <functional_transforms>` give fine-grained control over the
transformations.
This is useful if you have to build a more complex transformation pipeline
(e.g. in the case of segmentation tasks).
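
As a minimal sketch of the chaining described above, the following composes
two transforms and applies them to a random tensor image. The crop size and
the input shape are arbitrary values chosen for illustration.

.. code:: python

    import torch
    from torchvision import transforms

    # Chain a crop and a dtype conversion into a single callable.
    pipeline = transforms.Compose([
        transforms.CenterCrop(24),
        transforms.ConvertImageDtype(torch.float32),
    ])

    # A random uint8 tensor image of shape (C, H, W), used as a stand-in input.
    img = torch.randint(0, 256, (3, 32, 32), dtype=torch.uint8)
    out = pipeline(img)
    print(out.shape, out.dtype)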

Most transformations accept both `PIL <https://pillow.readthedocs.io>`_
images and tensor images, although some transformations are :ref:`PIL-only
<transforms_pil_only>` and some are :ref:`tensor-only
<transforms_tensor_only>`. The :ref:`conversion_transforms` may be used to
convert to and from PIL images.
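
A short sketch of such a conversion, assuming a ``uint8`` tensor image as the
starting point (the shape is an arbitrary illustration value):

.. code:: python

    import torch
    from torchvision import transforms

    # A uint8 tensor image of shape (C, H, W) = (3, 8, 12).
    tensor_img = torch.randint(0, 256, (3, 8, 12), dtype=torch.uint8)

    # Tensor -> PIL image. Note that PIL reports size as (width, height).
    pil_img = transforms.ToPILImage()(tensor_img)

    # PIL image -> uint8 tensor, without rescaling the values.
    round_trip = transforms.PILToTensor()(pil_img)
    print(pil_img.size, round_trip.shape)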

The transformations that accept tensor images also accept batches of tensor
images. A tensor image is a tensor of shape ``(C, H, W)``, where ``C`` is the
number of channels and ``H`` and ``W`` are the image height and width. A batch
of tensor images is a tensor of shape ``(B, C, H, W)``, where ``B`` is the
number of images in the batch.
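
To illustrate the two shapes, the sketch below applies the same functional
crop to a single tensor image and to a batch. The sizes are arbitrary.

.. code:: python

    import torch
    import torchvision.transforms.functional as TF

    image = torch.rand(3, 32, 48)      # (C, H, W)
    batch = torch.rand(8, 3, 32, 48)   # (B, C, H, W)

    # The crop acts on the last two dimensions, so leading dimensions are kept.
    cropped_image = TF.center_crop(image, [16, 16])
    cropped_batch = TF.center_crop(batch, [16, 16])
    print(cropped_image.shape, cropped_batch.shape)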

The expected range of the values of a tensor image is implicitly defined by
the tensor dtype. Tensor images with a float dtype are expected to have
values in ``[0, 1]``. Tensor images with an integer dtype are expected to
have values in ``[0, MAX_DTYPE]``, where ``MAX_DTYPE`` is the largest value
that can be represented in that dtype.
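
The following sketch shows how the dtype determines the value range. It uses
``convert_image_dtype`` from the functional API, which rescales values when
converting between integer and float dtypes. The pixel values are illustrative.

.. code:: python

    import torch
    import torchvision.transforms.functional as TF

    # One-channel uint8 image holding the smallest, a middle, and the largest value.
    img_uint8 = torch.tensor([[[0, 128, 255]]], dtype=torch.uint8)

    # uint8 -> float32: values are rescaled so that 255 maps to 1.0.
    img_float = TF.convert_image_dtype(img_uint8, torch.float32)

    # float32 -> uint8: values are rescaled back to the integer range.
    img_back = TF.convert_image_dtype(img_float, torch.uint8)
    print(img_float, img_back)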

Randomized transformations will apply the same transformation to all the
images of a given batch, but they will produce different transformations
across calls. For reproducible transformations across calls, you may use
:ref:`functional transforms <functional_transforms>`.
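
A small sketch of the batch behavior: a batch made of identical copies of one
image is passed through ``RandomCrop``. Because one set of crop parameters is
sampled per call, every image in the batch receives the same crop. The shapes
and the seed are arbitrary.

.. code:: python

    import torch
    from torchvision import transforms

    torch.manual_seed(0)

    # Build a batch of four identical images so the crops can be compared.
    image = torch.arange(3 * 8 * 8, dtype=torch.float32).reshape(3, 8, 8)
    batch = image.unsqueeze(0).repeat(4, 1, 1, 1)   # (B, C, H, W)

    crop = transforms.RandomCrop(4)
    out = crop(batch)

    # The same crop window was applied to every image in the batch.
    same_within_batch = all(torch.equal(out[0], out[i]) for i in range(4))
    print(out.shape, same_within_batch)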

The following examples illustrate the use of the available transforms:

    * :ref:`sphx_glr_auto_examples_plot_transforms.py`

        .. figure:: ../source/auto_examples/images/sphx_glr_plot_transforms_001.png
            :align: center
            :scale: 65%

    * :ref:`sphx_glr_auto_examples_plot_scripted_tensor_transforms.py`

        .. figure:: ../source/auto_examples/images/sphx_glr_plot_scripted_tensor_transforms_001.png
            :align: center
            :scale: 30%

.. warning::

    Since v0.8.0, all random transformations use the default torch random generator to sample random parameters.
    This is a backward-compatibility-breaking change, and users should set the random state as follows:

    .. code:: python

        # Previous versions
        # import random
        # random.seed(12)

        # Now
        import torch
        torch.manual_seed(17)

    Keep in mind that using the same seed for the torch random generator and the Python random generator will not
    produce the same results.
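
As a sketch of the seeding behavior described in the warning above, re-seeding
the torch generator before sampling yields the same random parameters. The
example uses the ``RandomCrop.get_params`` static method. The image shape and
crop size are arbitrary.

.. code:: python

    import torch
    from torchvision import transforms

    img = torch.zeros(3, 32, 32)

    # Sample crop parameters (top, left, height, width) after seeding.
    torch.manual_seed(17)
    first = transforms.RandomCrop.get_params(img, output_size=(8, 8))

    # Re-seeding with the same value reproduces the same parameters.
    torch.manual_seed(17)
    second = transforms.RandomCrop.get_params(img, output_size=(8, 8))
    print(first == second)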


Scriptable transforms
---------------------

In order to script the transformations, please use ``torch.nn.Sequential`` instead of :class:`Compose`.

.. code:: python

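    import torch
    from torchvision import transforms

    # Use a distinct name so the ``transforms`` module is not shadowed.
    pipeline = torch.nn.Sequential(
        transforms.CenterCrop(10),
        transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
    )
    scripted_transforms = torch.jit.script(pipeline)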

Make sure to use only scriptable transformations, i.e. transformations that work with ``torch.Tensor`` and do not
require ``lambda`` functions or ``PIL.Image``.

Any custom transformation to be used with ``torch.jit.script`` should derive from ``torch.nn.Module``.
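
A minimal sketch of such a custom transformation follows. ``AddOffset`` is a
hypothetical module written for this illustration and is not part of
torchvision. It is combined with ``CenterCrop`` in a ``torch.nn.Sequential``
and then scripted.

.. code:: python

    import torch
    from torchvision import transforms

    class AddOffset(torch.nn.Module):
        """Hypothetical transform: brighten a float image by a fixed offset."""

        def __init__(self, offset: float):
            super().__init__()
            self.offset = offset

        def forward(self, img: torch.Tensor) -> torch.Tensor:
            # Keep the result inside the expected float range.
            return (img + self.offset).clamp(0.0, 1.0)

    pipeline = torch.nn.Sequential(
        transforms.CenterCrop(10),
        AddOffset(0.1),
    )
    scripted = torch.jit.script(pipeline)
    out = scripted(torch.rand(3, 16, 16))
    print(out.shape)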


Compositions of transforms
--------------------------

.. autosummary::
    :toctree: generated/
    :template: class.rst

    Compose


Transforms on PIL Image and torch.\*Tensor
------------------------------------------

.. autosummary::
    :toctree: generated/
    :template: class.rst

    CenterCrop
    ColorJitter
    FiveCrop
    Grayscale
    Pad
    RandomAffine
    RandomApply
    RandomCrop
    RandomGrayscale
    RandomHorizontalFlip
    RandomPerspective
    RandomResizedCrop
    RandomRotation
    RandomVerticalFlip
    Resize
    TenCrop
    GaussianBlur
    RandomInvert
    RandomPosterize
    RandomSolarize
    RandomAdjustSharpness
    RandomAutocontrast
    RandomEqualize


.. _transforms_pil_only:

Transforms on PIL Image only
----------------------------

.. autosummary::
    :toctree: generated/
    :template: class.rst

    RandomChoice
    RandomOrder

.. _transforms_tensor_only:

Transforms on torch.\*Tensor only
---------------------------------

.. autosummary::
    :toctree: generated/
    :template: class.rst

    LinearTransformation
    Normalize
    RandomErasing
    ConvertImageDtype

.. _conversion_transforms:

Conversion Transforms
---------------------

.. autosummary::
    :toctree: generated/
    :template: class.rst

    ToPILImage
    ToTensor
    PILToTensor


Generic Transforms
------------------

.. autosummary::
    :toctree: generated/
    :template: class.rst

    Lambda


Automatic Augmentation Transforms
---------------------------------

`AutoAugment <https://arxiv.org/pdf/1805.09501.pdf>`_ is a common data augmentation technique that can improve the accuracy of image classification models.
Though the data augmentation policies are directly linked to the dataset they were trained on, empirical studies show that
ImageNet policies provide significant improvements when applied to other datasets.
TorchVision implements 3 policies learned on the following datasets: ImageNet, CIFAR10 and SVHN.
The new transform can be used standalone or mixed-and-matched with existing transforms:

.. autosummary::
    :toctree: generated/
    :template: class.rst

    AutoAugmentPolicy
    AutoAugment
    RandAugment
    TrivialAugmentWide
    AugMix
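
As a rough sketch of mixing ``AutoAugment`` with existing transforms, the
pipeline below applies a random flip, the ImageNet policy, and a dtype
conversion to a ``uint8`` tensor image. The input is random data and the
choice of surrounding transforms is only an illustration.

.. code:: python

    import torch
    from torchvision import transforms
    from torchvision.transforms import AutoAugment, AutoAugmentPolicy

    augment = transforms.Compose([
        transforms.RandomHorizontalFlip(),
        AutoAugment(policy=AutoAugmentPolicy.IMAGENET),
        transforms.ConvertImageDtype(torch.float32),
    ])

    # Random uint8 tensor image standing in for real data.
    img = torch.randint(0, 256, (3, 64, 64), dtype=torch.uint8)
    out = augment(img)
    print(out.shape, out.dtype)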

.. _functional_transforms:

Functional Transforms
---------------------

.. currentmodule:: torchvision.transforms.functional

Functional transforms give you fine-grained control of the transformation pipeline.
As opposed to the transformations above, functional transforms don't contain a random number
generator for their parameters.
That means you have to specify/generate all parameters, but the functional transform will give you
reproducible results across calls.

Example:
you can apply a functional transform with the same parameters to multiple images like this:

.. code:: python

    import torchvision.transforms.functional as TF
    import random

    def my_segmentation_transforms(image, segmentation):
        if random.random() > 0.5:
            angle = random.randint(-30, 30)
            image = TF.rotate(image, angle)
            segmentation = TF.rotate(segmentation, angle)
        # more transforms ...
        return image, segmentation
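
The same pattern can be sketched for cropping: sample the crop window once
with the ``RandomCrop.get_params`` static method, then apply it to both the
image and its segmentation mask with the functional ``crop``. The tensors
below are random placeholders.

.. code:: python

    import torch
    import torchvision.transforms.functional as TF
    from torchvision import transforms

    image = torch.rand(3, 64, 64)
    mask = torch.randint(0, 2, (1, 64, 64))

    # Sample (top, left, height, width) once ...
    i, j, h, w = transforms.RandomCrop.get_params(image, output_size=(32, 32))

    # ... and reuse the same window for both inputs so they stay aligned.
    image_crop = TF.crop(image, i, j, h, w)
    mask_crop = TF.crop(mask, i, j, h, w)
    print(image_crop.shape, mask_crop.shape)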


Example:
you can use a functional transform to build transform classes with custom behavior:

.. code:: python

    import torchvision.transforms.functional as TF
    import random

    class MyRotationTransform:
        """Rotate by one of the given angles."""

        def __init__(self, angles):
            self.angles = angles

        def __call__(self, x):
            angle = random.choice(self.angles)
            return TF.rotate(x, angle)

    rotation_transform = MyRotationTransform(angles=[-30, -15, 0, 15, 30])


.. autosummary::
    :toctree: generated/
    :template: function.rst

    adjust_brightness
    adjust_contrast
    adjust_gamma
    adjust_hue
    adjust_saturation
    adjust_sharpness
    affine
    autocontrast
    center_crop
    convert_image_dtype
    crop
    equalize
    erase
    five_crop
    gaussian_blur
    get_dimensions
    get_image_num_channels
    get_image_size
    hflip
    invert
    normalize
    pad
    perspective
    pil_to_tensor
    posterize
    resize
    resized_crop
    rgb_to_grayscale
    rotate
    solarize
    ten_crop
    to_grayscale
    to_pil_image
    to_tensor
    vflip