".github/vscode:/vscode.git/clone" did not exist on "571e3e5fc75c23b45cbd9b00011af094357c5f1d"
transforms.rst 18.7 KB
Newer Older
1
2
.. _transforms:

Transforming and augmenting images
==================================

.. currentmodule:: torchvision.transforms

Torchvision supports common computer vision transformations in the
``torchvision.transforms`` and ``torchvision.transforms.v2`` modules. Transforms
can be used to transform or augment data for training or inference of different
tasks (image classification, detection, segmentation, video classification).

.. code:: python

    # Image Classification
    import torch
    from torchvision.transforms import v2

    H, W = 32, 32
    img = torch.randint(0, 256, size=(3, H, W), dtype=torch.uint8)

    transforms = v2.Compose([
        v2.RandomResizedCrop(size=(224, 224), antialias=True),
        v2.RandomHorizontalFlip(p=0.5),
        v2.ToDtype(torch.float32, scale=True),
        v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    img = transforms(img)

.. code:: python

    # Detection (re-using imports and transforms from above)
    from torchvision import tv_tensors

    img = torch.randint(0, 256, size=(3, H, W), dtype=torch.uint8)
    boxes = torch.randint(0, H // 2, size=(3, 4))
    boxes[:, 2:] += boxes[:, :2]
    boxes = tv_tensors.BoundingBoxes(boxes, format="XYXY", canvas_size=(H, W))

    # The same transforms can be used!
    img, boxes = transforms(img, boxes)
    # And you can pass arbitrary input structures
    output_dict = transforms({"image": img, "boxes": boxes})

Transforms are typically passed as the ``transform`` or ``transforms`` argument
to the :ref:`Datasets <datasets>`.
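
For instance, a minimal sketch assuming a local ``data/`` root.
:class:`~torchvision.transforms.v2.ToImage` converts the PIL images loaded by
the dataset into tensors before the remaining transforms run:

.. code:: python

    from torchvision import datasets
    from torchvision.transforms import v2

    dataset = datasets.CIFAR10(
        root="data/",
        download=True,
        transform=v2.Compose([v2.ToImage(), v2.RandomHorizontalFlip(p=0.5)]),
    )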

Start here
----------

Whether you're new to Torchvision transforms, or you're already experienced with
them, we encourage you to start with
:ref:`sphx_glr_auto_examples_transforms_plot_transforms_getting_started.py` in
order to learn more about what can be done with the new v2 transforms.

Then, browse the sections below on this page for general information and
performance tips. The available transforms and functionals are listed in the
:ref:`API reference <v2_api_ref>`.

More information and tutorials can also be found in our :ref:`example gallery
<gallery>`, e.g. :ref:`sphx_glr_auto_examples_transforms_plot_transforms_e2e.py`
or :ref:`sphx_glr_auto_examples_transforms_plot_custom_transforms.py`.

.. _conventions:

Supported input types and conventions
-------------------------------------

Most transformations accept both `PIL <https://pillow.readthedocs.io>`_ images
and tensor inputs. Both CPU and CUDA tensors are supported.
The results of both backends (PIL or Tensors) should be very
close. In general, we recommend relying on the tensor backend :ref:`for
performance <transforms_perf>`.  The :ref:`conversion transforms
<conversion_transforms>` may be used to convert to and from PIL images, or for
converting dtypes and ranges.

Tensor images are expected to be of shape ``(C, H, W)``, where ``C`` is the
number of channels, and ``H`` and ``W`` refer to height and width. Most
transforms support batched tensor input. A batch of Tensor images is a tensor of
shape ``(N, C, H, W)``, where ``N`` is the number of images in the batch. The
:ref:`v2 <v1_or_v2>` transforms generally accept an arbitrary number of leading
dimensions ``(..., C, H, W)`` and can handle batched images or batched videos.
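
For example, a quick sketch of a batched call:

.. code:: python

    # v2 transforms accept arbitrary leading dimensions, e.g. an (N, C, H, W) batch.
    import torch
    from torchvision.transforms import v2

    batch = torch.randint(0, 256, size=(4, 3, 32, 32), dtype=torch.uint8)
    out = v2.RandomHorizontalFlip(p=1.0)(batch)
    assert out.shape == batch.shape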

.. _range_and_dtype:

Dtype and expected value range
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The expected range of the values of a tensor image is implicitly defined by
the tensor dtype. Tensor images with a float dtype are expected to have
values in ``[0, 1]``. Tensor images with an integer dtype are expected to
have values in ``[0, MAX_DTYPE]`` where ``MAX_DTYPE`` is the largest value
that can be represented in that dtype. Typically, images of dtype
``torch.uint8`` are expected to have values in ``[0, 255]``.

Use :class:`~torchvision.transforms.v2.ToDtype` to convert both the dtype and
range of the inputs.
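
For example, a minimal sketch of a ``uint8`` to ``float32`` conversion:

.. code:: python

    # uint8 values in [0, 255] -> float32 values in [0, 1]: with scale=True,
    # ToDtype rescales the values, not just the dtype.
    import torch
    from torchvision.transforms import v2

    img_u8 = torch.randint(0, 256, size=(3, 32, 32), dtype=torch.uint8)
    img_f32 = v2.ToDtype(torch.float32, scale=True)(img_u8)
    assert img_f32.dtype == torch.float32 and img_f32.max() <= 1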

.. _v1_or_v2:

V1 or V2? Which one should I use?
---------------------------------

**TL;DR** We recommend using the ``torchvision.transforms.v2`` transforms
instead of those in ``torchvision.transforms``. They're faster and they can do
more things. Just change the import and you should be good to go. Moving
forward, new features and improvements will only be considered for the v2
transforms.

In Torchvision 0.15 (March 2023), we released a new set of transforms available
in the ``torchvision.transforms.v2`` namespace. These transforms have a lot of
advantages compared to the v1 ones (in ``torchvision.transforms``):

- They can transform images **but also** bounding boxes, masks, or videos. This
  provides support for tasks beyond image classification: detection, segmentation,
  video classification, etc. See
  :ref:`sphx_glr_auto_examples_transforms_plot_transforms_getting_started.py`
  and :ref:`sphx_glr_auto_examples_transforms_plot_transforms_e2e.py`.
- They support more transforms like :class:`~torchvision.transforms.v2.CutMix`
  and :class:`~torchvision.transforms.v2.MixUp`. See
  :ref:`sphx_glr_auto_examples_transforms_plot_cutmix_mixup.py`.
- They're :ref:`faster <transforms_perf>`.
- They support arbitrary input structures (dicts, lists, tuples, etc.).
- Future improvements and features will be added to the v2 transforms only.

These transforms are **fully backward compatible** with the v1 ones, so if
you're already using transforms from ``torchvision.transforms``, all you need to
do is update the import to ``torchvision.transforms.v2``. In terms of
output, there might be negligible differences due to implementation differences.
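
In most cases, the migration boils down to a one-line import change:

.. code:: python

    # Before: from torchvision import transforms
    from torchvision.transforms import v2 as transforms  # drop-in replacement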

.. _transforms_perf:

Performance considerations
--------------------------

We recommend the following guidelines to get the best performance out of the
transforms:

- Rely on the v2 transforms from ``torchvision.transforms.v2``
- Use tensors instead of PIL images
- Use ``torch.uint8`` dtype, especially for resizing
- Resize with bilinear or bicubic mode

This is what a typical transform pipeline could look like:

.. code:: python

    from torchvision.transforms import v2
    transforms = v2.Compose([
        v2.ToImage(),  # Convert to tensor, only needed if you had a PIL image
        v2.ToDtype(torch.uint8, scale=True),  # optional, most inputs are already uint8 at this point
        # ...
        v2.RandomResizedCrop(size=(224, 224), antialias=True),  # Or Resize(antialias=True)
        # ...
        v2.ToDtype(torch.float32, scale=True),  # Normalize expects float input
        v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

The above should give you the best performance in a typical training environment
that relies on the :class:`torch.utils.data.DataLoader` with ``num_workers >
0``.

Transforms tend to be sensitive to the input strides / memory format. Some
transforms will be faster with channels-first images while others prefer
channels-last. Like ``torch`` operators, most transforms will preserve the
memory format of the input, but this may not always be respected due to
implementation details. You may want to experiment a bit if you're chasing the
very best performance. Using :func:`torch.compile` on individual transforms may
also help factor out the memory format variable (e.g. on
:class:`~torchvision.transforms.v2.Normalize`). Note that we're talking about
**memory format**, not :ref:`tensor shape <conventions>`.
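
As a sketch of such an experiment, using
:class:`~torchvision.transforms.v2.Normalize`:

.. code:: python

    import torch
    from torchvision.transforms import v2

    imgs = torch.rand(8, 3, 224, 224)  # a float batch
    imgs_cl = imgs.to(memory_format=torch.channels_last)  # same shape, different strides

    normalize = v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    compiled_normalize = torch.compile(normalize)  # may reduce memory format sensitivity
    out = compiled_normalize(imgs_cl)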

Note that resize transforms like :class:`~torchvision.transforms.v2.Resize`
and :class:`~torchvision.transforms.v2.RandomResizedCrop` typically prefer
channels-last input and tend **not** to benefit from :func:`torch.compile` at
this time.

.. _functional_transforms:

Transform classes, functionals, and kernels
-------------------------------------------

Transforms are available as classes like
:class:`~torchvision.transforms.v2.Resize`, but also as functionals like
:func:`~torchvision.transforms.v2.functional.resize` in the
``torchvision.transforms.v2.functional`` namespace.
This is very much like the :mod:`torch.nn` package which defines both classes
and functional equivalents in :mod:`torch.nn.functional`.

The functionals support PIL images, pure tensors, or :ref:`TVTensors
<tv_tensors>`, e.g. both ``resize(image_tensor)`` and ``resize(boxes)`` are
valid.
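
For example, a short sketch with a tensor image and bounding boxes:

.. code:: python

    # The same functional works on plain tensors and on TVTensors like
    # bounding boxes, which are resized consistently with their canvas.
    import torch
    from torchvision import tv_tensors
    from torchvision.transforms.v2 import functional as F

    img = torch.randint(0, 256, size=(3, 32, 32), dtype=torch.uint8)
    boxes = tv_tensors.BoundingBoxes([[0, 0, 16, 16]], format="XYXY", canvas_size=(32, 32))
    resized_img = F.resize(img, size=[64, 64])
    resized_boxes = F.resize(boxes, size=[64, 64])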

.. note::

    Random transforms like :class:`~torchvision.transforms.v2.RandomCrop` will
    randomly sample some parameter each time they're called. Their functional
    counterpart (:func:`~torchvision.transforms.v2.functional.crop`) does not do
    any kind of random sampling and thus has a slightly different
    parametrization. The ``get_params()`` class method of the transforms class
    can be used to perform parameter sampling when using the functional APIs.
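
For example, a sketch of cropping an image and a segmentation mask with the
same randomly sampled parameters:

.. code:: python

    # Sample the crop parameters once, then apply the deterministic functional
    # to both inputs so they stay aligned.
    import torch
    from torchvision.transforms import v2
    from torchvision.transforms.v2 import functional as F

    img = torch.randint(0, 256, size=(3, 64, 64), dtype=torch.uint8)
    mask = torch.randint(0, 2, size=(1, 64, 64), dtype=torch.uint8)

    top, left, height, width = v2.RandomCrop.get_params(img, output_size=(32, 32))
    img = F.crop(img, top, left, height, width)
    mask = F.crop(mask, top, left, height, width)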


The ``torchvision.transforms.v2.functional`` namespace also contains what we
call the "kernels". These are the low-level functions that implement the
core functionalities for specific types, e.g. ``resize_bounding_boxes`` or
``resized_crop_mask``. They are public, although not documented. Check the
`code
<https://github.com/pytorch/vision/blob/main/torchvision/transforms/v2/functional/__init__.py>`_
to see which ones are available (note that those starting with a leading
underscore are **not** public!). Kernels are only really useful if you want
:ref:`torchscript support <transforms_torchscript>` for types like bounding
boxes or masks.

.. _transforms_torchscript:

Torchscript support
-------------------

Most transform classes and functionals support torchscript. For composing
transforms, use :class:`torch.nn.Sequential` instead of
:class:`~torchvision.transforms.v2.Compose`:

.. code:: python

    import torch
    from torchvision.transforms import CenterCrop, Normalize

    transforms = torch.nn.Sequential(
        CenterCrop(10),
        Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
    )
    scripted_transforms = torch.jit.script(transforms)

.. warning::

    v2 transforms support torchscript, but if you call ``torch.jit.script()`` on
    a v2 **class** transform, you'll actually end up with its (scripted) v1
    equivalent.  This may lead to slightly different results between the
    scripted and eager executions due to implementation differences between v1
    and v2.

    If you really need torchscript support for the v2 transforms, we recommend
    scripting the **functionals** from the
    ``torchvision.transforms.v2.functional`` namespace to avoid surprises.


Also note that the functionals only support torchscript for pure tensors, which
are always treated as images. If you need torchscript support for other types
like bounding boxes or masks, you can rely on the :ref:`low-level kernels
<functional_transforms>`.
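
For example, a sketch of scripting a functional directly (pure-tensor inputs
only):

.. code:: python

    import torch
    from torchvision.transforms.v2 import functional as F

    scripted_resize = torch.jit.script(F.resize)

    img = torch.randint(0, 256, size=(3, 32, 32), dtype=torch.uint8)
    out = scripted_resize(img, size=[224, 224])  # the tensor is treated as an image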

For any custom transformations to be used with ``torch.jit.script``, they should
be derived from ``torch.nn.Module``.
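
A minimal sketch, using a hypothetical ``ClampPixels`` transform:

.. code:: python

    import torch
    import torch.nn as nn

    class ClampPixels(nn.Module):  # hypothetical custom transform
        def forward(self, img: torch.Tensor) -> torch.Tensor:
            # Clamp values into the valid uint8 range.
            return img.clamp(0, 255)

    scripted = torch.jit.script(ClampPixels())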

See also: :ref:`sphx_glr_auto_examples_others_plot_scripted_tensor_transforms.py`.

.. _v2_api_ref:

V2 API reference - Recommended
------------------------------

Geometry
^^^^^^^^

Resizing
""""""""

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.Resize
    v2.ScaleJitter
    v2.RandomShortestSize
    v2.RandomResize

Functionals

.. autosummary::
    :toctree: generated/
    :template: function.rst

    v2.functional.resize

Cropping
""""""""

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.RandomCrop
    v2.RandomResizedCrop
    v2.RandomIoUCrop
    v2.CenterCrop
    v2.FiveCrop
    v2.TenCrop

Functionals

.. autosummary::
    :toctree: generated/
    :template: function.rst

    v2.functional.crop
    v2.functional.resized_crop
    v2.functional.ten_crop
    v2.functional.center_crop
    v2.functional.five_crop

Others
""""""

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.RandomHorizontalFlip
    v2.RandomVerticalFlip
    v2.Pad
    v2.RandomZoomOut
    v2.RandomRotation
    v2.RandomAffine
    v2.RandomPerspective
    v2.ElasticTransform

Functionals

.. autosummary::
    :toctree: generated/
    :template: function.rst

    v2.functional.horizontal_flip
    v2.functional.vertical_flip
    v2.functional.pad
    v2.functional.rotate
    v2.functional.affine
    v2.functional.perspective
    v2.functional.elastic

Color
^^^^^

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.ColorJitter
    v2.RandomChannelPermutation
    v2.RandomPhotometricDistort
    v2.Grayscale
    v2.RGB
    v2.RandomGrayscale
    v2.GaussianBlur
    v2.GaussianNoise
    v2.RandomInvert
    v2.RandomPosterize
    v2.RandomSolarize
    v2.RandomAdjustSharpness
    v2.RandomAutocontrast
    v2.RandomEqualize

Functionals

.. autosummary::
    :toctree: generated/
    :template: function.rst

    v2.functional.permute_channels
    v2.functional.rgb_to_grayscale
    v2.functional.grayscale_to_rgb
    v2.functional.to_grayscale
    v2.functional.gaussian_blur
    v2.functional.gaussian_noise
    v2.functional.invert
    v2.functional.posterize
    v2.functional.solarize
    v2.functional.adjust_sharpness
    v2.functional.autocontrast
    v2.functional.adjust_contrast
    v2.functional.equalize
    v2.functional.adjust_brightness
    v2.functional.adjust_saturation
    v2.functional.adjust_hue
    v2.functional.adjust_gamma


Composition
^^^^^^^^^^^

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.Compose
    v2.RandomApply
    v2.RandomChoice
    v2.RandomOrder

Miscellaneous
^^^^^^^^^^^^^

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.LinearTransformation
    v2.Normalize
    v2.RandomErasing
    v2.Lambda
    v2.SanitizeBoundingBoxes
    v2.ClampBoundingBoxes
    v2.UniformTemporalSubsample
    v2.JPEG

Functionals

.. autosummary::
    :toctree: generated/
    :template: function.rst

    v2.functional.normalize
    v2.functional.erase
    v2.functional.sanitize_bounding_boxes
    v2.functional.clamp_bounding_boxes
    v2.functional.uniform_temporal_subsample
    v2.functional.jpeg

.. _conversion_transforms:

Conversion
^^^^^^^^^^

.. note::
    Beware, some of these conversion transforms below will scale the values
    while performing the conversion, while some may not do any scaling. By
    scaling, we mean e.g. that a ``uint8`` -> ``float32`` would map the [0,
    255] range into [0, 1] (and vice-versa). See :ref:`range_and_dtype`.

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.ToImage
    v2.ToPureTensor
    v2.PILToTensor
    v2.ToPILImage
    v2.ToDtype
    v2.ConvertBoundingBoxFormat

Functionals

.. autosummary::
    :toctree: generated/
    :template: function.rst

    v2.functional.to_image
    v2.functional.pil_to_tensor
    v2.functional.to_pil_image
    v2.functional.to_dtype
    v2.functional.convert_bounding_box_format


Deprecated

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.ToTensor
    v2.functional.to_tensor
    v2.ConvertImageDtype
    v2.functional.convert_image_dtype

Auto-Augmentation
^^^^^^^^^^^^^^^^^

`AutoAugment <https://arxiv.org/pdf/1805.09501.pdf>`_ is a common data augmentation technique that can improve the accuracy of image classification models.
Though the augmentation policies are directly linked to the dataset they were trained on, empirical studies show that
ImageNet policies provide significant improvements when applied to other datasets.
In TorchVision we implemented 3 policies learned on the following datasets: ImageNet, CIFAR10 and SVHN.
These transforms can be used standalone or mixed-and-matched with existing transforms, as sketched after the list below:

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.AutoAugment
    v2.RandAugment
    v2.TrivialAugmentWide
    v2.AugMix
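
For instance, a minimal sketch mixing
:class:`~torchvision.transforms.v2.AutoAugment` into a pipeline:

.. code:: python

    import torch
    from torchvision.transforms import AutoAugmentPolicy, v2

    augment = v2.Compose([
        v2.AutoAugment(policy=AutoAugmentPolicy.IMAGENET),
        v2.ToDtype(torch.float32, scale=True),
    ])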


CutMix - MixUp
^^^^^^^^^^^^^^

CutMix and MixUp are special transforms that
are meant to be used on batches rather than on individual images, because they
combine pairs of images together. These can be used after the dataloader
(once the samples are batched), or as part of a collation function. See
:ref:`sphx_glr_auto_examples_transforms_plot_cutmix_mixup.py` for detailed usage examples.
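
A minimal sketch, assuming integer class labels and a known number of classes:

.. code:: python

    import torch
    from torchvision.transforms import v2

    NUM_CLASSES = 10  # assumed for this sketch
    cutmix = v2.CutMix(num_classes=NUM_CLASSES)

    imgs = torch.rand(4, 3, 224, 224)             # a batch from the dataloader
    labels = torch.randint(0, NUM_CLASSES, (4,))  # integer labels
    imgs, labels = cutmix(imgs, labels)           # labels become soft targets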

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.CutMix
    v2.MixUp

Developer tools
^^^^^^^^^^^^^^^

.. autosummary::
    :toctree: generated/
    :template: function.rst

    v2.functional.register_kernel


V1 API Reference
----------------

Geometry
^^^^^^^^

.. autosummary::
    :toctree: generated/
    :template: class.rst

    Resize
    RandomCrop
    RandomResizedCrop
    CenterCrop
    FiveCrop
    TenCrop
    Pad
    RandomRotation
    RandomAffine
    RandomPerspective
    ElasticTransform
    RandomHorizontalFlip
    RandomVerticalFlip


Color
^^^^^

.. autosummary::
    :toctree: generated/
    :template: class.rst

    ColorJitter
    Grayscale
    RandomGrayscale
    GaussianBlur
    RandomInvert
    RandomPosterize
    RandomSolarize
    RandomAdjustSharpness
    RandomAutocontrast
    RandomEqualize

Composition
^^^^^^^^^^^

.. autosummary::
    :toctree: generated/
    :template: class.rst

    Compose
    RandomApply
    RandomChoice
    RandomOrder

Miscellaneous
^^^^^^^^^^^^^

.. autosummary::
    :toctree: generated/
    :template: class.rst

    LinearTransformation
    Normalize
    RandomErasing
    Lambda

Conversion
^^^^^^^^^^

.. note::
    Beware, some of these conversion transforms below will scale the values
    while performing the conversion, while some may not do any scaling. By
    scaling, we mean e.g. that a ``uint8`` -> ``float32`` would map the [0,
    255] range into [0, 1] (and vice-versa). See :ref:`range_and_dtype`.

.. autosummary::
    :toctree: generated/
    :template: class.rst

    ToPILImage
    ToTensor
    PILToTensor
    ConvertImageDtype

Auto-Augmentation
^^^^^^^^^^^^^^^^^

`AutoAugment <https://arxiv.org/pdf/1805.09501.pdf>`_ is a common data augmentation technique that can improve the accuracy of image classification models.
Though the augmentation policies are directly linked to the dataset they were trained on, empirical studies show that
ImageNet policies provide significant improvements when applied to other datasets.
In TorchVision we implemented 3 policies learned on the following datasets: ImageNet, CIFAR10 and SVHN.
These transforms can be used standalone or mixed-and-matched with existing transforms:

.. autosummary::
    :toctree: generated/
    :template: class.rst

    AutoAugmentPolicy
    AutoAugment
    RandAugment
    TrivialAugmentWide
    AugMix



Functional Transforms
^^^^^^^^^^^^^^^^^^^^^

.. currentmodule:: torchvision.transforms.functional

.. autosummary::
    :toctree: generated/
    :template: function.rst

    adjust_brightness
    adjust_contrast
    adjust_gamma
    adjust_hue
    adjust_saturation
    adjust_sharpness
    affine
    autocontrast
    center_crop
    convert_image_dtype
    crop
    equalize
    erase
    five_crop
    gaussian_blur
    get_dimensions
    get_image_num_channels
    get_image_size
    hflip
    invert
    normalize
    pad
    perspective
    pil_to_tensor
    posterize
    resize
    resized_crop
    rgb_to_grayscale
    rotate
    solarize
    ten_crop
    to_grayscale
    to_pil_image
    to_tensor
    vflip