.. _transforms:

Transforming and augmenting images
==================================

.. currentmodule:: torchvision.transforms

Torchvision supports common computer vision transformations in the
``torchvision.transforms`` and ``torchvision.transforms.v2`` modules. Transforms
can be used to transform or augment data for training or inference of different
tasks (image classification, detection, segmentation, video classification).

.. code:: python

    # Image Classification
    import torch
    from torchvision.transforms import v2

    H, W = 32, 32
    img = torch.randint(0, 256, size=(3, H, W), dtype=torch.uint8)

    transforms = v2.Compose([
        v2.RandomResizedCrop(size=(224, 224), antialias=True),
        v2.RandomHorizontalFlip(p=0.5),
        v2.ToDtype(torch.float32, scale=True),
        v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    img = transforms(img)

.. code:: python

    # Detection (re-using imports and transforms from above)
    from torchvision import tv_tensors

    img = torch.randint(0, 256, size=(3, H, W), dtype=torch.uint8)
    boxes = torch.randint(0, H // 2, size=(3, 4))
    boxes[:, 2:] += boxes[:, :2]
    boxes = tv_tensors.BoundingBoxes(boxes, format="XYXY", canvas_size=(H, W))

    # The same transforms can be used!
    img, boxes = transforms(img, boxes)
    # And you can pass arbitrary input structures
    output_dict = transforms({"image": img, "boxes": boxes})

Transforms are typically passed as the ``transform`` or ``transforms`` argument
to the :ref:`Datasets <datasets>`.
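
As a minimal sketch of this, the snippet below passes a v2 pipeline as the
``transform`` argument of ``torchvision.datasets.FakeData``. ``FakeData`` is
chosen here only because it generates random PIL images without downloading
anything; the dataset size, image size and pipeline are illustrative
assumptions.

.. code:: python

    import torch
    from torchvision import datasets
    from torchvision.transforms import v2

    # Convert each PIL sample to a float tensor image in [0, 1].
    preprocess = v2.Compose([
        v2.ToImage(),
        v2.ToDtype(torch.float32, scale=True),
    ])

    # FakeData yields (PIL image, label) pairs; the dataset applies
    # ``transform`` to the image whenever a sample is fetched.
    dataset = datasets.FakeData(
        size=4, image_size=(3, 32, 32), num_classes=10, transform=preprocess
    )
    img, label = dataset[0]
    print(img.shape, img.dtype)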

Start here
----------

Whether you're new to Torchvision transforms, or you're already experienced with
them, we encourage you to start with
:ref:`sphx_glr_auto_examples_transforms_plot_transforms_getting_started.py` in
order to learn more about what can be done with the new v2 transforms.

Then, browse the sections below on this page for general information and
performance tips. The available transforms and functionals are listed in the
:ref:`API reference <v2_api_ref>`.

More information and tutorials can also be found in our :ref:`example gallery
<gallery>`, e.g. :ref:`sphx_glr_auto_examples_transforms_plot_transforms_e2e.py`
or :ref:`sphx_glr_auto_examples_transforms_plot_custom_transforms.py`.

.. _conventions:

Supported input types and conventions
-------------------------------------

Most transformations accept both `PIL <https://pillow.readthedocs.io>`_ images
and tensor inputs. Both CPU and CUDA tensors are supported.
The result of both backends (PIL or Tensors) should be very
close. In general, we recommend relying on the tensor backend :ref:`for
performance <transforms_perf>`.  The :ref:`conversion transforms
<conversion_transforms>` may be used to convert to and from PIL images, or for
converting dtypes and ranges.

Tensor images are expected to be of shape ``(C, H, W)``, where ``C`` is the
number of channels, and ``H`` and ``W`` refer to height and width. Most
transforms support batched tensor input. A batch of tensor images is a tensor of
shape ``(N, C, H, W)``, where ``N`` is the number of images in the batch. The
:ref:`v2 <v1_or_v2>` transforms generally accept an arbitrary number of leading
dimensions ``(..., C, H, W)`` and can handle batched images or batched videos.
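
The shape conventions above can be illustrated with a short sketch. The sizes
are arbitrary assumptions; the point is only that the same v2 transform accepts
a single image, a batch of images, and a batch of videos, and that only the
trailing ``(H, W)`` dimensions change.

.. code:: python

    import torch
    from torchvision.transforms import v2

    resize = v2.Resize(size=(16, 16), antialias=True)

    inputs = {
        "image": torch.randint(0, 256, size=(3, 32, 32), dtype=torch.uint8),         # (C, H, W)
        "batch": torch.randint(0, 256, size=(8, 3, 32, 32), dtype=torch.uint8),      # (N, C, H, W)
        "videos": torch.randint(0, 256, size=(2, 5, 3, 32, 32), dtype=torch.uint8),  # (N, T, C, H, W)
    }

    # Leading dimensions are kept as-is; only height and width are resized.
    output_shapes = {name: tuple(resize(t).shape) for name, t in inputs.items()}
    for name, shape in output_shapes.items():
        print(name, shape)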

.. _range_and_dtype:

Dtype and expected value range
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The expected range of the values of a tensor image is implicitly defined by
the tensor dtype. Tensor images with a float dtype are expected to have
values in ``[0, 1]``. Tensor images with an integer dtype are expected to
have values in ``[0, MAX_DTYPE]`` where ``MAX_DTYPE`` is the largest value
that can be represented in that dtype. Typically, images of dtype
``torch.uint8`` are expected to have values in ``[0, 255]``.

Use :class:`~torchvision.transforms.v2.ToDtype` to convert both the dtype and
range of the inputs.
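
A small sketch of how the ``scale`` flag relates dtype and value range. The
three-pixel image is an arbitrary assumption, chosen so the values are easy to
follow.

.. code:: python

    import torch
    from torchvision.transforms import v2

    # A tiny uint8 "image" of shape (C=1, H=1, W=3) with values in [0, 255].
    img_uint8 = torch.tensor([[[0, 128, 255]]], dtype=torch.uint8)

    to_float_scaled = v2.ToDtype(torch.float32, scale=True)
    to_float_unscaled = v2.ToDtype(torch.float32, scale=False)
    to_uint8_scaled = v2.ToDtype(torch.uint8, scale=True)

    img_float = to_float_scaled(img_uint8)   # dtype and range change: [0, 1]
    img_raw = to_float_unscaled(img_uint8)   # only the dtype changes
    img_back = to_uint8_scaled(img_float)    # rescaled back into [0, 255]

    print(img_float.min().item(), img_float.max().item())
    print(img_raw.max().item())
    print(img_back)

Note that ``img_raw`` is a float image whose values fall outside ``[0, 1]``, so
it no longer follows the convention described above.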

.. _v1_or_v2:

V1 or V2? Which one should I use?
---------------------------------

**TL;DR** We recommend using the ``torchvision.transforms.v2`` transforms
instead of those in ``torchvision.transforms``. They're faster and they can do
more things. Just change the import and you should be good to go.

In Torchvision 0.15 (March 2023), we released a new set of transforms available
in the ``torchvision.transforms.v2`` namespace. These transforms have a lot of
advantages compared to the v1 ones (in ``torchvision.transforms``):

- They can transform images **but also** bounding boxes, masks, or videos. This
  provides support for tasks beyond image classification: detection, segmentation,
  video classification, etc. See
  :ref:`sphx_glr_auto_examples_transforms_plot_transforms_getting_started.py`
  and :ref:`sphx_glr_auto_examples_transforms_plot_transforms_e2e.py`.
- They support more transforms like :class:`~torchvision.transforms.v2.CutMix`
  and :class:`~torchvision.transforms.v2.MixUp`. See
  :ref:`sphx_glr_auto_examples_transforms_plot_cutmix_mixup.py`.
- They're :ref:`faster <transforms_perf>`.
- They support arbitrary input structures (dicts, lists, tuples, etc.).
- Future improvements and features will be added to the v2 transforms only.

These transforms are **fully backward compatible** with the v1 ones, so if
you're already using transforms from ``torchvision.transforms``, all you need to
do is update the import to ``torchvision.transforms.v2``. In terms of
output, there might be negligible differences due to implementation differences.
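
To illustrate the import swap, here is a hedged sketch that builds the same
pipeline from either namespace. The helper ``build_pipeline`` and the chosen
transforms are assumptions made for the example; deterministic geometric
transforms are used so that the two outputs can be compared directly.

.. code:: python

    import torch
    from torchvision import transforms as v1
    from torchvision.transforms import v2

    img = torch.randint(0, 256, size=(3, 32, 32), dtype=torch.uint8)

    def build_pipeline(T):
        # ``T`` is either the v1 or the v2 module; the calls are identical.
        return T.Compose([
            T.CenterCrop(16),
            T.RandomHorizontalFlip(p=1.0),  # p=1.0 makes the flip deterministic
        ])

    out_v1 = build_pipeline(v1)(img)
    out_v2 = build_pipeline(v2)(img)
    print(out_v1.shape, out_v2.shape, torch.equal(out_v1, out_v2))

For transforms that involve interpolation (e.g. resizing), small numerical
differences between the two versions are possible, as noted above.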

.. note::

    The v2 transforms are still BETA, but at this point we do not expect
    disruptive changes to be made to their public APIs. We're planning to make
    them fully stable in version 0.17. Please submit any feedback you may have
    `here <https://github.com/pytorch/vision/issues/6753>`_.

.. _transforms_perf:

Performance considerations
--------------------------

We recommend the following guidelines to get the best performance out of the
transforms:

- Rely on the v2 transforms from ``torchvision.transforms.v2``
- Use tensors instead of PIL images
- Use ``torch.uint8`` dtype, especially for resizing
- Resize with bilinear or bicubic mode

This is what a typical transform pipeline could look like:

.. code:: python

    import torch
    from torchvision.transforms import v2

    transforms = v2.Compose([
        v2.ToImage(),  # Convert to tensor, only needed if you had a PIL image
        v2.ToDtype(torch.uint8, scale=True),  # optional, most inputs are already uint8 at this point
        # ...
        v2.RandomResizedCrop(size=(224, 224), antialias=True),  # Or Resize(antialias=True)
        # ...
        v2.ToDtype(torch.float32, scale=True),  # Normalize expects float input
        v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

The above should give you the best performance in a typical training environment
that relies on the :class:`torch.utils.data.DataLoader` with ``num_workers >
0``.

Transforms tend to be sensitive to the input strides / memory format. Some
transforms will be faster with channels-first images while others prefer
channels-last. Like ``torch`` operators, most transforms will preserve the
memory format of the input, but this may not always be respected due to
implementation details. You may want to experiment a bit if you're chasing the
very best performance.  Using :func:`torch.compile` on individual transforms may
also help factor out the memory format variable (e.g. on
:class:`~torchvision.transforms.v2.Normalize`). Note that we're talking about
**memory format**, not :ref:`tensor shape <conventions>`.
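
The distinction between memory format and shape can be seen in a short
sketch. The batch size and image size are arbitrary assumptions; whether a
given transform preserves the input memory format is an implementation detail,
so that result is only printed here, not relied upon.

.. code:: python

    import torch
    from torchvision.transforms import v2

    batch = torch.rand(8, 3, 32, 32)  # default contiguous layout
    batch_cl = batch.contiguous(memory_format=torch.channels_last)

    # Same shape and values, but the data is laid out differently in memory.
    print(batch.shape == batch_cl.shape)
    print(batch.stride(), batch_cl.stride())

    normalize = v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    out = normalize(batch)
    out_cl = normalize(batch_cl)

    # The results agree numerically regardless of the memory format.
    print(torch.allclose(out, out_cl))
    print(out_cl.is_contiguous(memory_format=torch.channels_last))

Timing both variants on your own hardware is the most reliable way to find out
which layout a given transform prefers.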

Note that resize transforms like :class:`~torchvision.transforms.v2.Resize`
and :class:`~torchvision.transforms.v2.RandomResizedCrop` typically prefer
channels-last input and tend **not** to benefit from :func:`torch.compile` at
this time.

.. _functional_transforms:

Transform classes, functionals, and kernels
-------------------------------------------

Transforms are available as classes like
:class:`~torchvision.transforms.v2.Resize`, but also as functionals like
:func:`~torchvision.transforms.v2.functional.resize` in the
``torchvision.transforms.v2.functional`` namespace.
This is very much like the :mod:`torch.nn` package which defines both classes
and functional equivalents in :mod:`torch.nn.functional`.

The functionals support PIL images, pure tensors, or :ref:`TVTensors
<tv_tensors>`, e.g. both ``resize(image_tensor)`` and ``resize(boxes)`` are
valid.
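
As a brief sketch of this dispatch, the same ``resize`` functional is applied
below to a pure tensor image and to a ``BoundingBoxes`` TVTensor. The box
coordinates and sizes are made-up values for illustration.

.. code:: python

    import torch
    from torchvision import tv_tensors
    from torchvision.transforms.v2 import functional as F

    img = torch.randint(0, 256, size=(3, 32, 32), dtype=torch.uint8)
    boxes = tv_tensors.BoundingBoxes(
        [[4, 4, 12, 20]], format="XYXY", canvas_size=(32, 32)
    )

    # The functional picks the implementation that matches the input type.
    out_img = F.resize(img, size=[64, 64], antialias=True)
    out_boxes = F.resize(boxes, size=[64, 64])

    print(out_img.shape)
    print(out_boxes, out_boxes.canvas_size)

The image is resampled, while the box coordinates are rescaled and the
``canvas_size`` metadata is updated to the new image size.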

.. note::

    Random transforms like :class:`~torchvision.transforms.v2.RandomCrop` will
    randomly sample some parameter each time they're called. Their functional
    counterpart (:func:`~torchvision.transforms.v2.functional.crop`) does not do
    any kind of random sampling and thus has a slightly different
    parametrization. The ``get_params()`` class method of the transforms class
    can be used to perform parameter sampling when using the functional APIs.
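
A minimal sketch of the pattern described in the note: the crop parameters are
sampled once with ``get_params()`` and then reused with the functional API, so
that an image and its mask receive the same crop. The tensors and sizes are
illustrative assumptions.

.. code:: python

    import torch
    from torchvision.transforms import v2
    from torchvision.transforms.v2 import functional as F

    img = torch.randint(0, 256, size=(3, 32, 32), dtype=torch.uint8)
    mask = torch.randint(0, 2, size=(1, 32, 32), dtype=torch.uint8)

    # Sample the random crop location once...
    top, left, height, width = v2.RandomCrop.get_params(img, output_size=(16, 16))

    # ...and apply the exact same crop to both inputs.
    img_crop = F.crop(img, top, left, height, width)
    mask_crop = F.crop(mask, top, left, height, width)

    print(img_crop.shape, mask_crop.shape)

When using the transform classes, passing both inputs in a single call achieves
the same thing without manual parameter handling.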


The ``torchvision.transforms.v2.functional`` namespace also contains what we
call the "kernels". These are the low-level functions that implement the
core functionalities for specific types, e.g. ``resize_bounding_boxes`` or
``resized_crop_mask``. They are public, although not documented. Check the
`code
<https://github.com/pytorch/vision/blob/main/torchvision/transforms/v2/functional/__init__.py>`_
to see which ones are available (note that those starting with a leading
underscore are **not** public!). Kernels are only really useful if you want
:ref:`torchscript support <transforms_torchscript>` for types like bounding
boxes or masks.
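
For illustration, a kernel can be called directly on a plain tensor. Since the
kernels are not documented, the keyword arguments and the return value shown
below are assumptions based on the signature at the time of writing; check the
linked code before relying on them.

.. code:: python

    import torch
    from torchvision.transforms.v2 import functional as F

    # A plain tensor holding one box, assumed to be in XYXY format.
    boxes = torch.tensor([[4.0, 4.0, 12.0, 20.0]])

    # Unlike the ``resize`` functional, the kernel needs the canvas size
    # explicitly, because a plain tensor carries no metadata.
    new_boxes, new_canvas_size = F.resize_bounding_boxes(
        boxes, canvas_size=(32, 32), size=[64, 64]
    )
    print(new_boxes, new_canvas_size)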

.. _transforms_torchscript:

Torchscript support
-------------------

Most transform classes and functionals support torchscript. For composing
transforms, use :class:`torch.nn.Sequential` instead of
:class:`~torchvision.transforms.v2.Compose`:

.. code:: python

    import torch
    from torchvision.transforms.v2 import CenterCrop, Normalize

    transforms = torch.nn.Sequential(
        CenterCrop(10),
        Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
    )
    scripted_transforms = torch.jit.script(transforms)

.. warning::

    v2 transforms support torchscript, but if you call ``torch.jit.script()`` on
    a v2 **class** transform, you'll actually end up with its (scripted) v1
    equivalent.  This may lead to slightly different results between the
    scripted and eager executions due to implementation differences between v1
    and v2.

    If you really need torchscript support for the v2 transforms, we recommend
    scripting the **functionals** from the
    ``torchvision.transforms.v2.functional`` namespace to avoid surprises.


Also note that the functionals only support torchscript for pure tensors, which
are always treated as images. If you need torchscript support for other types
like bounding boxes or masks, you can rely on the :ref:`low-level kernels
<functional_transforms>`.
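
A hedged sketch of scripting the functionals, as recommended in the warning
above. The input is a pure tensor, which the scripted functionals treat as an
image; the crop size is an arbitrary choice.

.. code:: python

    import torch
    from torchvision.transforms.v2 import functional as F

    # Script the functionals themselves rather than the transform classes.
    scripted_flip = torch.jit.script(F.horizontal_flip)
    scripted_center_crop = torch.jit.script(F.center_crop)

    img = torch.randint(0, 256, size=(3, 32, 32), dtype=torch.uint8)

    scripted_out = scripted_center_crop(scripted_flip(img), [16, 16])
    eager_out = F.center_crop(F.horizontal_flip(img), [16, 16])

    print(scripted_out.shape, torch.equal(scripted_out, eager_out))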

For any custom transformations to be used with ``torch.jit.script``, they should
be derived from ``torch.nn.Module``.
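
As a sketch of what such a custom transform could look like, the hypothetical
``AddGaussianNoise`` module below (not part of Torchvision) is combined with a
built-in transform and scripted. It assumes float tensor images in ``[0, 1]``.

.. code:: python

    import torch
    from torch import nn
    from torchvision.transforms import v2


    class AddGaussianNoise(nn.Module):
        """Hypothetical custom transform: adds noise to a float tensor image."""

        def __init__(self, std: float = 0.05):
            super().__init__()
            self.std = std

        def forward(self, img: torch.Tensor) -> torch.Tensor:
            noisy = img + self.std * torch.randn_like(img)
            # Keep the result in the expected [0, 1] range for float images.
            return noisy.clamp(0.0, 1.0)


    pipeline = nn.Sequential(
        v2.CenterCrop(16),
        AddGaussianNoise(std=0.05),
    )
    scripted_pipeline = torch.jit.script(pipeline)

    out = scripted_pipeline(torch.rand(3, 32, 32))
    print(out.shape)

Type annotations on ``forward`` help TorchScript compile the module.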

See also: :ref:`sphx_glr_auto_examples_others_plot_scripted_tensor_transforms.py`.

.. _v2_api_ref:

V2 API reference - Recommended
------------------------------

Geometry
^^^^^^^^

Resizing
""""""""

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.Resize
    v2.ScaleJitter
    v2.RandomShortestSize
    v2.RandomResize

Functionals

.. autosummary::
    :toctree: generated/
    :template: function.rst

    v2.functional.resize

Cropping
""""""""

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.RandomCrop
    v2.RandomResizedCrop
    v2.RandomIoUCrop
    v2.CenterCrop
    v2.FiveCrop
    v2.TenCrop

Functionals

.. autosummary::
    :toctree: generated/
    :template: function.rst

    v2.functional.crop
    v2.functional.resized_crop
    v2.functional.ten_crop
    v2.functional.center_crop
    v2.functional.five_crop

Others
""""""

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.RandomHorizontalFlip
    v2.RandomVerticalFlip
    v2.Pad
    v2.RandomZoomOut
    v2.RandomRotation
    v2.RandomAffine
    v2.RandomPerspective
    v2.ElasticTransform

Functionals

.. autosummary::
    :toctree: generated/
    :template: function.rst

    v2.functional.horizontal_flip
    v2.functional.vertical_flip
    v2.functional.pad
    v2.functional.rotate
    v2.functional.affine
    v2.functional.perspective
    v2.functional.elastic

Color
^^^^^

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.ColorJitter
    v2.RandomChannelPermutation
    v2.RandomPhotometricDistort
    v2.Grayscale
    v2.RandomGrayscale
    v2.GaussianBlur
    v2.RandomInvert
    v2.RandomPosterize
    v2.RandomSolarize
    v2.RandomAdjustSharpness
    v2.RandomAutocontrast
    v2.RandomEqualize

Functionals

.. autosummary::
    :toctree: generated/
    :template: function.rst

    v2.functional.permute_channels
    v2.functional.rgb_to_grayscale
    v2.functional.to_grayscale
    v2.functional.gaussian_blur
    v2.functional.invert
    v2.functional.posterize
    v2.functional.solarize
    v2.functional.adjust_sharpness
    v2.functional.autocontrast
    v2.functional.adjust_contrast
    v2.functional.equalize
    v2.functional.adjust_brightness
    v2.functional.adjust_saturation
    v2.functional.adjust_hue
    v2.functional.adjust_gamma


Composition
^^^^^^^^^^^

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.Compose
    v2.RandomApply
    v2.RandomChoice
    v2.RandomOrder

Miscellaneous
^^^^^^^^^^^^^

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.LinearTransformation
    v2.Normalize
    v2.RandomErasing
    v2.Lambda
    v2.SanitizeBoundingBoxes
    v2.ClampBoundingBoxes
    v2.UniformTemporalSubsample

Functionals

.. autosummary::
    :toctree: generated/
    :template: function.rst

    v2.functional.normalize
    v2.functional.erase
    v2.functional.clamp_bounding_boxes
    v2.functional.uniform_temporal_subsample

.. _conversion_transforms:

Conversion
^^^^^^^^^^

.. note::
    Beware, some of these conversion transforms below will scale the values
    while performing the conversion, while some may not do any scaling. By
    scaling, we mean e.g. that a ``uint8`` -> ``float32`` would map the [0,
    255] range into [0, 1] (and vice-versa). See :ref:`range_and_dtype`.

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.ToImage
    v2.ToPureTensor
    v2.PILToTensor
    v2.ToPILImage
    v2.ToDtype
    v2.ConvertBoundingBoxFormat

Functionals

.. autosummary::
    :toctree: generated/
    :template: function.rst

    v2.functional.to_image
    v2.functional.pil_to_tensor
    v2.functional.to_pil_image
    v2.functional.to_dtype
    v2.functional.convert_bounding_box_format


Deprecated

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.ToTensor
    v2.functional.to_tensor
    v2.ConvertImageDtype
    v2.functional.convert_image_dtype

Auto-Augmentation
^^^^^^^^^^^^^^^^^

`AutoAugment <https://arxiv.org/pdf/1805.09501.pdf>`_ is a common Data Augmentation technique that can improve the accuracy of Image Classification models.
Though the data augmentation policies are directly linked to their trained dataset, empirical studies show that
ImageNet policies provide significant improvements when applied to other datasets.
In TorchVision we implemented 3 policies learned on the following datasets: ImageNet, CIFAR10 and SVHN.
The new transform can be used standalone or mixed-and-matched with existing transforms:

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.AutoAugment
    v2.RandAugment
    v2.TrivialAugmentWide
    v2.AugMix
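
As a small sketch of mixing an auto-augmentation transform with existing ones,
the pipeline below applies the CIFAR10 policy between a flip and a dtype
conversion. The input tensor and the surrounding transforms are illustrative
assumptions.

.. code:: python

    import torch
    from torchvision.transforms import AutoAugmentPolicy, v2

    img = torch.randint(0, 256, size=(3, 32, 32), dtype=torch.uint8)

    augment = v2.Compose([
        v2.RandomHorizontalFlip(p=0.5),
        # Apply the policy on the uint8 image, before converting to float.
        v2.AutoAugment(policy=AutoAugmentPolicy.CIFAR10),
        v2.ToDtype(torch.float32, scale=True),
    ])

    out = augment(img)
    print(out.shape, out.dtype)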


CutMix - MixUp
^^^^^^^^^^^^^^

CutMix and MixUp are special transforms that
are meant to be used on batches rather than on individual images, because they
combine pairs of images together. These can be used after the dataloader
(once the samples are batched), or as part of a collation function. See
:ref:`sphx_glr_auto_examples_transforms_plot_cutmix_mixup.py` for detailed usage examples.
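
A minimal sketch of the batch-level usage, applied to an already collated
batch. ``NUM_CLASSES``, the batch size and the random data are assumptions
made for the example.

.. code:: python

    import torch
    from torchvision.transforms import v2

    NUM_CLASSES = 10

    # Randomly pick CutMix or MixUp for each batch.
    cutmix_or_mixup = v2.RandomChoice([
        v2.CutMix(num_classes=NUM_CLASSES),
        v2.MixUp(num_classes=NUM_CLASSES),
    ])

    # Stand-ins for a batch coming out of a dataloader.
    images = torch.rand(4, 3, 32, 32)
    labels = torch.randint(0, NUM_CLASSES, size=(4,))

    images, labels = cutmix_or_mixup(images, labels)

    # Integer labels become per-class weights of shape (batch, NUM_CLASSES).
    print(images.shape, labels.shape, labels.dtype)

Because the labels are no longer integer class indices after this step, the
loss function has to accept class probabilities as targets.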

.. autosummary::
    :toctree: generated/
    :template: class.rst

    v2.CutMix
    v2.MixUp

Developer tools
^^^^^^^^^^^^^^^

.. autosummary::
    :toctree: generated/
    :template: function.rst

    v2.functional.register_kernel


V1 API Reference
----------------

Geometry
^^^^^^^^

.. autosummary::
    :toctree: generated/
    :template: class.rst

    Resize
    RandomCrop
    RandomResizedCrop
    CenterCrop
    FiveCrop
    TenCrop
    Pad
    RandomRotation
    RandomAffine
    RandomPerspective
    ElasticTransform
    RandomHorizontalFlip
    RandomVerticalFlip


Color
^^^^^

.. autosummary::
    :toctree: generated/
    :template: class.rst

    ColorJitter
    Grayscale
    RandomGrayscale
    GaussianBlur
    RandomInvert
    RandomPosterize
    RandomSolarize
    RandomAdjustSharpness
    RandomAutocontrast
    RandomEqualize

Composition
^^^^^^^^^^^

.. autosummary::
    :toctree: generated/
    :template: class.rst

    Compose
    RandomApply
    RandomChoice
    RandomOrder

Miscellaneous
^^^^^^^^^^^^^

.. autosummary::
    :toctree: generated/
    :template: class.rst

    LinearTransformation
    Normalize
    RandomErasing
    Lambda

Conversion
^^^^^^^^^^

.. note::
    Beware, some of these conversion transforms below will scale the values
    while performing the conversion, while some may not do any scaling. By
    scaling, we mean e.g. that a ``uint8`` -> ``float32`` would map the [0,
    255] range into [0, 1] (and vice-versa). See :ref:`range_and_dtype`.
    
.. autosummary::
    :toctree: generated/
    :template: class.rst

    ToPILImage
    ToTensor
    PILToTensor
    ConvertImageDtype

Auto-Augmentation
^^^^^^^^^^^^^^^^^

`AutoAugment <https://arxiv.org/pdf/1805.09501.pdf>`_ is a common Data Augmentation technique that can improve the accuracy of Image Classification models.
Though the data augmentation policies are directly linked to their trained dataset, empirical studies show that
ImageNet policies provide significant improvements when applied to other datasets.
In TorchVision we implemented 3 policies learned on the following datasets: ImageNet, CIFAR10 and SVHN.
The new transform can be used standalone or mixed-and-matched with existing transforms:

.. autosummary::
    :toctree: generated/
    :template: class.rst

    AutoAugmentPolicy
    AutoAugment
    RandAugment
    TrivialAugmentWide
    AugMix



Functional Transforms
^^^^^^^^^^^^^^^^^^^^^

.. currentmodule:: torchvision.transforms.functional

.. autosummary::
    :toctree: generated/
    :template: function.rst

    adjust_brightness
    adjust_contrast
    adjust_gamma
    adjust_hue
    adjust_saturation
    adjust_sharpness
    affine
    autocontrast
    center_crop
    convert_image_dtype
    crop
    equalize
    erase
    five_crop
    gaussian_blur
    get_dimensions
    get_image_num_channels
    get_image_size
    hflip
    invert
    normalize
    pad
    perspective
    pil_to_tensor
    posterize
    resize
    resized_crop
    rgb_to_grayscale
    rotate
    solarize
    ten_crop
    to_grayscale
    to_pil_image
    to_tensor
    vflip