Unverified commit 3a278d70, authored by Nicolas Hug and committed by GitHub

Update to transforms docs (#3646)



* Fixed return docstrings

* Added some refs and corrected some parts

* more refs, and a note about dtypes
Co-authored-by: Francisco Massa <fvsmassa@gmail.com>
parent 7f4ae8c6
@@ -4,15 +4,34 @@ torchvision.transforms

.. currentmodule:: torchvision.transforms
Transforms are common image transformations. They can be chained together using :class:`Compose`.
Most transform classes have a function equivalent: :ref:`functional transforms <functional_transforms>`
give fine-grained control over the transformations.
This is useful if you have to build a more complex transformation pipeline
(e.g. in the case of segmentation tasks).
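
For instance, a typical pipeline chains a resize, a crop and a tensor conversion with :class:`Compose`. A minimal sketch; the transform choices and sizes below are illustrative, not prescribed::

    from PIL import Image
    from torchvision import transforms

    pipeline = transforms.Compose([
        transforms.Resize(256),      # resize the shorter side to 256 pixels
        transforms.CenterCrop(224),  # crop the central 224x224 region
        transforms.ToTensor(),       # convert to a float tensor with values in [0, 1]
    ])

    img = Image.new("RGB", (500, 400))  # placeholder image, for illustration only
    out = pipeline(img)                 # tensor of shape (3, 224, 224)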
Most transformations accept both `PIL <https://pillow.readthedocs.io>`_ images
and tensor images, although some transformations are :ref:`PIL-only
<transforms_pil_only>` and some are :ref:`tensor-only
<transforms_tensor_only>`. The :ref:`conversion_transforms` may be used to
convert to and from PIL images.

The transformations that accept tensor images also accept batches of tensor
images. A Tensor Image is a tensor with ``(C, H, W)`` shape, where ``C`` is
the number of channels, and ``H`` and ``W`` are the image height and width. A
batch of Tensor Images is a tensor of ``(B, C, H, W)`` shape, where ``B`` is
the number of images in the batch.
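
For example, the same transform instance can be applied to a single tensor image or to a batch of tensor images; the shapes below are arbitrary example values::

    import torch
    from torchvision import transforms

    blur = transforms.GaussianBlur(kernel_size=5)

    single = torch.rand(3, 64, 64)     # one image, (C, H, W)
    batch = torch.rand(8, 3, 64, 64)   # eight images, (B, C, H, W)

    out_single = blur(single)  # shape (3, 64, 64)
    out_batch = blur(batch)    # shape (8, 3, 64, 64)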
The expected range of the values of a tensor image is implicitly defined by
the tensor dtype. Tensor images with a float dtype are expected to have
values in ``[0, 1)``. Tensor images with an integer dtype are expected to
have values in ``[0, MAX_DTYPE]`` where ``MAX_DTYPE`` is the largest value
that can be represented in that dtype.
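
A sketch of what this means in practice, using :class:`ConvertImageDtype` to move between the two conventions; the tensor sizes are arbitrary::

    import torch
    from torchvision import transforms

    # uint8 tensor images use the [0, 255] range ...
    img_uint8 = torch.randint(0, 256, (3, 32, 32), dtype=torch.uint8)

    # ... while float tensor images use the [0, 1) range.
    to_float = transforms.ConvertImageDtype(torch.float32)
    img_float = to_float(img_uint8)  # values rescaled into [0, 1]

    # Converting back rescales to the full integer range.
    to_uint8 = transforms.ConvertImageDtype(torch.uint8)
    img_back = to_uint8(img_float)   # values in [0, 255]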
Randomized transformations will apply the same transformation to all the
images of a given batch, but they will produce different transformations
across calls. For reproducible transformations across calls, you may use
:ref:`functional transforms <functional_transforms>`.
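
For instance, a randomized transform draws its parameters once per call and applies them to the whole batch; the transform and values below are illustrative::

    import torch
    from torchvision import transforms

    rotate = transforms.RandomRotation(degrees=45)

    batch = torch.rand(4, 3, 32, 32)

    out1 = rotate(batch)  # one random angle is drawn and applied to all 4 images
    out2 = rotate(batch)  # a new angle is drawn, so out2 generally differs from out1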
.. warning::
@@ -117,6 +136,8 @@ Transforms on PIL Image and torch.\*Tensor

.. autoclass:: GaussianBlur
    :members:
.. _transforms_pil_only:
Transforms on PIL Image only
----------------------------
@@ -124,6 +145,7 @@ Transforms on PIL Image only

.. autoclass:: RandomOrder
.. _transforms_tensor_only:
Transforms on torch.\*Tensor only
---------------------------------

@@ -139,6 +161,7 @@ Transforms on torch.\*Tensor only

.. autoclass:: ConvertImageDtype
.. _conversion_transforms:
Conversion Transforms
---------------------

@@ -173,13 +196,16 @@ The new transform can be used standalone or mixed-and-matched with existing tran

    :members:
.. _functional_transforms:
Functional Transforms
---------------------

Functional transforms give you fine-grained control of the transformation pipeline.
As opposed to the transformations above, functional transforms don't contain a random number
generator for their parameters.
That means you have to specify/generate all parameters, but the functional transform will give you
reproducible results across calls.

Example:
you can apply a functional transform with the same parameters to multiple images like this:
...
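
A minimal sketch of that pattern, assuming a random crop whose parameters are drawn once and then reused for a paired image and mask; the sizes are arbitrary example values::

    import torch
    import torchvision.transforms as T
    import torchvision.transforms.functional as F

    image = torch.rand(3, 64, 64)
    mask = torch.randint(0, 2, (1, 64, 64), dtype=torch.uint8)

    # Draw the random crop parameters once ...
    i, j, h, w = T.RandomCrop.get_params(image, output_size=(32, 32))

    # ... then apply exactly the same crop to both tensors.
    image_crop = F.crop(image, i, j, h, w)
    mask_crop = F.crop(mask, i, j, h, w)

Because the parameters are explicit, they can be reused for any number of paired inputs, which is what makes the functional API reproducible across calls.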
@@ -1103,9 +1103,9 @@ def to_grayscale(img, num_output_channels=1):

    Returns:
        PIL Image: Grayscale version of the image.

        - if num_output_channels = 1 : returned image is single channel
        - if num_output_channels = 3 : returned image is 3 channel with r = g = b
    """
    if isinstance(img, Image.Image):
        return F_pil.to_grayscale(img, num_output_channels)
@@ -1128,9 +1128,9 @@ def rgb_to_grayscale(img: Tensor, num_output_channels: int = 1) -> Tensor:

    Returns:
        PIL Image or Tensor: Grayscale version of the image.

        - if num_output_channels = 1 : returned image is single channel
        - if num_output_channels = 3 : returned image is 3 channel with r = g = b
    """
    if not isinstance(img, torch.Tensor):
        return F_pil.to_grayscale(img, num_output_channels)
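
A small usage sketch for the tensor code path; the input values are arbitrary::

    import torch
    import torchvision.transforms.functional as F

    img = torch.rand(3, 32, 32)

    gray_1 = F.rgb_to_grayscale(img)                         # shape (1, 32, 32)
    gray_3 = F.rgb_to_grayscale(img, num_output_channels=3)  # shape (3, 32, 32), r == g == b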
@@ -1330,6 +1330,7 @@ def equalize(img: Tensor) -> Tensor:

        img (PIL Image or Tensor): Image on which equalize is applied.
            If img is torch Tensor, it is expected to be in [..., 1 or 3, H, W] format,
            where ... means it can have an arbitrary number of leading dimensions.
            The tensor dtype must be ``torch.uint8`` and values are expected to be in ``[0, 255]``.
            If img is PIL Image, it is expected to be in mode "P", "L" or "RGB".

    Returns:
...
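
An illustrative call on a tensor image, reflecting the ``torch.uint8`` requirement noted above; the input is arbitrary::

    import torch
    import torchvision.transforms.functional as F

    img = torch.randint(0, 256, (3, 32, 32), dtype=torch.uint8)
    out = F.equalize(img)  # still a uint8 tensor of the same shape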
@@ -1464,6 +1464,7 @@ class Grayscale(torch.nn.Module):

    Returns:
        PIL Image: Grayscale version of the input.

        - If ``num_output_channels == 1`` : returned image is single channel
        - If ``num_output_channels == 3`` : returned image is 3 channel with r == g == b
...
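
A brief usage sketch for the class-based transform; the input image is a placeholder::

    from PIL import Image
    from torchvision import transforms

    to_gray = transforms.Grayscale(num_output_channels=3)

    img = Image.new("RGB", (64, 64))  # placeholder image, for illustration only
    gray = to_gray(img)               # still a 3-channel image, with r == g == b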