Minor updates in autoaugment, augment docstring v2 (#7317)

Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>

Commit 74e33657, authored Feb 24, 2023 by vfdev, committed by GitHub Feb 24, 2023

Showing with 32 additions and 24 deletions

torchvision/transforms/v2/_augment.py +6 -6

torchvision/transforms/v2/_auto_augment.py +26 -18

--- a/torchvision/transforms/v2/_augment.py
+++ b/torchvision/transforms/v2/_augment.py
@@ -13,7 +13,7 @@ from .utils import is_simple_tensor, query_chw


 class RandomErasing(_RandomApplyTransform):
-    """[BETA] Randomly selects a rectangle region in the input image or video and erases its pixels.
+    """[BETA] Randomly select a rectangle region in the input image or video and erase its pixels.

    .. betastatus:: RandomErasing transform

@@ -21,14 +21,14 @@ class RandomErasing(_RandomApplyTransform):
    'Random Erasing Data Augmentation' by Zhong et al. See https://arxiv.org/abs/1708.04896

    Args:
-         p: probability that the random erasing operation will be performed.
-         scale: range of proportion of erased area against input image.
-         ratio: range of aspect ratio of erased area.
-         value: erasing value. Default is 0. If a single int, it is used to
+        p (float, optional): probability that the random erasing operation will be performed.
+        scale (tuple of float, optional): range of proportion of erased area against input image.
+        ratio (tuple of float, optional): range of aspect ratio of erased area.
+        value (number or tuple of numbers): erasing value. Default is 0. If a single int, it is used to
            erase all pixels. If a tuple of length 3, it is used to erase
            R, G, B channels respectively.
            If a str of 'random', erasing each pixel with random values.
-         inplace: boolean to make this transform inplace. Default set to False.
+        inplace (bool, optional): boolean to make this transform inplace. Default set to False.

    Returns:
        Erased input.
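
Note on the `scale` and `ratio` arguments documented above: the sketch below shows one way a rectangle could be drawn from those two ranges. It is a standard-library illustration only, not the code in `_augment.py`. The helper name `sample_erase_region`, the `max_tries` and `rng` parameters, the default ranges, and the reading of aspect ratio as height / width are all assumptions made for this example.

```python
import math
import random


def sample_erase_region(img_h, img_w, scale=(0.02, 0.33), ratio=(0.3, 3.3),
                        max_tries=10, rng=random):
    """Return (top, left, height, width) of a rectangle to erase, or None.

    Hypothetical helper: it illustrates what `scale` and `ratio` control,
    it is not torchvision's implementation.
    """
    area = img_h * img_w
    log_lo, log_hi = math.log(ratio[0]), math.log(ratio[1])
    for _ in range(max_tries):
        # `scale` bounds the erased area as a fraction of the image area.
        erase_area = area * rng.uniform(scale[0], scale[1])
        # `ratio` bounds the aspect ratio (assumed here: height / width).
        # Sampling in log space treats tall and wide rectangles evenly.
        aspect = math.exp(rng.uniform(log_lo, log_hi))
        h = int(round(math.sqrt(erase_area * aspect)))
        w = int(round(math.sqrt(erase_area / aspect)))
        if 0 < h < img_h and 0 < w < img_w:
            # randint is inclusive on both ends, so the box stays in bounds.
            top = rng.randint(0, img_h - h)
            left = rng.randint(0, img_w - w)
            return top, left, h, w
    # No rectangle fit within max_tries: signal "erase nothing".
    return None


if __name__ == "__main__":
    print(sample_erase_region(224, 224, rng=random.Random(0)))
```

Passing a seeded `random.Random` instance as `rng` keeps the sketch reproducible without touching global random state.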

--- a/torchvision/transforms/v2/_auto_augment.py
+++ b/torchvision/transforms/v2/_auto_augment.py
@@ -167,14 +167,16 @@ class AutoAugment(_AutoAugmentBase):

    .. betastatus:: AutoAugment transform

-    If the image is torch Tensor, it should be of type torch.uint8, and it is expected
+    This transformation works on images and videos only.
+
+    If the input is :class:`torch.Tensor`, it should be of type ``torch.uint8``, and it is expected
    to have [..., 1 or 3, H, W] shape, where ... means an arbitrary number of leading dimensions.
    If img is PIL Image, it is expected to be in mode "L" or "RGB".

    Args:
-        policy (AutoAugmentPolicy): Desired policy enum defined by
+        policy (AutoAugmentPolicy, optional): Desired policy enum defined by
            :class:`torchvision.transforms.autoaugment.AutoAugmentPolicy`. Default is ``AutoAugmentPolicy.IMAGENET``.
-        interpolation (InterpolationMode): Desired interpolation enum defined by
+        interpolation (InterpolationMode, optional): Desired interpolation enum defined by
            :class:`torchvision.transforms.InterpolationMode`. Default is ``InterpolationMode.NEAREST``.
            If input is Tensor, only ``InterpolationMode.NEAREST``, ``InterpolationMode.BILINEAR`` are supported.
        fill (sequence or number, optional): Pixel fill value for the area outside the transformed
@@ -342,15 +344,17 @@ class RandAugment(_AutoAugmentBase):

    .. betastatus:: RandAugment transform

-    If the image is torch Tensor, it should be of type torch.uint8, and it is expected
+    This transformation works on images and videos only.
+
+    If the input is :class:`torch.Tensor`, it should be of type ``torch.uint8``, and it is expected
    to have [..., 1 or 3, H, W] shape, where ... means an arbitrary number of leading dimensions.
    If img is PIL Image, it is expected to be in mode "L" or "RGB".

    Args:
-        num_ops (int): Number of augmentation transformations to apply sequentially.
-        magnitude (int): Magnitude for all the transformations.
-        num_magnitude_bins (int): The number of different magnitude values.
-        interpolation (InterpolationMode): Desired interpolation enum defined by
+        num_ops (int, optional): Number of augmentation transformations to apply sequentially.
+        magnitude (int, optional): Magnitude for all the transformations.
+        num_magnitude_bins (int, optional): The number of different magnitude values.
+        interpolation (InterpolationMode, optional): Desired interpolation enum defined by
            :class:`torchvision.transforms.InterpolationMode`. Default is ``InterpolationMode.NEAREST``.
            If input is Tensor, only ``InterpolationMode.NEAREST``, ``InterpolationMode.BILINEAR`` are supported.
        fill (sequence or number, optional): Pixel fill value for the area outside the transformed
@@ -423,13 +427,15 @@ class TrivialAugmentWide(_AutoAugmentBase):

    .. betastatus:: TrivialAugmentWide transform

-    If the image is torch Tensor, it should be of type torch.uint8, and it is expected
+    This transformation works on images and videos only.
+
+    If the input is :class:`torch.Tensor`, it should be of type ``torch.uint8``, and it is expected
    to have [..., 1 or 3, H, W] shape, where ... means an arbitrary number of leading dimensions.
    If img is PIL Image, it is expected to be in mode "L" or "RGB".

    Args:
-        num_magnitude_bins (int): The number of different magnitude values.
-        interpolation (InterpolationMode): Desired interpolation enum defined by
+        num_magnitude_bins (int, optional): The number of different magnitude values.
+        interpolation (InterpolationMode, optional): Desired interpolation enum defined by
            :class:`torchvision.transforms.InterpolationMode`. Default is ``InterpolationMode.NEAREST``.
            If input is Tensor, only ``InterpolationMode.NEAREST``, ``InterpolationMode.BILINEAR`` are supported.
        fill (sequence or number, optional): Pixel fill value for the area outside the transformed
@@ -492,18 +498,20 @@ class AugMix(_AutoAugmentBase):

    .. betastatus:: AugMix transform

-    If the image is torch Tensor, it should be of type torch.uint8, and it is expected
+    This transformation works on images and videos only.
+
+    If the input is :class:`torch.Tensor`, it should be of type ``torch.uint8``, and it is expected
    to have [..., 1 or 3, H, W] shape, where ... means an arbitrary number of leading dimensions.
    If img is PIL Image, it is expected to be in mode "L" or "RGB".

    Args:
-        severity (int): The severity of base augmentation operators. Default is ``3``.
-        mixture_width (int): The number of augmentation chains. Default is ``3``.
-        chain_depth (int): The depth of augmentation chains. A negative value denotes stochastic depth sampled from the interval [1, 3].
+        severity (int, optional): The severity of base augmentation operators. Default is ``3``.
+        mixture_width (int, optional): The number of augmentation chains. Default is ``3``.
+        chain_depth (int, optional): The depth of augmentation chains. A negative value denotes stochastic depth sampled from the interval [1, 3].
            Default is ``-1``.
-        alpha (float): The hyperparameter for the probability distributions. Default is ``1.0``.
-        all_ops (bool): Use all operations (including brightness, contrast, color and sharpness). Default is ``True``.
-        interpolation (InterpolationMode): Desired interpolation enum defined by
+        alpha (float, optional): The hyperparameter for the probability distributions. Default is ``1.0``.
+        all_ops (bool, optional): Use all operations (including brightness, contrast, color and sharpness). Default is ``True``.
+        interpolation (InterpolationMode, optional): Desired interpolation enum defined by
            :class:`torchvision.transforms.InterpolationMode`. Default is ``InterpolationMode.NEAREST``.
            If input is Tensor, only ``InterpolationMode.NEAREST``, ``InterpolationMode.BILINEAR`` are supported.
        fill (sequence or number, optional): Pixel fill value for the area outside the transformed
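
Two conventions repeated in the docstrings above can be sketched in plain Python: the expected `[..., 1 or 3, H, W]` input shape, and the AugMix rule that a negative `chain_depth` means a depth sampled from the interval [1, 3]. The helpers `check_image_shape` and `resolve_chain_depth` are hypothetical names introduced for this illustration and do not exist in torchvision.

```python
import random


def check_image_shape(shape):
    """Validate a shape against [..., 1 or 3, H, W]; return (C, H, W).

    Hypothetical helper, shown only to make the docstring's shape rule concrete.
    """
    if len(shape) < 3:
        raise ValueError(f"expected at least 3 dimensions, got {len(shape)}")
    channels, height, width = shape[-3], shape[-2], shape[-1]
    if channels not in (1, 3):
        raise ValueError(f"expected 1 or 3 channels, got {channels}")
    return channels, height, width


def resolve_chain_depth(chain_depth, rng=random):
    """Return the depth to use for one augmentation chain.

    Follows the docstring wording: a negative value requests a random
    depth from the interval [1, 3]; any other value is used as given.
    """
    if chain_depth < 0:
        return rng.randint(1, 3)  # inclusive on both ends
    return chain_depth


if __name__ == "__main__":
    print(check_image_shape((3, 224, 224)))        # single RGB image
    print(check_image_shape((8, 16, 1, 32, 32)))   # extra leading dimensions
    print(resolve_chain_depth(-1, rng=random.Random(0)))
```

The shape check reads only the last three dimensions, which is what lets an arbitrary number of leading dimensions (batch, time) pass through.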