Commit 23a67b3f authored by Nicolas Hug, committed by GitHub

Updates to RoI transforms docs (#3645)

* Edited roi transforms docs

* remove incorrect param desc

* Fixed confusion about feature maps according to comment
parent a89da92b
@@ -19,23 +19,24 @@ def ps_roi_align(
 mentioned in Light-Head R-CNN.
 Args:
-input (Tensor[N, C, H, W]): input tensor
+input (Tensor[N, C, H, W]): The input tensor, i.e. a batch with ``N`` elements. Each element
+contains ``C`` feature maps of dimensions ``H x W``.
 boxes (Tensor[K, 5] or List[Tensor[L, 4]]): the box coordinates in (x1, y1, x2, y2)
 format where the regions will be taken from.
 The coordinate must satisfy ``0 <= x1 < x2`` and ``0 <= y1 < y2``.
-If a single Tensor is passed,
-then the first column should contain the batch index. If a list of Tensors
-is passed, then each Tensor will correspond to the boxes for an element i
-in a batch
+If a single Tensor is passed, then the first column should
+contain the index of the corresponding element in the batch, i.e. a number in ``[0, N - 1]``.
+If a list of Tensors is passed, then each Tensor will correspond to the boxes for an element i
+in the batch.
-output_size (int or Tuple[int, int]): the size of the output after the cropping
-is performed, as (height, width)
+output_size (int or Tuple[int, int]): the size of the output (in bins or pixels) after the pooling
+is performed, as (height, width).
 spatial_scale (float): a scaling factor that maps the input coordinates to
 the box coordinates. Default: 1.0
 sampling_ratio (int): number of sampling points in the interpolation grid
-used to compute the output value of each pooled output bin. If > 0
-then exactly sampling_ratio x sampling_ratio grid points are used.
-If <= 0, then an adaptive number of grid points are used (computed as
-ceil(roi_width / pooled_w), and likewise for height). Default: -1
+used to compute the output value of each pooled output bin. If > 0,
+then exactly ``sampling_ratio x sampling_ratio`` sampling points per bin are used. If
+<= 0, then an adaptive number of grid points are used (computed as
+``ceil(roi_width / output_width)``, and likewise for height). Default: -1
 Returns:
 Tensor[K, C, output_size[0], output_size[1]]: The pooled RoIs
......
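The `sampling_ratio` rule described in the `ps_roi_align` docstring above can be sketched in plain Python. This is only an illustration of the documented formula, not torchvision's implementation: the helper name `sampling_grid` and its arguments are hypothetical, and `roi_height` / `roi_width` are assumed to be the box size already expressed in feature-map units.

```python
import math


def sampling_grid(roi_height, roi_width, output_size, sampling_ratio=-1):
    """Return (points_h, points_w), the sampling points used per output bin.

    Hypothetical helper that mirrors the docstring's wording only.
    """
    # An int output_size is assumed to mean a square (height, width) output.
    if isinstance(output_size, int):
        output_size = (output_size, output_size)
    output_height, output_width = output_size

    if sampling_ratio > 0:
        # Fixed grid: exactly sampling_ratio x sampling_ratio points per bin.
        return sampling_ratio, sampling_ratio

    # Adaptive grid: ceil(roi_size / output_size) along each axis.
    return (
        math.ceil(roi_height / output_height),
        math.ceil(roi_width / output_width),
    )


print(sampling_grid(14.0, 21.0, 7))                    # (2, 3)
print(sampling_grid(14.0, 21.0, 7, sampling_ratio=2))  # (2, 2)
```

With the default `sampling_ratio=-1`, larger boxes get denser grids, so the cost per bin grows with the box size; a positive value fixes the cost per bin.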
@@ -18,16 +18,17 @@ def ps_roi_pool(
 described in R-FCN
 Args:
-input (Tensor[N, C, H, W]): input tensor
+input (Tensor[N, C, H, W]): The input tensor, i.e. a batch with ``N`` elements. Each element
+contains ``C`` feature maps of dimensions ``H x W``.
 boxes (Tensor[K, 5] or List[Tensor[L, 4]]): the box coordinates in (x1, y1, x2, y2)
 format where the regions will be taken from.
 The coordinate must satisfy ``0 <= x1 < x2`` and ``0 <= y1 < y2``.
-If a single Tensor is passed,
-then the first column should contain the batch index. If a list of Tensors
-is passed, then each Tensor will correspond to the boxes for an element i
-in a batch
+If a single Tensor is passed, then the first column should
+contain the index of the corresponding element in the batch, i.e. a number in ``[0, N - 1]``.
+If a list of Tensors is passed, then each Tensor will correspond to the boxes for an element i
+in the batch.
-output_size (int or Tuple[int, int]): the size of the output after the cropping
-is performed, as (height, width)
+output_size (int or Tuple[int, int]): the size of the output (in bins or pixels) after the pooling
+is performed, as (height, width).
 spatial_scale (float): a scaling factor that maps the input coordinates to
 the box coordinates. Default: 1.0
......
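The two accepted `boxes` layouts in the docstring above carry the same information. The sketch below converts the list form (one `[L, 4]` group per batch element) into the single `[K, 5]` form whose first column is the batch index. It uses nested Python lists instead of tensors so that it runs without torch; the function name `boxes_list_to_rois` is hypothetical and is not part of torchvision's public API.

```python
def boxes_list_to_rois(boxes_per_image):
    """Flatten per-image [L, 4] box lists into one [K, 5] list of rows.

    Each output row is [batch_index, x1, y1, x2, y2], with batch_index
    in [0, N - 1] where N == len(boxes_per_image).
    """
    rois = []
    for batch_index, boxes in enumerate(boxes_per_image):
        for x1, y1, x2, y2 in boxes:
            # Constraint stated in the docstring.
            if not (0 <= x1 < x2 and 0 <= y1 < y2):
                raise ValueError(f"invalid box: {(x1, y1, x2, y2)}")
            rois.append([float(batch_index), x1, y1, x2, y2])
    return rois


# Batch of N = 2 elements: two boxes for element 0, one for element 1.
boxes = [
    [(0.0, 0.0, 4.0, 4.0), (2.0, 1.0, 6.0, 5.0)],
    [(1.0, 1.0, 3.0, 3.0)],
]
for row in boxes_list_to_rois(boxes):
    print(row)
# [0.0, 0.0, 0.0, 4.0, 4.0]
# [0.0, 2.0, 1.0, 6.0, 5.0]
# [1.0, 1.0, 1.0, 3.0, 3.0]
```

Here `K` is the total number of boxes across the batch (3 in this example), which is also the leading dimension of the pooled output.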
@@ -17,30 +17,31 @@ def roi_align(
 aligned: bool = False,
 ) -> Tensor:
 """
-Performs Region of Interest (RoI) Align operator described in Mask R-CNN
+Performs Region of Interest (RoI) Align operator with average pooling, as described in Mask R-CNN.
 Args:
-input (Tensor[N, C, H, W]): input tensor
+input (Tensor[N, C, H, W]): The input tensor, i.e. a batch with ``N`` elements. Each element
+contains ``C`` feature maps of dimensions ``H x W``.
 If the tensor is quantized, we expect a batch size of ``N == 1``.
 boxes (Tensor[K, 5] or List[Tensor[L, 4]]): the box coordinates in (x1, y1, x2, y2)
 format where the regions will be taken from.
 The coordinate must satisfy ``0 <= x1 < x2`` and ``0 <= y1 < y2``.
-If a single Tensor is passed,
-then the first column should contain the batch index. If a list of Tensors
-is passed, then each Tensor will correspond to the boxes for an element i
-in a batch
+If a single Tensor is passed, then the first column should
+contain the index of the corresponding element in the batch, i.e. a number in ``[0, N - 1]``.
+If a list of Tensors is passed, then each Tensor will correspond to the boxes for an element i
+in the batch.
-output_size (int or Tuple[int, int]): the size of the output after the cropping
-is performed, as (height, width)
+output_size (int or Tuple[int, int]): the size of the output (in bins or pixels) after the pooling
+is performed, as (height, width).
 spatial_scale (float): a scaling factor that maps the input coordinates to
 the box coordinates. Default: 1.0
 sampling_ratio (int): number of sampling points in the interpolation grid
 used to compute the output value of each pooled output bin. If > 0,
-then exactly sampling_ratio x sampling_ratio grid points are used. If
+then exactly ``sampling_ratio x sampling_ratio`` sampling points per bin are used. If
 <= 0, then an adaptive number of grid points are used (computed as
-ceil(roi_width / pooled_w), and likewise for height). Default: -1
+``ceil(roi_width / output_width)``, and likewise for height). Default: -1
 aligned (bool): If False, use the legacy implementation.
-If True, pixel shift it by -0.5 for align more perfectly about two neighboring pixel indices.
-This version in Detectron2
+If True, pixel shift the box coordinates it by -0.5 for a better alignment with the two
+neighboring pixel indices. This version is used in Detectron2
 Returns:
 Tensor[K, C, output_size[0], output_size[1]]: The pooled RoIs.
......
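To make the `aligned` and `spatial_scale` descriptions above concrete, the sketch below shows how one box could be mapped to feature-map coordinates. It assumes the box is first multiplied by `spatial_scale` and then shifted by -0.5 when `aligned=True`; that ordering is an assumption based on the docstring's wording, and the helper name `box_to_feature_coords` is hypothetical.

```python
def box_to_feature_coords(box, spatial_scale=1.0, aligned=False):
    """Map (x1, y1, x2, y2) from image space to feature-map space.

    Assumption: scale first, then apply the half-pixel shift if aligned.
    """
    offset = 0.5 if aligned else 0.0
    return tuple(coord * spatial_scale - offset for coord in box)


# A feature map 16x smaller than the image gives spatial_scale = 1 / 16.
box = (32.0, 48.0, 96.0, 112.0)
print(box_to_feature_coords(box, spatial_scale=1 / 16))
# (2.0, 3.0, 6.0, 7.0)
print(box_to_feature_coords(box, spatial_scale=1 / 16, aligned=True))
# (1.5, 2.5, 5.5, 6.5)
```

The shift leaves the box width and height unchanged (4.0 x 4.0 here); only the position of the sampling points moves by half a pixel.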
@@ -18,14 +18,15 @@ def roi_pool(
 Performs Region of Interest (RoI) Pool operator described in Fast R-CNN
 Args:
-input (Tensor[N, C, H, W]): input tensor
+input (Tensor[N, C, H, W]): The input tensor, i.e. a batch with ``N`` elements. Each element
+contains ``C`` feature maps of dimensions ``H x W``.
 boxes (Tensor[K, 5] or List[Tensor[L, 4]]): the box coordinates in (x1, y1, x2, y2)
 format where the regions will be taken from.
 The coordinate must satisfy ``0 <= x1 < x2`` and ``0 <= y1 < y2``.
-If a single Tensor is passed,
-then the first column should contain the batch index. If a list of Tensors
-is passed, then each Tensor will correspond to the boxes for an element i
-in a batch
+If a single Tensor is passed, then the first column should
+contain the index of the corresponding element in the batch, i.e. a number in ``[0, N - 1]``.
+If a list of Tensors is passed, then each Tensor will correspond to the boxes for an element i
+in the batch.
 output_size (int or Tuple[int, int]): the size of the output after the cropping
 is performed, as (height, width)
 spatial_scale (float): a scaling factor that maps the input coordinates to
......