"...text-generation-inference.git" did not exist on "d6af14b25b6c7335b8e9d241fa42cc4e132b8b35"
Unverified Commit 23a67b3f authored by Nicolas Hug's avatar Nicolas Hug Committed by GitHub
Browse files

Updates to RoI transforms docs (#3645)

* Edited roi transforms docs

* remove incorrect param desc

* Fixed confusion  about feature maps according to comment
parent a89da92b
......@@ -19,23 +19,24 @@ def ps_roi_align(
mentioned in Light-Head R-CNN.
Args:
input (Tensor[N, C, H, W]): input tensor
input (Tensor[N, C, H, W]): The input tensor, i.e. a batch with ``N`` elements. Each element
contains ``C`` feature maps of dimensions ``H x W``.
boxes (Tensor[K, 5] or List[Tensor[L, 4]]): the box coordinates in (x1, y1, x2, y2)
format where the regions will be taken from.
The coordinate must satisfy ``0 <= x1 < x2`` and ``0 <= y1 < y2``.
If a single Tensor is passed,
then the first column should contain the batch index. If a list of Tensors
is passed, then each Tensor will correspond to the boxes for an element i
in a batch
output_size (int or Tuple[int, int]): the size of the output after the cropping
is performed, as (height, width)
If a single Tensor is passed, then the first column should
contain the index of the corresponding element in the batch, i.e. a number in ``[0, N - 1]``.
If a list of Tensors is passed, then each Tensor will correspond to the boxes for an element i
in the batch.
output_size (int or Tuple[int, int]): the size of the output (in bins or pixels) after the pooling
is performed, as (height, width).
spatial_scale (float): a scaling factor that maps the input coordinates to
the box coordinates. Default: 1.0
sampling_ratio (int): number of sampling points in the interpolation grid
used to compute the output value of each pooled output bin. If > 0
then exactly sampling_ratio x sampling_ratio grid points are used.
If <= 0, then an adaptive number of grid points are used (computed as
ceil(roi_width / pooled_w), and likewise for height). Default: -1
used to compute the output value of each pooled output bin. If > 0,
then exactly ``sampling_ratio x sampling_ratio`` sampling points per bin are used. If
<= 0, then an adaptive number of grid points are used (computed as
``ceil(roi_width / output_width)``, and likewise for height). Default: -1
Returns:
Tensor[K, C, output_size[0], output_size[1]]: The pooled RoIs
......
......@@ -18,16 +18,17 @@ def ps_roi_pool(
described in R-FCN
Args:
input (Tensor[N, C, H, W]): input tensor
input (Tensor[N, C, H, W]): The input tensor, i.e. a batch with ``N`` elements. Each element
contains ``C`` feature maps of dimensions ``H x W``.
boxes (Tensor[K, 5] or List[Tensor[L, 4]]): the box coordinates in (x1, y1, x2, y2)
format where the regions will be taken from.
The coordinate must satisfy ``0 <= x1 < x2`` and ``0 <= y1 < y2``.
If a single Tensor is passed,
then the first column should contain the batch index. If a list of Tensors
is passed, then each Tensor will correspond to the boxes for an element i
in a batch
output_size (int or Tuple[int, int]): the size of the output after the cropping
is performed, as (height, width)
If a single Tensor is passed, then the first column should
contain the index of the corresponding element in the batch, i.e. a number in ``[0, N - 1]``.
If a list of Tensors is passed, then each Tensor will correspond to the boxes for an element i
in the batch.
output_size (int or Tuple[int, int]): the size of the output (in bins or pixels) after the pooling
is performed, as (height, width).
spatial_scale (float): a scaling factor that maps the input coordinates to
the box coordinates. Default: 1.0
......
......@@ -17,30 +17,31 @@ def roi_align(
aligned: bool = False,
) -> Tensor:
"""
Performs Region of Interest (RoI) Align operator described in Mask R-CNN
Performs Region of Interest (RoI) Align operator with average pooling, as described in Mask R-CNN.
Args:
input (Tensor[N, C, H, W]): input tensor
input (Tensor[N, C, H, W]): The input tensor, i.e. a batch with ``N`` elements. Each element
contains ``C`` feature maps of dimensions ``H x W``.
If the tensor is quantized, we expect a batch size of ``N == 1``.
boxes (Tensor[K, 5] or List[Tensor[L, 4]]): the box coordinates in (x1, y1, x2, y2)
format where the regions will be taken from.
The coordinate must satisfy ``0 <= x1 < x2`` and ``0 <= y1 < y2``.
If a single Tensor is passed,
then the first column should contain the batch index. If a list of Tensors
is passed, then each Tensor will correspond to the boxes for an element i
in a batch
output_size (int or Tuple[int, int]): the size of the output after the cropping
is performed, as (height, width)
If a single Tensor is passed, then the first column should
contain the index of the corresponding element in the batch, i.e. a number in ``[0, N - 1]``.
If a list of Tensors is passed, then each Tensor will correspond to the boxes for an element i
in the batch.
output_size (int or Tuple[int, int]): the size of the output (in bins or pixels) after the pooling
is performed, as (height, width).
spatial_scale (float): a scaling factor that maps the input coordinates to
the box coordinates. Default: 1.0
sampling_ratio (int): number of sampling points in the interpolation grid
used to compute the output value of each pooled output bin. If > 0,
then exactly sampling_ratio x sampling_ratio grid points are used. If
then exactly ``sampling_ratio x sampling_ratio`` sampling points per bin are used. If
<= 0, then an adaptive number of grid points are used (computed as
ceil(roi_width / pooled_w), and likewise for height). Default: -1
``ceil(roi_width / output_width)``, and likewise for height). Default: -1
aligned (bool): If False, use the legacy implementation.
If True, pixel shift it by -0.5 for align more perfectly about two neighboring pixel indices.
This version in Detectron2
If True, pixel shift the box coordinates it by -0.5 for a better alignment with the two
neighboring pixel indices. This version is used in Detectron2
Returns:
Tensor[K, C, output_size[0], output_size[1]]: The pooled RoIs.
......
......@@ -18,14 +18,15 @@ def roi_pool(
Performs Region of Interest (RoI) Pool operator described in Fast R-CNN
Args:
input (Tensor[N, C, H, W]): input tensor
input (Tensor[N, C, H, W]): The input tensor, i.e. a batch with ``N`` elements. Each element
contains ``C`` feature maps of dimensions ``H x W``.
boxes (Tensor[K, 5] or List[Tensor[L, 4]]): the box coordinates in (x1, y1, x2, y2)
format where the regions will be taken from.
The coordinate must satisfy ``0 <= x1 < x2`` and ``0 <= y1 < y2``.
If a single Tensor is passed,
then the first column should contain the batch index. If a list of Tensors
is passed, then each Tensor will correspond to the boxes for an element i
in a batch
If a single Tensor is passed, then the first column should
contain the index of the corresponding element in the batch, i.e. a number in ``[0, N - 1]``.
If a list of Tensors is passed, then each Tensor will correspond to the boxes for an element i
in the batch.
output_size (int or Tuple[int, int]): the size of the output after the cropping
is performed, as (height, width)
spatial_scale (float): a scaling factor that maps the input coordinates to
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment