"vscode:/vscode.git/clone" did not exist on "16fba4c095843821e544b15d17a610c5e2541bce"
Unverified Commit 947ae1dc authored by Nicolas Hug, committed by GitHub

Update docstrings of detection models regarding resizing strategy (#8385)

parent c9eab681
@@ -73,8 +73,12 @@ class FasterRCNN(GeneralizedRCNN):
         The backbone should return a single Tensor or an OrderedDict[Tensor].
     num_classes (int): number of output classes of the model (including the background).
         If box_predictor is specified, num_classes should be None.
-    min_size (int): minimum size of the image to be rescaled before feeding it to the backbone
-    max_size (int): maximum size of the image to be rescaled before feeding it to the backbone
+    min_size (int): Images are rescaled before feeding them to the backbone:
+        we attempt to preserve the aspect ratio and scale the shorter edge
+        to ``min_size``. If the resulting longer edge exceeds ``max_size``,
+        then downscale so that the longer edge does not exceed ``max_size``.
+        This may result in the shorter edge being lower than ``min_size``.
+    max_size (int): See ``min_size``.
     image_mean (Tuple[float, float, float]): mean values used for input normalization.
         They are generally the mean values of the dataset on which the backbone has been trained
         on
......
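For reference, below is a minimal sketch (not the actual torchvision transform code) of the resizing rule the updated docstrings describe: the shorter edge is scaled towards ``min_size`` while preserving the aspect ratio, and the scale is capped so that the longer edge never exceeds ``max_size``. The function name and the 800/1333 defaults are illustrative assumptions only.

def compute_scaled_size(height, width, min_size=800, max_size=1333):
    # Sketch only: scale the shorter edge to min_size, preserving aspect ratio.
    short_edge, long_edge = min(height, width), max(height, width)
    scale = min_size / short_edge
    # If that would push the longer edge past max_size, cap the scale instead;
    # the shorter edge may then end up below min_size.
    if long_edge * scale > max_size:
        scale = max_size / long_edge
    return round(height * scale), round(width * scale)

# Example: a 500x2000 image. Scaling the shorter edge to 800 would make the
# longer edge 3200 > 1333, so the scale is capped and the shorter edge
# lands at 333, below min_size.
print(compute_scaled_size(500, 2000))  # (333, 1333)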
@@ -299,8 +299,12 @@ class FCOS(nn.Module):
         channels that each feature map has (and it should be the same for all feature maps).
         The backbone should return a single Tensor or an OrderedDict[Tensor].
     num_classes (int): number of output classes of the model (including the background).
-    min_size (int): minimum size of the image to be rescaled before feeding it to the backbone
-    max_size (int): maximum size of the image to be rescaled before feeding it to the backbone
+    min_size (int): Images are rescaled before feeding them to the backbone:
+        we attempt to preserve the aspect ratio and scale the shorter edge
+        to ``min_size``. If the resulting longer edge exceeds ``max_size``,
+        then downscale so that the longer edge does not exceed ``max_size``.
+        This may result in the shorter edge being lower than ``min_size``.
+    max_size (int): See ``min_size``.
     image_mean (Tuple[float, float, float]): mean values used for input normalization.
         They are generally the mean values of the dataset on which the backbone has been trained
         on
......
@@ -60,8 +60,12 @@ class KeypointRCNN(FasterRCNN):
         The backbone should return a single Tensor or an OrderedDict[Tensor].
     num_classes (int): number of output classes of the model (including the background).
         If box_predictor is specified, num_classes should be None.
-    min_size (int): minimum size of the image to be rescaled before feeding it to the backbone
-    max_size (int): maximum size of the image to be rescaled before feeding it to the backbone
+    min_size (int): Images are rescaled before feeding them to the backbone:
+        we attempt to preserve the aspect ratio and scale the shorter edge
+        to ``min_size``. If the resulting longer edge exceeds ``max_size``,
+        then downscale so that the longer edge does not exceed ``max_size``.
+        This may result in the shorter edge being lower than ``min_size``.
+    max_size (int): See ``min_size``.
     image_mean (Tuple[float, float, float]): mean values used for input normalization.
         They are generally the mean values of the dataset on which the backbone has been trained
         on
......
@@ -61,8 +61,12 @@ class MaskRCNN(FasterRCNN):
         The backbone should return a single Tensor or an OrderedDict[Tensor].
     num_classes (int): number of output classes of the model (including the background).
         If box_predictor is specified, num_classes should be None.
-    min_size (int): minimum size of the image to be rescaled before feeding it to the backbone
-    max_size (int): maximum size of the image to be rescaled before feeding it to the backbone
+    min_size (int): Images are rescaled before feeding them to the backbone:
+        we attempt to preserve the aspect ratio and scale the shorter edge
+        to ``min_size``. If the resulting longer edge exceeds ``max_size``,
+        then downscale so that the longer edge does not exceed ``max_size``.
+        This may result in the shorter edge being lower than ``min_size``.
+    max_size (int): See ``min_size``.
     image_mean (Tuple[float, float, float]): mean values used for input normalization.
         They are generally the mean values of the dataset on which the backbone has been trained
         on
......
@@ -352,8 +352,12 @@ class RetinaNet(nn.Module):
         channels that each feature map has (and it should be the same for all feature maps).
         The backbone should return a single Tensor or an OrderedDict[Tensor].
     num_classes (int): number of output classes of the model (including the background).
-    min_size (int): minimum size of the image to be rescaled before feeding it to the backbone
-    max_size (int): maximum size of the image to be rescaled before feeding it to the backbone
+    min_size (int): Images are rescaled before feeding them to the backbone:
+        we attempt to preserve the aspect ratio and scale the shorter edge
+        to ``min_size``. If the resulting longer edge exceeds ``max_size``,
+        then downscale so that the longer edge does not exceed ``max_size``.
+        This may result in the shorter edge being lower than ``min_size``.
+    max_size (int): See ``min_size``.
     image_mean (Tuple[float, float, float]): mean values used for input normalization.
         They are generally the mean values of the dataset on which the backbone has been trained
         on
......
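As a hedged usage sketch (assuming a torchvision version that exposes the ``weights`` argument), ``min_size`` and ``max_size`` can be passed when constructing a detection model; the 800/1333 values below are the usual defaults, written out only for illustration.

import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

# Build the model without pretrained weights; min_size/max_size configure the
# internal resizing transform described in the docstrings above.
model = fasterrcnn_resnet50_fpn(weights=None, min_size=800, max_size=1333)
model.eval()

# A 500x2000 input: the transform caps the longer edge at max_size, so the
# shorter edge ends up below min_size, as the updated docstring explains.
with torch.no_grad():
    predictions = model([torch.rand(3, 500, 2000)])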