Document ResNet architecture tweak (#5977)

* To resolve issue #5964: Add note for resnet architecture
* Update resnet.py
* Update resnet.py
* Update resnet.rst
* Fix stylings
* Add the same notes on model builders
* Improve description
* Apply the change everywhere
* Remove trailing space

Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>

Commit 37665a0b, authored May 20, 2022 by puhuk, committed by GitHub May 20, 2022

Showing with 23 additions and 0 deletions

docs/source/models/resnet.rst +5 -0

torchvision/models/resnet.py +18 -0

--- a/docs/source/models/resnet.rst
+++ b/docs/source/models/resnet.rst
@@ -6,6 +6,11 @@ ResNet
 The ResNet model is based on the `Deep Residual Learning for Image Recognition
 <https://arxiv.org/abs/1512.03385>`_ paper.
+.. note::
+    The TorchVision bottleneck places the downsampling stride at the 3x3 convolution
+    (the second convolution), while the original paper places it at the first 1x1
+    convolution. This variant improves accuracy and is known as `ResNet V1.5
+    <https://ngc.nvidia.com/catalog/model-scripts/nvidia:resnet_50_v1_5_for_pytorch>`_.
 Model builders
 --------------
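
The note above is easier to follow with the spatial sizes written out. The following is a minimal sketch in plain Python, not TorchVision code: the helper names (conv_out_size, positions_read, bottleneck_trace) are hypothetical, and the 56-pixel input and stride of 2 are illustrative values. It traces the feature-map size through the three convolutions of a bottleneck for both stride placements, and counts how many input positions along one axis a strided convolution actually reads. A strided 1x1 convolution skipping every other position is one commonly cited explanation for the accuracy difference; the sketch only demonstrates the arithmetic, not the accuracy claim.

```python
def conv_out_size(size, kernel, stride, padding):
    """Output size along one axis for a convolution with dilation 1."""
    return (size + 2 * padding - kernel) // stride + 1


def positions_read(size, kernel, stride, padding):
    """Input indices (one axis) covered by at least one kernel window."""
    touched = set()
    for out in range(conv_out_size(size, kernel, stride, padding)):
        start = out * stride - padding
        for k in range(kernel):
            idx = start + k
            if 0 <= idx < size:  # ignore positions that fall in the padding
                touched.add(idx)
    return touched


def bottleneck_trace(size, stride, stride_on_3x3):
    """Sizes after conv1 (1x1), conv2 (3x3, padding 1) and conv3 (1x1).

    stride_on_3x3=True mimics the placement described in the note (V1.5);
    stride_on_3x3=False mimics the placement in the original paper.
    """
    s1 = 1 if stride_on_3x3 else stride
    s2 = stride if stride_on_3x3 else 1
    after_conv1 = conv_out_size(size, 1, s1, 0)
    after_conv2 = conv_out_size(after_conv1, 3, s2, 1)
    after_conv3 = conv_out_size(after_conv2, 1, 1, 0)
    return [after_conv1, after_conv2, after_conv3]


paper_sizes = bottleneck_trace(56, 2, stride_on_3x3=False)
v15_sizes = bottleneck_trace(56, 2, stride_on_3x3=True)
print("original paper:", paper_sizes)
print("V1.5:          ", v15_sizes)

# Positions along one axis that the downsampling convolution reads.
print("1x1, stride 2:", len(positions_read(56, 1, 2, 0)), "of 56")
print("3x3, stride 2:", len(positions_read(56, 3, 2, 1)), "of 56")
```

Both placements end at the same output size, so the two variants are interchangeable in terms of shapes. The difference is that in the V1.5 placement the first 1x1 convolution still runs at full resolution, and the downsampling 3x3 convolution covers every input position.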

--- a/torchvision/models/resnet.py
+++ b/torchvision/models/resnet.py
@@ -699,6 +699,12 @@ def resnet34(*, weights: Optional[ResNet34_Weights] = None, progress: bool = Tru
 def resnet50(*, weights: Optional[ResNet50_Weights] = None, progress: bool = True, **kwargs: Any) -> ResNet:
    """ResNet-50 from `Deep Residual Learning for Image Recognition <https://arxiv.org/pdf/1512.03385.pdf>`__.
+    .. note::
+       The TorchVision bottleneck places the downsampling stride at the 3x3 convolution
+       (the second convolution), while the original paper places it at the first 1x1
+       convolution. This variant improves accuracy and is known as `ResNet V1.5
+       <https://ngc.nvidia.com/catalog/model-scripts/nvidia:resnet_50_v1_5_for_pytorch>`_.
    Args:
        weights (:class:`~torchvision.models.ResNet50_Weights`, optional): The
            pretrained weights to use. See
@@ -724,6 +730,12 @@ def resnet50(*, weights: Optional[ResNet50_Weights] = None, progress: bool = Tru
 def resnet101(*, weights: Optional[ResNet101_Weights] = None, progress: bool = True, **kwargs: Any) -> ResNet:
    """ResNet-101 from `Deep Residual Learning for Image Recognition <https://arxiv.org/pdf/1512.03385.pdf>`__.
+    .. note::
+       The TorchVision bottleneck places the downsampling stride at the 3x3 convolution
+       (the second convolution), while the original paper places it at the first 1x1
+       convolution. This variant improves accuracy and is known as `ResNet V1.5
+       <https://ngc.nvidia.com/catalog/model-scripts/nvidia:resnet_50_v1_5_for_pytorch>`_.
    Args:
        weights (:class:`~torchvision.models.ResNet101_Weights`, optional): The
            pretrained weights to use. See
@@ -749,6 +761,12 @@ def resnet101(*, weights: Optional[ResNet101_Weights] = None, progress: bool = T
 def resnet152(*, weights: Optional[ResNet152_Weights] = None, progress: bool = True, **kwargs: Any) -> ResNet:
    """ResNet-152 from `Deep Residual Learning for Image Recognition <https://arxiv.org/pdf/1512.03385.pdf>`__.
+    .. note::
+       The TorchVision bottleneck places the downsampling stride at the 3x3 convolution
+       (the second convolution), while the original paper places it at the first 1x1
+       convolution. This variant improves accuracy and is known as `ResNet V1.5
+       <https://ngc.nvidia.com/catalog/model-scripts/nvidia:resnet_50_v1_5_for_pytorch>`_.
    Args:
        weights (:class:`~torchvision.models.ResNet152_Weights`, optional): The
            pretrained weights to use. See