Unverified commit 7dc5e5bd, authored by Philip Meier, committed by GitHub

Fix typos and grammar errors (#7065)

* fix typos throughout the code base

* fix grammar

* revert formatting changes to gallery

* revert 'an uXX'

* remove 'number of the best'
parent ed2a0adb
@@ -69,7 +69,7 @@ If you plan to modify the code or documentation, please follow the steps below:
 For more details about pull requests,
 please read [GitHub's guides](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request).
-If you would like to contribute a new model, please see [here](#New-model).
+If you would like to contribute a new model, please see [here](#New-architecture-or-improved-model-weights).
 If you would like to contribute a new dataset, please see [here](#New-dataset).

@@ -198,7 +198,7 @@ it in an issue as, most likely, it will not be accepted.
 ### Pull Request
 If all previous checks (flake8, mypy, unit tests) are passing, please send a PR. Submitted PR will pass other tests on
-different operation systems, python versions and hardwares.
+different operating systems, python versions and hardware.
 For more details about pull requests workflow,
 please read [GitHub's guides](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request).
@@ -20,13 +20,13 @@ So, before starting any work and submitting a PR there are a few critical things
 ### 1. Preparation work
-- Start by looking into this [issue](https://github.com/pytorch/vision/issues/2707) in order to have an idea of the models that are being considered, express your willingness to add a new model and discuss with the community whether or not this model should be included in TorchVision. It is very important at this stage to make sure that there is an agreement on the value of having this model in TorchVision and there is no one else already working on it.
+- Start by looking into this [issue](https://github.com/pytorch/vision/issues/2707) in order to have an idea of the models that are being considered, express your willingness to add a new model and discuss with the community whether this model should be included in TorchVision. It is very important at this stage to make sure that there is an agreement on the value of having this model in TorchVision and there is no one else already working on it.
 - If the decision is to include the new model, then please create a new ticket which will be used for all design and implementation discussions prior to the PR. One of the TorchVision maintainers will reach out at this stage and this will be your POC from this point onwards in order to provide support, guidance and regular feedback.
 ### 2. Implement the model
-Please take a look at existing models in TorchVision to get familiar with the idioms. Also please look at recent contributions for new models. If in doubt about any design decisions you can ask for feedback on the issue created in step 1. Example of things to take into account:
+Please take a look at existing models in TorchVision to get familiar with the idioms. Also, please look at recent contributions for new models. If in doubt about any design decisions you can ask for feedback on the issue created in step 1. Example of things to take into account:
 - The implementation should be as close as possible to the canonical implementation/paper
 - The PR must include the code implementation, documentation and tests

@@ -34,7 +34,7 @@ Please take a look at existing models in TorchVision to get familiar with the id
 - The weights need to reproduce closely the results of the paper in terms of accuracy, even though the final weights to be deployed will be those trained by the TorchVision maintainers
 - The PR description should include commands/configuration used to train the model, so that the TorchVision maintainers can easily run them to verify the implementation and generate the final model to be released
 - Make sure we re-use existing components as much as possible (inheritance)
-- New primitives (transforms, losses, etc) can be added if necessary, but the final location will be determined after discussion with the dedicated maintainer
+- New primitives (transforms, losses, etc.) can be added if necessary, but the final location will be determined after discussion with the dedicated maintainer
 - Please take a look at the detailed [implementation and documentation guidelines](https://github.com/pytorch/vision/issues/5319) for a fine grain list of things not to be missed
 ### 3. Train the model with reference scripts
@@ -331,7 +331,7 @@ def inject_weight_metadata(app, what, name, obj, options, lines):
     ]
     if obj.__doc__ != "An enumeration.":
-        # We only show the custom enum doc if it was overriden. The default one from Python is "An enumeration"
+        # We only show the custom enum doc if it was overridden. The default one from Python is "An enumeration"
         lines.append("")
         lines.append(obj.__doc__)
@@ -14,7 +14,7 @@ and is based on `One weird trick for parallelizing convolutional neural networks
 Model builders
 --------------
-The following model builders can be used to instanciate an AlexNet model, with or
+The following model builders can be used to instantiate an AlexNet model, with or
 without pre-trained weights. All the model builders internally rely on the
 ``torchvision.models.alexnet.AlexNet`` base class. Please refer to the `source
 code
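As a side note for readers of this hunk: the "model builder" pattern described above is the multi-weight API from recent torchvision releases. A minimal sketch, not part of the diff, of instantiating AlexNet with and without pre-trained weights:

```python
import torch
from torchvision.models import alexnet, AlexNet_Weights

# Build AlexNet with pre-trained ImageNet weights (downloaded on first use).
model = alexnet(weights=AlexNet_Weights.IMAGENET1K_V1)
model.eval()

# Passing weights=None gives a randomly initialized model instead.
scratch = alexnet(weights=None)

# Dummy forward pass on a single 224x224 RGB image.
with torch.no_grad():
    logits = model(torch.rand(1, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 1000])
```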
@@ -10,7 +10,7 @@ paper.
 Model builders
 --------------
-The following model builders can be used to instanciate an EfficientNet model, with or
+The following model builders can be used to instantiate an EfficientNet model, with or
 without pre-trained weights. All the model builders internally rely on the
 ``torchvision.models.efficientnet.EfficientNet`` base class. Please refer to the `source
 code

@@ -10,7 +10,7 @@ paper.
 Model builders
 --------------
-The following model builders can be used to instanciate an EfficientNetV2 model, with or
+The following model builders can be used to instantiate an EfficientNetV2 model, with or
 without pre-trained weights. All the model builders internally rely on the
 ``torchvision.models.efficientnet.EfficientNet`` base class. Please refer to the `source
 code

@@ -10,7 +10,7 @@ paper.
 Model builders
 --------------
-The following model builders can be used to instanciate a GoogLeNet model, with or
+The following model builders can be used to instantiate a GoogLeNet model, with or
 without pre-trained weights. All the model builders internally rely on the
 ``torchvision.models.googlenet.GoogLeNet`` base class. Please refer to the `source
 code
@@ -10,7 +10,7 @@ paper.
 Model builders
 --------------
-The following model builders can be used to instanciate a quantized GoogLeNet
+The following model builders can be used to instantiate a quantized GoogLeNet
 model, with or without pre-trained weights. All the model builders internally
 rely on the ``torchvision.models.quantization.googlenet.QuantizableGoogLeNet``
 base class. Please refer to the `source code
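The quantized builders referenced here follow the same pattern as the float ones but add a `quantize` flag. A hedged sketch (the exact weight enum name is from memory and worth double-checking against the docs):

```python
import torch
from torchvision.models.quantization import googlenet, GoogLeNet_QuantizedWeights

# quantize=True returns an int8 model intended for CPU inference;
# quantize=False returns the float QuantizableGoogLeNet instead.
model = googlenet(weights=GoogLeNet_QuantizedWeights.IMAGENET1K_FBGEMM_V1, quantize=True)
model.eval()

with torch.no_grad():
    logits = model(torch.rand(1, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 1000])
```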
@@ -10,7 +10,7 @@ Computer Vision <https://arxiv.org/abs/1512.00567>`__ paper.
 Model builders
 --------------
-The following model builders can be used to instanciate an InceptionV3 model, with or
+The following model builders can be used to instantiate an InceptionV3 model, with or
 without pre-trained weights. All the model builders internally rely on the
 ``torchvision.models.inception.Inception3`` base class. Please refer to the `source
 code <https://github.com/pytorch/vision/blob/main/torchvision/models/inception.py>`_ for

@@ -10,7 +10,7 @@ Computer Vision <https://arxiv.org/abs/1512.00567>`__ paper.
 Model builders
 --------------
-The following model builders can be used to instanciate a quantized Inception
+The following model builders can be used to instantiate a quantized Inception
 model, with or without pre-trained weights. All the model builders internally
 rely on the ``torchvision.models.quantization.inception.QuantizableInception3``
 base class. Please refer to the `source code

@@ -11,7 +11,7 @@ Search for Mobile <https://arxiv.org/pdf/1807.11626.pdf>`__ paper.
 Model builders
 --------------
-The following model builders can be used to instanciate an MNASNet model.
+The following model builders can be used to instantiate an MNASNet model.
 All the model builders internally rely on the
 ``torchvision.models.mnasnet.MNASNet`` base class. Please refer to the `source
 code
@@ -12,7 +12,7 @@ The SSD model is based on the `SSD: Single Shot MultiBox Detector
 Model builders
 --------------
-The following model builders can be used to instanciate a SSD model, with or
+The following model builders can be used to instantiate a SSD model, with or
 without pre-trained weights. All the model builders internally rely on the
 ``torchvision.models.detection.SSD`` base class. Please refer to the `source
 code
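Detection builders such as the SSD ones differ from the classification builders in their input/output convention. A minimal sketch, assuming the `ssd300_vgg16` builder and its `SSD300_VGG16_Weights` enum:

```python
import torch
from torchvision.models.detection import ssd300_vgg16, SSD300_VGG16_Weights

model = ssd300_vgg16(weights=SSD300_VGG16_Weights.DEFAULT)
model.eval()

# Detection models take a list of 3-channel float images (possibly of different
# sizes) and return one dict per image with 'boxes', 'labels' and 'scores'.
images = [torch.rand(3, 480, 640)]
with torch.no_grad():
    predictions = model(images)
print(predictions[0]["boxes"].shape, predictions[0]["scores"].shape)
```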
@@ -4,7 +4,7 @@ Utils
 =====
 The ``torchvision.utils`` module contains various utilities, mostly :ref:`for
-vizualization <sphx_glr_auto_examples_plot_visualization_utils.py>`.
+visualization <sphx_glr_auto_examples_plot_visualization_utils.py>`.
 .. currentmodule:: torchvision.utils
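For context on what `torchvision.utils` provides, a small hedged example of the drawing utilities (the image, box and output filename below are purely illustrative):

```python
import torch
from torchvision.utils import draw_bounding_boxes, save_image

# draw_bounding_boxes expects a uint8 image of shape (C, H, W) and boxes
# given as (xmin, ymin, xmax, ymax).
image = torch.zeros((3, 240, 320), dtype=torch.uint8)
boxes = torch.tensor([[40.0, 60.0, 200.0, 180.0]])
annotated = draw_bounding_boxes(image, boxes, labels=["example"], colors="red", width=3)

# save_image works on float tensors in [0, 1].
save_image(annotated.float() / 255, "annotated.png")
```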
@@ -10,7 +10,7 @@ them using JIT compilation.
 Prior to v0.8.0, transforms in torchvision have traditionally been PIL-centric
 and presented multiple limitations due to that. Now, since v0.8.0, transforms
-implementations are Tensor and PIL compatible and we can achieve the following
+implementations are Tensor and PIL compatible, and we can achieve the following
 new features:
 - transform multi-band torch tensor images (with more than 3-4 channels)
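To make the "multi-band tensor" point above concrete, a minimal sketch of running the tensor-compatible transforms on a 6-channel image (no PIL involved):

```python
import torch
import torchvision.transforms as T

# A 6-band image stored as a plain channels-first tensor.
multiband = torch.rand(6, 128, 128)

# Geometric transforms such as Resize and CenterCrop work for any channel count.
pipeline = T.Compose([
    T.Resize(64),
    T.CenterCrop(56),
])
out = pipeline(multiband)
print(out.shape)  # torch.Size([6, 56, 56])
```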
@@ -188,7 +188,7 @@ show(dogs_with_masks)
 # We can plot more than one mask per image! Remember that the model returned as
 # many masks as there are classes. Let's ask the same query as above, but this
 # time for *all* classes, not just the dog class: "For each pixel and each class
-# C, is class C the most most likely class?"
+# C, is class C the most likely class?"
 #
 # This one is a bit more involved, so we'll first show how to do it with a
 # single image, and then we'll generalize to the batch
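The query in the comment above boils down to an argmax over the class dimension, compared against every class index. A sketch assuming `normalized_masks` has shape `(batch, num_classes, H, W)` as in the surrounding tutorial (the shapes below are made up):

```python
import torch

# Stand-in for the tutorial's normalized_masks: softmax scores per class and pixel.
normalized_masks = torch.rand(2, 21, 160, 240).softmax(dim=1)
num_classes = normalized_masks.shape[1]

# For each image, class and pixel: True where that class is the argmax.
class_dim = 1
all_classes_masks = (
    normalized_masks.argmax(class_dim) == torch.arange(num_classes)[:, None, None, None]
)
# The broadcast yields (num_classes, batch, H, W); swap to (batch, num_classes, H, W).
all_classes_masks = all_classes_masks.swapaxes(0, 1)
print(all_classes_masks.shape)  # torch.Size([2, 21, 160, 240])
```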
@@ -317,7 +317,7 @@ show(draw_segmentation_masks(dog1_int, dog1_bool_masks, alpha=0.9))
 #####################################
 # The model seems to have properly detected the dog, but it also confused trees
-# with people. Looking more closely at the scores will help us plotting more
+# with people. Looking more closely at the scores will help us plot more
 # relevant masks:
 print(dog1_output['scores'])

@@ -343,7 +343,7 @@ show(dogs_with_masks)
 #####################################
 # The two 'people' masks in the first image where not selected because they have
-# a lower score than the score threshold. Similarly in the second image, the
+# a lower score than the score threshold. Similarly, in the second image, the
 # instance with class 15 (which corresponds to 'bench') was not selected.
 #####################################
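The score-based filtering discussed in these two hunks amounts to a boolean mask over instances followed by binarizing their soft masks. A hedged sketch with made-up tensors standing in for the tutorial's model output:

```python
import torch
from torchvision.utils import draw_segmentation_masks

image_int = torch.zeros((3, 320, 480), dtype=torch.uint8)  # placeholder uint8 image
output = {
    "masks": torch.rand(4, 1, 320, 480),                   # soft instance masks in [0, 1]
    "scores": torch.tensor([0.98, 0.91, 0.45, 0.12]),
}

score_threshold = 0.75
proba_threshold = 0.5

# Keep only confident instances, then binarize their soft masks.
keep = output["scores"] > score_threshold
bool_masks = output["masks"][keep] > proba_threshold

# draw_segmentation_masks expects boolean masks of shape (num_masks, H, W).
result = draw_segmentation_masks(image_int, bool_masks.squeeze(1), alpha=0.9)
print(result.shape)  # torch.Size([3, 320, 480])
```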
@@ -298,7 +298,7 @@ Here `$MODEL` is one of `googlenet`, `inception_v3`, `resnet18`, `resnet50`, `re
 ### Quantized ShuffleNet V2
-Here are commands that we use to quantized the `shufflenet_v2_x1_5` and `shufflenet_v2_x2_0` models.
+Here are commands that we use to quantize the `shufflenet_v2_x1_5` and `shufflenet_v2_x2_0` models.
 ```
 # For shufflenet_v2_x1_5
 python train_quantization.py --device='cpu' --post-training-quantize --backend='fbgemm' \
@@ -314,11 +314,11 @@ def main(args):
     model_ema = None
     if args.model_ema:
-        # Decay adjustment that aims to keep the decay independent from other hyper-parameters originally proposed at:
+        # Decay adjustment that aims to keep the decay independent of other hyper-parameters originally proposed at:
         # https://github.com/facebookresearch/pycls/blob/f8cd9627/pycls/core/net.py#L123
         #
         # total_ema_updates = (Dataset_size / n_GPUs) * epochs / (batch_size_per_gpu * EMA_steps)
-        # We consider constant = Dataset_size for a given dataset/setup and ommit it. Thus:
+        # We consider constant = Dataset_size for a given dataset/setup and omit it. Thus:
         # adjust = 1 / total_ema_updates ~= n_GPUs * batch_size_per_gpu * EMA_steps / epochs
         adjust = args.world_size * args.batch_size * args.model_ema_steps / args.epochs
         alpha = 1.0 - args.model_ema_decay
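The comment block in this hunk is easier to follow with numbers plugged in. A worked sketch with made-up hyper-parameters (not the reference defaults):

```python
# Hypothetical values, chosen only to illustrate the arithmetic.
world_size = 8           # n_GPUs
batch_size = 32          # batch size per GPU
model_ema_steps = 32     # EMA update every 32 optimizer steps
epochs = 600
model_ema_decay = 0.99998

# adjust = 1 / total_ema_updates ~= n_GPUs * batch_size_per_gpu * EMA_steps / epochs
adjust = world_size * batch_size * model_ema_steps / epochs  # ~13.65
alpha = 1.0 - model_ema_decay                                # 2e-05

# From memory, the surrounding code then clamps the product at 1.0 and uses it
# as the effective per-update EMA weight: decay = 1 - min(1.0, alpha * adjust).
decay = 1.0 - min(1.0, alpha * adjust)                       # ~0.99973 here
print(adjust, alpha, decay)
```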
@@ -365,12 +365,12 @@ def store_model_weights(model, checkpoint_path, checkpoint_key="model", strict=T
     checkpoint_path = os.path.abspath(checkpoint_path)
     output_dir = os.path.dirname(checkpoint_path)
-    # Deep copy to avoid side-effects on the model object.
+    # Deep copy to avoid side effects on the model object.
     model = copy.deepcopy(model)
     checkpoint = torch.load(checkpoint_path, map_location="cpu")
     # Load the weights to the model to validate that everything works
-    # and remove unnecessary weights (such as auxiliaries, etc)
+    # and remove unnecessary weights (such as auxiliaries, etc.)
     if checkpoint_key == "model_ema":
         del checkpoint[checkpoint_key]["n_averaged"]
         torch.nn.modules.utils.consume_prefix_in_state_dict_if_present(checkpoint[checkpoint_key], "module.")
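A hedged usage sketch for `store_model_weights`, based only on the signature visible in the hunk header; the import path and checkpoint filename are assumptions, since this page does not show file names:

```python
import torchvision
from utils import store_model_weights  # assumed module name for this helper

# Rebuild the architecture that produced the checkpoint; weights stay random here.
model = torchvision.models.resnet50(weights=None)

# Extract clean, publishable weights from a hypothetical training checkpoint.
# For EMA-trained checkpoints, checkpoint_key="model_ema" selects the averaged
# weights, which is the branch the diff above touches.
store_model_weights(model, "./checkpoint.pth", checkpoint_key="model_ema")
```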
@@ -12,8 +12,8 @@ A ratio of **88-6-6** was used in order to train a baseline weight set. We provi
 Both used 8 A100 GPUs and a batch size of 2 (so effective batch size is 16). The
 rest of the hyper-parameters loosely follow the recipe from https://github.com/megvii-research/CREStereo.
 The original recipe trains for **300000** updates (or steps) on the dataset mixture. We modify the learning rate
-schedule to one that starts decaying the weight much sooner. Throughout experiments we found that this reduces overfitting
-during evaluation time and gradient clip help stabilize the loss during a pre-mature learning rate change.
+schedule to one that starts decaying the weight much sooner. Throughout the experiments we found that this reduces
+overfitting during evaluation time and gradient clip help stabilize the loss during a pre-mature learning rate change.
 ```
 torchrun --nproc_per_node 8 --nnodes 1 train.py \

@@ -31,7 +31,7 @@ torchrun --nproc_per_node 8 --nnodes 1 train.py \
 --clip-grad-norm 1.0 \
 ```
-We employ a multi-set fine-tuning stage where we uniformly sample from multiple datasets. Given hat some of these datasets have extremely large images (``2048x2048`` or more) we opt for a very aggresive scale-range ``[0.2 - 0.8]`` such that as much of the original frame composition is captured inside the ``384x512`` crop.
+We employ a multi-set fine-tuning stage where we uniformly sample from multiple datasets. Given hat some of these datasets have extremely large images (``2048x2048`` or more) we opt for a very aggressive scale-range ``[0.2 - 0.8]`` such that as much of the original frame composition is captured inside the ``384x512`` crop.
 ```
 torchrun --nproc_per_node 8 --nnodes 1 train.py \

@@ -59,7 +59,7 @@ Evaluating the base weights
 torchrun --nproc_per_node 1 --nnodes 1 cascade_evaluation.py --dataset middlebury2014-train --batch-size 1 --dataset-root $dataset_root --model crestereo_base --weights CREStereo_Base_Weights.CRESTEREO_ETH_MBL_V1
 ```
-This should give an **mae of about 1.416** on the train set of `Middlebury2014`. Results may vary slightly depending on the batch size and the number of GPUs. For the most accurate resuts use 1 GPU and `--batch-size 1`. The created log file should look like this, where the first key is the number of cascades and the nested key is the number of recursive iterations:
+This should give an **mae of about 1.416** on the train set of `Middlebury2014`. Results may vary slightly depending on the batch size and the number of GPUs. For the most accurate results use 1 GPU and `--batch-size 1`. The created log file should look like this, where the first key is the number of cascades and the nested key is the number of recursive iterations:
 ```
 Dataset: middlebury2014-train @size: [384, 512]:

@@ -135,7 +135,7 @@ Dataset: middlebury2014-train @size: [384, 512]:
 # Concerns when training
-We encourage users to be aware of the **aspect-ratio** and **disparity scale** they are targetting when doing any sort of training or fine-tuning. The model is highly sensitive to these two factors, as a consequence with naive multi-set fine-tuning one can achieve `0.2 mae` relatively fast. We recommend that users pay close attention to how they **balance dataset sizing** when training such networks.
+We encourage users to be aware of the **aspect-ratio** and **disparity scale** they are targeting when doing any sort of training or fine-tuning. The model is highly sensitive to these two factors, as a consequence of naive multi-set fine-tuning one can achieve `0.2 mae` relatively fast. We recommend that users pay close attention to how they **balance dataset sizing** when training such networks.
 Ideally, dataset scaling should be trated at an individual level and a thorough **EDA** of the disparity distribution in random crops at the desired training / inference size should be performed prior to any large compute investments.

@@ -146,14 +146,14 @@ We encourage users to be aware of the **aspect-ratio** and **disparity scale** t
 ![Disparity1](assets/disparity-domain-drift.jpg)
-From left to right (`left_image`, `right_image`, `valid_mask`, `valid_mask & ground_truth`, `prediction`). **Darker is further away, lighter is closer**. In the case of `Sintel` which is more closely aligned to the original distribution of `CREStereo` we notice that the model accurately predicts the background scale whereas in the case of `Middlebury2014` it cannot correcly estimate the continous disparity. Notice that the frame composition is similar for both examples. The blue skybox in the `Sintel` scene behaves similarly to the `Middlebury` black background. However, because the `Middlebury` samples comes from an extremly large scene the crop size of `384x512` does not correctly capture the general training distribution.
+From left to right (`left_image`, `right_image`, `valid_mask`, `valid_mask & ground_truth`, `prediction`). **Darker is further away, lighter is closer**. In the case of `Sintel` which is more closely aligned to the original distribution of `CREStereo` we notice that the model accurately predicts the background scale whereas in the case of `Middlebury2014` it cannot correctly estimate the continuous disparity. Notice that the frame composition is similar for both examples. The blue skybox in the `Sintel` scene behaves similarly to the `Middlebury` black background. However, because the `Middlebury` samples comes from an extremely large scene the crop size of `384x512` does not correctly capture the general training distribution.
 ##### Sample B
-The top row contains a scene from `Sceneflow` using the `Monkaa` split whilst the bottom row is a scene from `Middlebury`. This sample exhibits the same issues when it comes to **background estimation**. Given the exagerated size of the `Middlebury` samples the model **colapses the smooth background** of the sample to what it considers to be a mean background disparity value.
+The top row contains a scene from `Sceneflow` using the `Monkaa` split whilst the bottom row is a scene from `Middlebury`. This sample exhibits the same issues when it comes to **background estimation**. Given the exaggerated size of the `Middlebury` samples the model **colapses the smooth background** of the sample to what it considers to be a mean background disparity value.
 ![Disparity2](assets/disparity-background-mode-collapse.jpg)
@@ -9,7 +9,7 @@ from torch.nn import functional as F
 from train import make_eval_loader
 from utils.metrics import AVAILABLE_METRICS
-from vizualization import make_prediction_image_side_to_side
+from visualization import make_prediction_image_side_to_side
 def get_args_parser(add_help=True):

@@ -113,7 +113,7 @@ def _evaluate(
     *,
     padder_mode,
     print_freq=10,
-    writter=None,
+    writer=None,
     step=None,
     iterations=10,
     cascades=1,

@@ -180,10 +180,10 @@ def _evaluate(
             "the dataset is not divisible by the batch size. Try lowering the batch size for more accurate results."
         )
-    if writter is not None and args.rank == 0:
+    if writer is not None and args.rank == 0:
         for meter_name, meter_value in logger.meters.items():
             scalar_name = f"{meter_name} {header}"
-            writter.add_scalar(scalar_name, meter_value.avg, step)
+            writer.add_scalar(scalar_name, meter_value.avg, step)
     logger.synchronize_between_processes()
     print(header, logger)

@@ -192,7 +192,7 @@ def _evaluate(
     return logger_metrics
-def evaluate(model, loader, args, writter=None, step=None):
+def evaluate(model, loader, args, writer=None, step=None):
     os.makedirs(args.img_folder, exist_ok=True)
     checkpoint_name = os.path.basename(args.checkpoint) or args.weights
     image_checkpoint_folder = os.path.join(args.img_folder, checkpoint_name)

@@ -215,7 +215,7 @@ def evaluate(model, loader, args, writter=None, step=None):
         padder_mode=args.padder_type,
         header=f"{args.dataset} evaluation@ size:{args.eval_size} n_cascades:{n_cascades} n_iters:{n_iters}",
         batch_size=args.batch_size,
-        writter=writter,
+        writer=writer,
         step=step,
         iterations=n_iters,
         cascades=n_cascades,

@@ -271,7 +271,7 @@ def load_checkpoint(args):
     model = torchvision.prototype.models.depth.stereo.__dict__[args.model](weights=None)
     model.load_state_dict(checkpoint)
-    # set the appropiate devices
+    # set the appropriate devices
     if args.distributed and args.device == "cpu":
         raise ValueError("The device must be cuda if we want to run in distributed mode using torchrun")
     device = torch.device(args.device)