Commit c732df65 authored by limm
push v0.1.3 version commit bd2ea47
parent 5b3792fc
../../.github/CONTRIBUTING.md
Notes
======================================
.. toctree::
:maxdepth: 2
benchmarks
compatibility
contributing
changelog
termcolor
numpy
tqdm
docutils==0.16
Sphinx==3.0.0
recommonmark==0.6.0
sphinx_rtd_theme
mock
matplotlib
termcolor
yacs
tabulate
cloudpickle
Pillow==6.2.2
future
requests
six
git+git://github.com/facebookresearch/fvcore.git
https://download.pytorch.org/whl/cpu/torch-1.5.0%2Bcpu-cp37-cp37m-linux_x86_64.whl
https://download.pytorch.org/whl/cpu/torchvision-0.6.0%2Bcpu-cp37-cp37m-linux_x86_64.whl
# Read the docs:
The latest documentation built from this directory is available at [detectron2.readthedocs.io](https://detectron2.readthedocs.io/).
Documents in this directory are not meant to be read on github.
../../datasets/README.md
# Configs
Detectron2 provides a key-value based config system that can be
used to obtain standard, common behaviors.
Detectron2's config system uses YAML and [yacs](https://github.com/rbgirshick/yacs).
In addition to the [basic operations](../modules/config.html#detectron2.config.CfgNode)
that access and update a config, we provide the following extra functionalities:
1. The config can have `_BASE_: base.yaml` field, which will load a base config first.
Values in the base config will be overwritten in sub-configs, if there are any conflicts.
We provide several base configs for standard model architectures.
2. We provide config versioning, for backward compatibility.
If your config file is versioned with a config line like `VERSION: 2`,
detectron2 will still recognize it even if we change some keys in the future.
"Config" is a very limited abstraction.
We do not expect all features in detectron2 to be available through configs.
If you need something that's not available in the config space,
please write code using detectron2's API.
### Basic Usage
Some basic usage of the `CfgNode` object is shown here. See more in [documentation](../modules/config.html#detectron2.config.CfgNode).
```python
from detectron2.config import get_cfg
cfg = get_cfg() # obtain detectron2's default config
cfg.xxx = yyy # add new configs for your own custom components
cfg.merge_from_file("my_cfg.yaml") # load values from a file
cfg.merge_from_list(["MODEL.WEIGHTS", "weights.pth"]) # can also load values from a list of str
print(cfg.dump()) # print formatted configs
```
Many builtin tools in detectron2 accept command-line config overrides:
Key-value pairs provided in the command line will overwrite the existing values in the config file.
For example, [demo.py](../../demo/demo.py) can be used with
```
./demo.py --config-file config.yaml [--other-options] \
--opts MODEL.WEIGHTS /path/to/weights INPUT.MIN_SIZE_TEST 1000
```
To see a list of available configs in detectron2 and what they mean,
check [Config References](../modules/config.html#config-references).
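For reference, here is a minimal sketch (not the actual `demo.py` implementation) of how a script can apply such command-line overrides with `merge_from_list`; the argument names mirror the example above:
```python
import argparse
from detectron2.config import get_cfg

parser = argparse.ArgumentParser()
parser.add_argument("--config-file", default="config.yaml")
parser.add_argument("--opts", nargs=argparse.REMAINDER, default=[])
args = parser.parse_args()

cfg = get_cfg()
cfg.merge_from_file(args.config_file)  # values from the YAML file
cfg.merge_from_list(args.opts)         # e.g. ["MODEL.WEIGHTS", "/path/to/weights"]
cfg.freeze()
```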
### Best Practice with Configs
1. Treat the configs you write as "code": avoid copying them or duplicating them; use `_BASE_`
to share common parts between configs.
2. Keep the configs you write simple: don't include keys that do not affect the experimental setting.
3. Keep a version number in your configs (or the base config), e.g., `VERSION: 2`,
for backward compatibility.
We print a warning when reading a config without version number.
The official configs do not include version number because they are meant to
be always up-to-date.
# Use Custom Dataloaders
## How the Existing Dataloader Works
Detectron2 contains a builtin data loading pipeline.
It's good to understand how it works, in case you need to write a custom one.
Detectron2 provides two functions
[build_detection_{train,test}_loader](../modules/data.html#detectron2.data.build_detection_train_loader)
that create a default data loader from a given config.
Here is how `build_detection_{train,test}_loader` works:
1. It takes the name of a registered dataset (e.g., "coco_2017_train") and loads a `list[dict]` representing the dataset items
in a lightweight, canonical format. These dataset items are not yet ready to be used by the model (e.g., images are
not loaded into memory, random augmentations have not been applied, etc.).
Details about the dataset format and dataset registration can be found in
[datasets](./datasets.md).
2. Each dict in this list is mapped by a function ("mapper"):
* Users can customize this mapping function by specifying the "mapper" argument in
`build_detection_{train,test}_loader`. The default mapper is [DatasetMapper](../modules/data.html#detectron2.data.DatasetMapper).
* The output format of such function can be arbitrary, as long as it is accepted by the consumer of this data loader (usually the model).
The outputs of the default mapper, after batching, follow the default model input format documented in
[Use Models](./models.html#model-input-format).
* The role of the mapper is to transform the lightweight, canonical representation of a dataset item into a format
that is ready for the model to consume (including, e.g., read images, perform random data augmentation and convert to torch Tensors).
If you would like to perform custom transformations to data, you often want a custom mapper.
3. The outputs of the mapper are batched (simply into a list).
4. This batched data is the output of the data loader. Typically, it's also the input of
`model.forward()`.
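As a quick illustration of the pipeline above, the following sketch (assuming `cfg` is a loaded config and `model` an already-built model) builds the default train loader and feeds one batch to the model:
```python
from detectron2.data import build_detection_train_loader

data_loader = build_detection_train_loader(cfg)  # uses the default DatasetMapper
for batched_inputs in data_loader:
    # batched_inputs is a list[dict], one dict per image, in the model input format
    losses = model(batched_inputs)
    break  # the train loader is an infinite stream, so stop after one batch here
```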
## Write a Custom Dataloader
Using a different "mapper" with `build_detection_{train,test}_loader(mapper=)` works for most use cases
of custom data loading.
For example, if you want to resize all images to a fixed size for Mask R-CNN training, write this:
```python
import copy
import torch

from detectron2.data import build_detection_train_loader
from detectron2.data import transforms as T
from detectron2.data import detection_utils as utils

def mapper(dataset_dict):
    # Implement a mapper, similar to the default DatasetMapper, but with your own customizations
    dataset_dict = copy.deepcopy(dataset_dict)  # it will be modified by code below
    image = utils.read_image(dataset_dict["file_name"], format="BGR")
    image, transforms = T.apply_transform_gens([T.Resize((800, 800))], image)
    dataset_dict["image"] = torch.as_tensor(image.transpose(2, 0, 1).astype("float32"))

    annos = [
        utils.transform_instance_annotations(obj, transforms, image.shape[:2])
        for obj in dataset_dict.pop("annotations")
        if obj.get("iscrowd", 0) == 0
    ]
    instances = utils.annotations_to_instances(annos, image.shape[:2])
    dataset_dict["instances"] = utils.filter_empty_instances(instances)
    return dataset_dict

data_loader = build_detection_train_loader(cfg, mapper=mapper)
# use this dataloader instead of the default
```
Refer to [API documentation of detectron2.data](../modules/data) for details.
If you want to change not only the mapper (e.g., to write different sampling or batching logic),
you can write your own data loader. The data loader is simply a
python iterator that produces [the format](./models.md) your model accepts.
You can implement it using any tools you like.
## Use a Custom Dataloader
If you use [DefaultTrainer](../modules/engine.html#detectron2.engine.defaults.DefaultTrainer),
you can overwrite its `build_{train,test}_loader` method to use your own dataloader.
See the [densepose dataloader](../../projects/DensePose/train_net.py)
for an example.
If you write your own training loop, you can plug in your data loader easily.
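For the `DefaultTrainer` route, a minimal sketch of such a subclass might look like the following (`my_mapper` is a placeholder for a custom mapper such as the one defined earlier):
```python
from detectron2.data import build_detection_train_loader
from detectron2.engine import DefaultTrainer

class TrainerWithCustomLoader(DefaultTrainer):
    @classmethod
    def build_train_loader(cls, cfg):
        # plug the custom mapper into the default loader-building logic
        return build_detection_train_loader(cfg, mapper=my_mapper)
```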
# Use Custom Datasets
Datasets that have builtin support in detectron2 are listed in [datasets](../../datasets).
If you want to use a custom dataset while also reusing detectron2's data loaders,
you will need to
1. __Register__ your dataset (i.e., tell detectron2 how to obtain your dataset).
2. Optionally, __register metadata__ for your dataset.
Next, we explain the above two concepts in detail.
The [Colab tutorial](https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5)
has a live example of how to register and train on a dataset of custom formats.
### Register a Dataset
To let detectron2 know how to obtain a dataset named "my_dataset", you will implement
a function that returns the items in your dataset and then tell detectron2 about this
function:
```python
from detectron2.data import DatasetCatalog

def my_dataset_function():
    ...  # return a list[dict] in the format described below

DatasetCatalog.register("my_dataset", my_dataset_function)
```
Here, the snippet associates a dataset "my_dataset" with a function that returns the data.
The registration stays effective until the process exits.
The function can process data from its original format into either one of the following:
1. Detectron2's standard dataset dict, described below. This will work with many other builtin
features in detectron2, so it's recommended to use it when it's sufficient for your task.
2. Your custom dataset dict. You can also return arbitrary dicts in your own format,
such as adding extra keys for new tasks.
Then you will need to handle them properly downstream as well.
See below for more details.
#### Standard Dataset Dicts
For standard tasks
(instance detection, instance/semantic/panoptic segmentation, keypoint detection),
we load the original dataset into `list[dict]` with a specification similar to COCO's json annotations.
This is our standard representation for a dataset.
Each dict contains information about one image.
The dict may have the following fields,
and the required fields vary based on what the dataloader or the task needs (see more below).
+ `file_name`: the full path to the image file. Will apply rotation and flipping if the image has such exif information.
+ `height`, `width`: integer. The shape of image.
+ `image_id` (str or int): a unique id that identifies this image. Used
during evaluation to identify the images, but a dataset may use it for different purposes.
+ `annotations` (list[dict]): each dict corresponds to annotations of one instance
in this image. Required by instance detection/segmentation or keypoint detection tasks.
Images with empty `annotations` will by default be removed from training,
but can be included using `DATALOADER.FILTER_EMPTY_ANNOTATIONS`.
Each dict contains the following keys, of which `bbox`, `bbox_mode`, and `category_id` are required:
+ `bbox` (list[float]): list of 4 numbers representing the bounding box of the instance.
+ `bbox_mode` (int): the format of bbox.
It must be a member of
[structures.BoxMode](../modules/structures.html#detectron2.structures.BoxMode).
Currently supports: `BoxMode.XYXY_ABS`, `BoxMode.XYWH_ABS`.
+ `category_id` (int): an integer in the range [0, num_categories) representing the category label.
The value num_categories is reserved to represent the "background" category, if applicable.
+ `segmentation` (list[list[float]] or dict): the segmentation mask of the instance.
+ If `list[list[float]]`, it represents a list of polygons, one for each connected component
of the object. Each `list[float]` is one simple polygon in the format of `[x1, y1, ..., xn, yn]`.
The Xs and Ys are either relative coordinates in [0, 1], or absolute coordinates,
depending on whether "bbox_mode" is relative.
+ If `dict`, it represents the per-pixel segmentation mask in COCO's RLE format. The dict should have
keys "size" and "counts". You can convert a uint8 segmentation mask of 0s and 1s into
RLE format by `pycocotools.mask.encode(np.asarray(mask, order="F"))`.
+ `keypoints` (list[float]): in the format of [x1, y1, v1,..., xn, yn, vn].
v[i] means the [visibility](http://cocodataset.org/#format-data) of this keypoint.
`n` must be equal to the number of keypoint categories.
The Xs and Ys are either relative coordinates in [0, 1], or absolute coordinates,
depending on whether "bbox_mode" is relative.
Note that the coordinate annotations in COCO format are integers in range [0, H-1 or W-1].
By default, detectron2 adds 0.5 to absolute keypoint coordinates to convert them from discrete
pixel indices to floating point coordinates.
+ `iscrowd`: 0 (default) or 1. Whether this instance is labeled as COCO's "crowd
region". Don't include this field if you don't know what it means.
+ `sem_seg_file_name`: the full path to the ground truth semantic segmentation file.
Required by semantic segmentation task.
It should be an image whose pixel values are integer labels.
Fast R-CNN (with precomputed proposals) is rarely used today.
To train a Fast R-CNN, the following extra keys are needed:
+ `proposal_boxes` (array): 2D numpy array with shape (K, 4) representing K precomputed proposal boxes for this image.
+ `proposal_objectness_logits` (array): numpy array with shape (K, ), which corresponds to the objectness
logits of proposals in 'proposal_boxes'.
+ `proposal_bbox_mode` (int): the format of the precomputed proposal bbox.
It must be a member of
[structures.BoxMode](../modules/structures.html#detectron2.structures.BoxMode).
Default is `BoxMode.XYXY_ABS`.
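To make the specification above concrete, here is a hypothetical single entry of such a `list[dict]` (paths and values are made up for illustration):
```python
from detectron2.structures import BoxMode

record = {
    "file_name": "/path/to/images/0001.jpg",
    "height": 480,
    "width": 640,
    "image_id": 1,
    "annotations": [
        {
            "bbox": [10.0, 20.0, 200.0, 150.0],
            "bbox_mode": BoxMode.XYWH_ABS,
            "category_id": 0,
            "segmentation": [[10.0, 20.0, 210.0, 20.0, 210.0, 170.0, 10.0, 170.0]],
            "iscrowd": 0,
        }
    ],
}
```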
#### Custom Dataset Dicts for New Tasks
In the `list[dict]` that your dataset function returns, the dictionary can also have arbitrary custom data.
This will be useful for a new task that needs extra information not supported
by the standard dataset dicts. In this case, you need to make sure the downstream code can handle your data
correctly. Usually this requires writing a new `mapper` for the dataloader (see [Use Custom Dataloaders](./data_loading.md)).
When designing a custom format, note that all dicts are stored in memory
(sometimes serialized and with multiple copies).
To save memory, each dict is meant to contain small but sufficient information
about each sample, such as file names and annotations.
Loading full samples typically happens in the data loader.
For attributes shared among the entire dataset, use `Metadata` (see below).
To avoid extra memory, do not save such information repeatedly for each sample.
### "Metadata" for Datasets
Each dataset is associated with some metadata, accessible through
`MetadataCatalog.get(dataset_name).some_metadata`.
Metadata is a key-value mapping that contains information that's shared among
the entire dataset, and usually is used to interpret what's in the dataset, e.g.,
names of classes, colors of classes, root of files, etc.
This information will be useful for augmentation, evaluation, visualization, logging, etc.
The structure of metadata depends on what is needed by the corresponding downstream code.
If you register a new dataset through `DatasetCatalog.register`,
you may also want to add its corresponding metadata through
`MetadataCatalog.get(dataset_name).some_key = some_value`, to enable any features that need the metadata.
You can do it like this (using the metadata key "thing_classes" as an example):
```python
from detectron2.data import MetadataCatalog
MetadataCatalog.get("my_dataset").thing_classes = ["person", "dog"]
```
Here is a list of metadata keys that are used by builtin features in detectron2.
If you add your own dataset without these metadata, some features may be
unavailable to you:
* `thing_classes` (list[str]): Used by all instance detection/segmentation tasks.
A list of names for each instance/thing category.
If you load a COCO format dataset, it will be automatically set by the function `load_coco_json`.
* `thing_colors` (list[tuple(r, g, b)]): Pre-defined color (in [0, 255]) for each thing category.
Used for visualization. If not given, random colors are used.
* `stuff_classes` (list[str]): Used by semantic and panoptic segmentation tasks.
A list of names for each stuff category.
* `stuff_colors` (list[tuple(r, g, b)]): Pre-defined color (in [0, 255]) for each stuff category.
Used for visualization. If not given, random colors are used.
* `keypoint_names` (list[str]): Used by keypoint localization. A list of names for each keypoint.
* `keypoint_flip_map` (list[tuple[str]]): Used by the keypoint localization task. A list of pairs of names,
where each pair are the two keypoints that should be flipped if the image is
flipped horizontally during augmentation.
* `keypoint_connection_rules`: list[tuple(str, str, (r, g, b))]. Each tuple specifies a pair of keypoints
that are connected and the color to use for the line between them when visualized.
Some additional metadata that are specific to the evaluation of certain datasets (e.g. COCO):
* `thing_dataset_id_to_contiguous_id` (dict[int->int]): Used by all instance detection/segmentation tasks in the COCO format.
A mapping from instance class ids in the dataset to contiguous ids in range [0, #class).
Will be automatically set by the function `load_coco_json`.
* `stuff_dataset_id_to_contiguous_id` (dict[int->int]): Used when generating prediction json files for
semantic/panoptic segmentation.
A mapping from semantic segmentation class ids in the dataset
to contiguous ids in [0, num_categories). It is useful for evaluation only.
* `json_file`: The COCO annotation json file. Used by COCO evaluation for COCO-format datasets.
* `panoptic_root`, `panoptic_json`: Used by panoptic evaluation.
* `evaluator_type`: Used by the builtin main training script to select
evaluator. Don't use it in a new training script.
You can just provide the [DatasetEvaluator](../modules/evaluation.html#detectron2.evaluation.DatasetEvaluator)
for your dataset directly in your main script.
NOTE: For background on the concept of "thing" and "stuff", see
[On Seeing Stuff: The Perception of Materials by Humans and Machines](http://persci.mit.edu/pub_pdfs/adelson_spie_01.pdf).
In detectron2, the term "thing" is used for instance-level tasks,
and "stuff" is used for semantic segmentation tasks.
Both are used in panoptic segmentation.
### Register a COCO Format Dataset
If your dataset is already a json file in the COCO format,
the dataset and its associated metadata can be registered easily with:
```python
from detectron2.data.datasets import register_coco_instances
register_coco_instances("my_dataset", {}, "json_annotation.json", "path/to/image/dir")
```
If your dataset is in COCO format but with extra custom per-instance annotations,
the [load_coco_json](../modules/data.html#detectron2.data.datasets.load_coco_json)
function might be useful.
### Update the Config for New Datasets
Once you've registered the dataset, you can use the name of the dataset (e.g., "my_dataset" in
the example above) in `cfg.DATASETS.{TRAIN,TEST}`.
There are other configs you might want to change to train or evaluate on new datasets:
* `MODEL.ROI_HEADS.NUM_CLASSES` and `MODEL.RETINANET.NUM_CLASSES` are the number of thing classes
for R-CNN and RetinaNet models, respectively.
* `MODEL.ROI_KEYPOINT_HEAD.NUM_KEYPOINTS` sets the number of keypoints for Keypoint R-CNN.
You'll also need to set [Keypoint OKS](http://cocodataset.org/#keypoints-eval)
with `TEST.KEYPOINT_OKS_SIGMAS` for evaluation.
* `MODEL.SEM_SEG_HEAD.NUM_CLASSES` sets the number of stuff classes for Semantic FPN & Panoptic FPN.
* If you're training Fast R-CNN (with precomputed proposals), `DATASETS.PROPOSAL_FILES_{TRAIN,TEST}`
need to match the datasets. The format of proposal files is documented
[here](../modules/data.html#detectron2.data.load_proposals_into_dataset).
New models
(e.g. [TensorMask](../../projects/TensorMask),
[PointRend](../../projects/PointRend))
often have similar configs of their own that need to be changed as well.
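As a sketch, updating the config for a hypothetical 2-class dataset registered as "my_dataset_train"/"my_dataset_val" could look like:
```python
cfg.DATASETS.TRAIN = ("my_dataset_train",)
cfg.DATASETS.TEST = ("my_dataset_val",)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2   # for R-CNN models
cfg.MODEL.RETINANET.NUM_CLASSES = 2   # only relevant for RetinaNet models
```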
# Deployment
## Caffe2 Deployment
We currently support converting a detectron2 model to Caffe2 format through ONNX.
The converted Caffe2 model is able to run without detectron2 dependency in either Python or C++.
It has a runtime optimized for CPU & mobile inference, but not for GPU inference.
Caffe2 conversion requires PyTorch ≥ 1.4 and ONNX ≥ 1.6.
### Coverage
It supports the 3 most common meta architectures: `GeneralizedRCNN`, `RetinaNet`, `PanopticFPN`,
and most official models under these 3 meta architectures.
Users' custom extensions under these architectures (added through registration) are supported
as long as they do not contain control flow or operators not available in Caffe2 (e.g. deformable convolution).
For example, custom backbones and heads are often supported out of the box.
### Usage
The conversion APIs are documented at [the API documentation](../modules/export).
We provide a tool, `caffe2_converter.py`, as an example that uses
these APIs to convert a standard model.
To convert an official Mask R-CNN trained on COCO, first
[prepare the COCO dataset](../../datasets/), then pick the model from [Model Zoo](../../MODEL_ZOO.md), and run:
```
cd tools/deploy/ && ./caffe2_converter.py --config-file ../../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
--output ./caffe2_model --run-eval \
MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl \
MODEL.DEVICE cpu
```
Note that:
1. The conversion needs valid sample inputs & weights to trace the model. That's why the script requires the dataset.
You can modify the script to obtain sample inputs in other ways.
2. With the `--run-eval` flag, it will evaluate the converted model to verify its accuracy.
The accuracy is typically slightly different (within 0.1 AP) from PyTorch due to
numerical precision differences between the implementations.
It's recommended to always verify the accuracy in case your custom model is not supported by the
conversion.
The converted model is saved in the specified `caffe2_model/` directory. Two files, `model.pb`
and `model_init.pb`, which contain the network structure and network parameters, are necessary for deployment.
These files can then be loaded in C++ or Python using Caffe2's APIs.
The script also generates a `model.svg` file that contains a visualization of the network.
You can also load `model.pb` into tools such as [netron](https://github.com/lutzroeder/netron) to visualize it.
### Use the model in C++/Python
The model can be loaded in C++. An example [caffe2_mask_rcnn.cpp](../../tools/deploy/) is given,
which performs CPU/GPU inference using `COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x`.
The C++ example needs to be built with:
* PyTorch with caffe2 inside
* gflags, glog, opencv
* protobuf headers that match the version of your caffe2
* MKL headers if caffe2 is built with MKL
The following can compile the example inside [official detectron2 docker](../../docker/):
```
sudo apt update && sudo apt install libgflags-dev libgoogle-glog-dev libopencv-dev
pip install mkl-include
wget https://github.com/protocolbuffers/protobuf/releases/download/v3.6.1/protobuf-cpp-3.6.1.tar.gz
tar xf protobuf-cpp-3.6.1.tar.gz
export CPATH=$(readlink -f ./protobuf-3.6.1/src/):$HOME/.local/include
export CMAKE_PREFIX_PATH=$HOME/.local/lib/python3.6/site-packages/torch/
mkdir build && cd build
cmake -DTORCH_CUDA_ARCH_LIST=$TORCH_CUDA_ARCH_LIST .. && make
# To run:
./caffe2_mask_rcnn --predict_net=./model.pb --init_net=./model_init.pb --input=input.jpg
```
Note that:
* All converted models (the .pb files) take two input tensors:
"data" is an NCHW image, and "im_info" is an Nx3 tensor consisting of (height, width, 1.0) for
each image (the shape of "data" might be larger than that in "im_info" due to padding).
* The converted models do not contain post-processing operations that
transform raw layer outputs into formatted predictions.
The example only produces raw outputs (28x28 masks) from the final
layers that are not post-processed, because in actual deployment, an application often needs
its custom lightweight post-processing (e.g. full-image masks for every detected object is often not necessary).
We also provide a python wrapper around the converted model, in the
[Caffe2Model.\_\_call\_\_](../modules/export.html#detectron2.export.Caffe2Model.__call__) method.
This method has an interface that's identical to the [pytorch versions of models](./models.md),
and it internally applies pre/post-processing code to match the formats.
They can serve as a reference for pre/post-processing in actual deployment.
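For example, a minimal sketch of using this wrapper, assuming the `Caffe2Model.load_protobuf` helper in `detectron2.export` and a directory produced by the converter above:
```python
from detectron2.export import Caffe2Model

model = Caffe2Model.load_protobuf("./caffe2_model")  # directory written by caffe2_converter.py
outputs = model(inputs)  # `inputs` follows the standard pytorch model input format
```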
# Evaluation
Evaluation is a process that takes a number of input/output pairs and aggregates them.
You can always [use the model](./models.md) directly and just parse its inputs/outputs manually to perform
evaluation.
Alternatively, evaluation is implemented in detectron2 using the [DatasetEvaluator](../modules/evaluation.html#detectron2.evaluation.DatasetEvaluator)
interface.
Detectron2 includes a few `DatasetEvaluator`s that compute metrics using standard dataset-specific
APIs (e.g., COCO, LVIS).
You can also implement your own `DatasetEvaluator` that performs some other jobs
using the inputs/outputs pairs.
For example, to count how many instances are detected on the validation set:
```
from detectron2.evaluation import DatasetEvaluator

class Counter(DatasetEvaluator):
    def reset(self):
        self.count = 0

    def process(self, inputs, outputs):
        for output in outputs:
            self.count += len(output["instances"])

    def evaluate(self):
        # save self.count somewhere, or print it, or return it.
        return {"count": self.count}
```
Once you have some `DatasetEvaluator`, you can run it with
[inference_on_dataset](../modules/evaluation.html#detectron2.evaluation.inference_on_dataset).
For example,
```python
val_results = inference_on_dataset(
    model,
    val_data_loader,
    DatasetEvaluators([COCOEvaluator(...), Counter()]))
```
Compared to running the evaluation manually using the model, the benefit of this function is that
you can merge evaluators together using [DatasetEvaluators](../modules/evaluation.html#detectron2.evaluation.DatasetEvaluators).
In this way you can run all evaluations without having to go through the dataset multiple times.
The `inference_on_dataset` function also provides accurate speed benchmarks for the
given model and dataset.
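Putting the pieces together, a sketch of a full evaluation run ("my_dataset_val" is a hypothetical registered dataset, `cfg` a loaded config, and `model` an already-built model) could be:
```python
from detectron2.data import build_detection_test_loader
from detectron2.evaluation import COCOEvaluator, DatasetEvaluators, inference_on_dataset

evaluator = DatasetEvaluators(
    [COCOEvaluator("my_dataset_val", cfg, False, output_dir="./output/"), Counter()]
)
val_loader = build_detection_test_loader(cfg, "my_dataset_val")
results = inference_on_dataset(model, val_loader, evaluator)
```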
# Extend Detectron2's Defaults
__Research is about doing things in new ways__.
This brings a tension in how to create abstractions in code,
which is a challenge for any research engineering project of a significant size:
1. On one hand, it needs to have very thin abstractions to allow for the possibility of doing
everything in new ways. It should be reasonably easy to break existing
abstractions and replace them with new ones.
2. On the other hand, such a project also needs reasonably high-level
abstractions, so that users can easily do things in standard ways,
without worrying too much about the details that only certain researchers care about.
In detectron2, the following types of interfaces address this tension together:
1. Functions and classes that take a config (`cfg`) argument
(sometimes with only a few extra arguments).
Such functions and classes implement
the "standard default" behavior: they read what they need from the
config and do the "standard" thing.
Users only need to load a given config and pass it around, without having to worry about
which arguments are used and what they all mean.
2. Functions and classes that have well-defined explicit arguments.
Each of these is a small building block of the entire system.
They require users' expertise to understand what each argument should be,
and require more effort to stitch together into a larger system.
But they can be stitched together in more flexible ways.
When you need to implement something not supported by the "standard defaults"
included in detectron2, these well-defined components can be reused.
3. (experimental) A few classes are implemented with the
[@configurable](../modules/config.html#detectron2.config.configurable)
decorator - they can be called with either a config, or with explicit arguments.
Their explicit argument interfaces are currently __experimental__ and subject to change.
If you only need the standard behavior, the [Beginner's Tutorial](./getting_started.md)
should suffice. If you need to extend detectron2 to your own needs,
see the following tutorials for more details:
* Detectron2 includes a few standard datasets. To use custom ones, see
[Use Custom Datasets](./datasets.md).
* Detectron2 contains the standard logic that creates a data loader for training/testing from a
dataset, but you can write your own as well. See [Use Custom Data Loaders](./data_loading.md).
* Detectron2 implements many standard detection models, and provides ways for you
to overwrite their behaviors. See [Use Models](./models.md) and [Write Models](./write-models.md).
* Detectron2 provides a default training loop that is good for common training tasks.
You can customize it with hooks, or write your own loop instead. See [training](./training.md).
../../GETTING_STARTED.md
Tutorials
======================================
.. toctree::
:maxdepth: 2
install
getting_started
builtin_datasets
extend
datasets
data_loading
models
write-models
training
evaluation
configs
deployment
../../INSTALL.md
# Use Models
Models (and their sub-models) in detectron2 are built by
functions such as `build_model`, `build_backbone`, `build_roi_heads`:
```python
from detectron2.modeling import build_model
model = build_model(cfg) # returns a torch.nn.Module
```
`build_model` only builds the model structure and fills it with random parameters.
See below for how to load an existing checkpoint to the model,
and how to use the `model` object.
### Load/Save a Checkpoint
```python
from detectron2.checkpoint import DetectionCheckpointer
DetectionCheckpointer(model).load(file_path) # load a file to model
checkpointer = DetectionCheckpointer(model, save_dir="output")
checkpointer.save("model_999") # save to output/model_999.pth
```
Detectron2's checkpointer recognizes models in pytorch's `.pth` format, as well as the `.pkl` files
in our model zoo.
See [API doc](../modules/checkpoint.html#detectron2.checkpoint.DetectionCheckpointer)
for more details about its usage.
The model files can be arbitrarily manipulated using `torch.{load,save}` for `.pth` files or
`pickle.{dump,load}` for `.pkl` files.
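For instance, a sketch of inspecting saved weights directly (the file names here are placeholders):
```python
import pickle
import torch

ckpt = torch.load("output/model_999.pth", map_location="cpu")
print(ckpt.keys())  # the model weights are typically stored under the "model" key

with open("model_from_zoo.pkl", "rb") as f:
    data = pickle.load(f, encoding="latin1")
print(sorted(data["model"].keys())[:5])  # parameter names mapped to numpy arrays
```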
### Use a Model
A model can be called by `outputs = model(inputs)`, where `inputs` is a `list[dict]`.
Each dict corresponds to one image and the required keys
depend on the type of model, and whether the model is in training or evaluation mode.
For example, in order to do inference,
all existing models expect the "image" key, and optionally "height" and "width".
The detailed format of inputs and outputs of existing models are explained below.
When in training mode, all models are required to be used under an `EventStorage`.
The training statistics will be put into the storage:
```python
from detectron2.utils.events import EventStorage
with EventStorage() as storage:
losses = model(inputs)
```
If you only want to do simple inference using an existing model,
[DefaultPredictor](../modules/engine.html#detectron2.engine.defaults.DefaultPredictor)
is a wrapper around the model that provides such basic functionality.
It includes default behavior for model loading and preprocessing,
and operates on a single image rather than batches.
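A minimal sketch of using it (assuming `cfg` points at a trained model, e.g. via `cfg.MODEL.WEIGHTS`):
```python
import cv2
from detectron2.engine.defaults import DefaultPredictor

predictor = DefaultPredictor(cfg)
image = cv2.imread("input.jpg")   # BGR image, matching the default cfg.INPUT.FORMAT
outputs = predictor(image)        # e.g. outputs["instances"] for detection models
```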
### Model Input Format
Users can implement custom models that support any arbitrary input format.
Here we describe the standard input format that all builtin models support in detectron2.
They all take a `list[dict]` as the inputs. Each dict
corresponds to information about one image.
The dict may contain the following keys:
* "image": `Tensor` in (C, H, W) format. The meaning of channels are defined by `cfg.INPUT.FORMAT`.
Image normalization, if any, will be performed inside the model using
`cfg.MODEL.PIXEL_{MEAN,STD}`.
* "instances": an [Instances](../modules/structures.html#detectron2.structures.Instances)
object, with the following fields:
+ "gt_boxes": a [Boxes](../modules/structures.html#detectron2.structures.Boxes) object storing N boxes, one for each instance.
+ "gt_classes": `Tensor` of long type, a vector of N labels, in range [0, num_categories).
+ "gt_masks": a [PolygonMasks](../modules/structures.html#detectron2.structures.PolygonMasks)
or [BitMasks](../modules/structures.html#detectron2.structures.BitMasks) object storing N masks, one for each instance.
+ "gt_keypoints": a [Keypoints](../modules/structures.html#detectron2.structures.Keypoints)
object storing N keypoint sets, one for each instance.
* "proposals": an [Instances](../modules/structures.html#detectron2.structures.Instances)
object used only in Fast R-CNN style models, with the following fields:
+ "proposal_boxes": a [Boxes](../modules/structures.html#detectron2.structures.Boxes) object storing P proposal boxes.
+ "objectness_logits": `Tensor`, a vector of P scores, one for each proposal.
* "height", "width": the **desired** output height and width, which is not necessarily the same
as the height or width of the `image` input field.
For example, the `image` input field might be a resized image,
but you may want the outputs to be in **original** resolution.
If provided, the model will produce output in this resolution,
rather than in the resolution of the `image` as input into the model. This is more efficient and accurate.
* "sem_seg": `Tensor[int]` in (H, W) format. The semantic segmentation ground truth.
Values represent category labels starting from 0.
#### How it connects to the data loader:
The output of the default [DatasetMapper](../modules/data.html#detectron2.data.DatasetMapper) is a dict
that follows the above format.
After the data loader performs batching, it becomes `list[dict]` which the builtin models support.
### Model Output Format
When in training mode, the builtin models output a `dict[str->ScalarTensor]` with all the losses.
When in inference mode, the builtin models output a `list[dict]`, one dict for each image.
Based on the tasks the model is doing, each dict may contain the following fields:
* "instances": [Instances](../modules/structures.html#detectron2.structures.Instances)
object with the following fields:
* "pred_boxes": [Boxes](../modules/structures.html#detectron2.structures.Boxes) object storing N boxes, one for each detected instance.
* "scores": `Tensor`, a vector of N scores.
* "pred_classes": `Tensor`, a vector of N labels in range [0, num_categories).
+ "pred_masks": a `Tensor` of shape (N, H, W), masks for each detected instance.
+ "pred_keypoints": a `Tensor` of shape (N, num_keypoint, 3).
Each row in the last dimension is (x, y, score). Scores are larger than 0.
* "sem_seg": `Tensor` of (num_categories, H, W), the semantic segmentation prediction.
* "proposals": [Instances](../modules/structures.html#detectron2.structures.Instances)
object with the following fields:
* "proposal_boxes": [Boxes](../modules/structures.html#detectron2.structures.Boxes)
object storing N boxes.
* "objectness_logits": a torch vector of N scores.
* "panoptic_seg": A tuple of `(Tensor, list[dict])`. The tensor has shape (H, W), where each element
represents the segment id of the pixel. Each dict describes one segment id and has the following fields:
* "id": the segment id
* "isthing": whether the segment is a thing or stuff
* "category_id": the category id of this segment. It represents the thing
class id when `isthing==True`, and the stuff class id otherwise.
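For example, a sketch of unpacking detection outputs from a model in inference mode (`model` and `inputs` as described above):
```python
import torch

with torch.no_grad():
    outputs = model(inputs)                # list[dict], one dict per input image
instances = outputs[0]["instances"].to("cpu")
boxes = instances.pred_boxes.tensor        # Nx4 tensor of box coordinates
scores = instances.scores                  # N confidence scores
classes = instances.pred_classes           # N predicted class labels
```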
### Partially execute a model:
Sometimes you may want to obtain an intermediate tensor inside a model.
Since there are typically hundreds of intermediate tensors, there isn't an API that provides you
the intermediate result you need.
You have the following options:
1. Write a (sub)model. Following the [tutorial](./write-models.md), you can
rewrite a model component (e.g. a head of a model), such that it
does the same thing as the existing component, but returns the output
you need.
2. Partially execute a model. You can create the model as usual,
but use custom code to execute it instead of its `forward()`. For example,
the following code obtains mask features before mask head.
```python
from detectron2.modeling import build_model
from detectron2.structures import ImageList

images = ImageList.from_tensors(...)  # preprocessed input tensor
model = build_model(cfg)
features = model.backbone(images.tensor)
proposals, _ = model.proposal_generator(images, features)
instances = model.roi_heads._forward_box(features, proposals)
mask_features = [features[f] for f in model.roi_heads.in_features]
mask_features = model.roi_heads.mask_pooler(mask_features, [x.pred_boxes for x in instances])
```
Note that both options require you to read the existing forward code to understand
how to write code to obtain the outputs you need.
# Training
From the previous tutorials, you may now have a custom model and data loader.
You are free to create your own optimizer, and write the training logic: it's
usually easy with PyTorch, and allows researchers to see the entire training
logic more clearly and have full control.
One such example is provided in [tools/plain_train_net.py](../../tools/plain_train_net.py).
We also provide a standardized "trainer" abstraction with a
[minimal hook system](../modules/engine.html#detectron2.engine.HookBase)
that helps simplify the standard types of training.
You can use
[SimpleTrainer().train()](../modules/engine.html#detectron2.engine.SimpleTrainer)
which provides minimal abstraction for single-cost single-optimizer single-data-source training.
The builtin `train_net.py` script uses
[DefaultTrainer().train()](../modules/engine.html#detectron2.engine.defaults.DefaultTrainer),
which includes more standard default behaviors that one might want to opt in to,
including default configurations for learning rate schedule,
logging, evaluation, checkpointing etc.
This also means that it's less likely to support some non-standard behavior
you might want during research.
To customize the training loops, you can:
1. If your customization is similar to what `DefaultTrainer` is already doing,
you can change behavior of `DefaultTrainer` by overwriting [its methods](../modules/engine.html#detectron2.engine.defaults.DefaultTrainer)
in a subclass, like what [tools/train_net.py](../../tools/train_net.py) does.
2. If you need something very novel, you can start from [tools/plain_train_net.py](../../tools/plain_train_net.py) to implement them yourself.
### Logging of Metrics
During training, metrics are saved to a centralized [EventStorage](../modules/utils.html#detectron2.utils.events.EventStorage).
You can use the following code to access it and log metrics to it:
```
from detectron2.utils.events import get_event_storage

# inside the model:
if self.training:
    value = ...  # compute the value from inputs
    storage = get_event_storage()
    storage.put_scalar("some_accuracy", value)
```
Refer to its documentation for more details.
Metrics are then saved to various destinations with [EventWriter](../modules/utils.html#module-detectron2.utils.events).
DefaultTrainer enables a few `EventWriter` with default configurations.
See above for how to customize them.
# Write Models
If you are trying to do something completely new, you may wish to implement
a model entirely from scratch within detectron2. However, in many situations you may
be interested in modifying or extending some components of an existing model.
Therefore, we also provide a registration mechanism that lets you override the
behavior of certain internal components of standard models.
For example, to add a new backbone, import this code in your code:
```python
import torch.nn as nn

from detectron2.modeling import BACKBONE_REGISTRY, Backbone, ShapeSpec

@BACKBONE_REGISTRY.register()
class ToyBackBone(Backbone):
    def __init__(self, cfg, input_shape):
        super().__init__()
        # create your own backbone
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=16, padding=3)

    def forward(self, image):
        return {"conv1": self.conv1(image)}

    def output_shape(self):
        return {"conv1": ShapeSpec(channels=64, stride=16)}
```
Then, you can use `cfg.MODEL.BACKBONE.NAME = 'ToyBackBone'` in your config object.
`build_model(cfg)` will then call your `ToyBackBone` instead.
As another example, to add new abilities to the ROI heads in the Generalized R-CNN meta-architecture,
you can implement a new
[ROIHeads](../modules/modeling.html#detectron2.modeling.ROIHeads) subclass and put it in the `ROI_HEADS_REGISTRY`.
See [densepose in detectron2](../../projects/DensePose)
and [meshrcnn](https://github.com/facebookresearch/meshrcnn)
for examples that implement new ROIHeads to perform new tasks.
And [projects/](../../projects/)
contains more examples that implement different architectures.
A complete list of registries can be found in [API documentation](../modules/modeling.html#model-registries).
You can register components in these registries to customize different parts of a model, or the
entire model.
# DensePose in Detectron2
**Dense Human Pose Estimation In The Wild**
_Rıza Alp Güler, Natalia Neverova, Iasonas Kokkinos_
[[`densepose.org`](https://densepose.org)] [[`arXiv`](https://arxiv.org/abs/1802.00434)] [[`BibTeX`](#CitingDensePose)]
Dense human pose estimation aims at mapping all human pixels of an RGB image to the 3D surface of the human body.
<div align="center">
<img src="https://drive.google.com/uc?export=view&id=1qfSOkpueo1kVZbXOuQJJhyagKjMgepsz" width="700px" />
</div>
In this repository, we provide the code to train and evaluate DensePose-RCNN. We also provide tools to visualize
DensePose annotation and results.
# Quick Start
See [Getting Started](doc/GETTING_STARTED.md).
# Model Zoo and Baselines
We provide a number of baseline results and trained models available for download. See [Model Zoo](doc/MODEL_ZOO.md) for details.
# License
Detectron2 is released under the [Apache 2.0 license](../../LICENSE)
## <a name="CitingDensePose"></a>Citing DensePose
If you use DensePose, please take the references from the following BibTeX entries:
For DensePose with estimated confidences:
```
@InProceedings{Neverova2019DensePoseConfidences,
title = {Correlated Uncertainty for Learning Dense Correspondences from Noisy Labels},
author = {Neverova, Natalia and Novotny, David and Vedaldi, Andrea},
journal = {Advances in Neural Information Processing Systems},
year = {2019},
}
```
For the original DensePose:
```
@InProceedings{Guler2018DensePose,
title={DensePose: Dense Human Pose Estimation In The Wild},
author={R{\i}za Alp G\"uler, Natalia Neverova, Iasonas Kokkinos},
journal={The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2018}
}
```
#!/usr/bin/env python3
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
import argparse
import glob
import logging
import os
import pickle
import sys
from typing import Any, ClassVar, Dict, List
import torch
from detectron2.config import get_cfg
from detectron2.data.detection_utils import read_image
from detectron2.engine.defaults import DefaultPredictor
from detectron2.structures.boxes import BoxMode
from detectron2.structures.instances import Instances
from detectron2.utils.logger import setup_logger
from densepose import add_densepose_config
from densepose.utils.logger import verbosity_to_level
from densepose.vis.base import CompoundVisualizer
from densepose.vis.bounding_box import ScoredBoundingBoxVisualizer
from densepose.vis.densepose import (
    DensePoseResultsContourVisualizer,
    DensePoseResultsFineSegmentationVisualizer,
    DensePoseResultsUVisualizer,
    DensePoseResultsVVisualizer,
)
from densepose.vis.extractor import CompoundExtractor, create_extractor
DOC = """Apply Net - a tool to print / visualize DensePose results
"""
LOGGER_NAME = "apply_net"
logger = logging.getLogger(LOGGER_NAME)
_ACTION_REGISTRY: Dict[str, "Action"] = {}
class Action(object):
    @classmethod
    def add_arguments(cls: type, parser: argparse.ArgumentParser):
        parser.add_argument(
            "-v",
            "--verbosity",
            action="count",
            help="Verbose mode. Multiple -v options increase the verbosity.",
        )


def register_action(cls: type):
    """
    Decorator for action classes to automate action registration
    """
    global _ACTION_REGISTRY
    _ACTION_REGISTRY[cls.COMMAND] = cls
    return cls
class InferenceAction(Action):
    @classmethod
    def add_arguments(cls: type, parser: argparse.ArgumentParser):
        super(InferenceAction, cls).add_arguments(parser)
        parser.add_argument("cfg", metavar="<config>", help="Config file")
        parser.add_argument("model", metavar="<model>", help="Model file")
        parser.add_argument("input", metavar="<input>", help="Input data")
        parser.add_argument(
            "--opts",
            help="Modify config options using the command-line 'KEY VALUE' pairs",
            default=[],
            nargs=argparse.REMAINDER,
        )

    @classmethod
    def execute(cls: type, args: argparse.Namespace):
        logger.info(f"Loading config from {args.cfg}")
        opts = []
        cfg = cls.setup_config(args.cfg, args.model, args, opts)
        logger.info(f"Loading model from {args.model}")
        predictor = DefaultPredictor(cfg)
        logger.info(f"Loading data from {args.input}")
        file_list = cls._get_input_file_list(args.input)
        if len(file_list) == 0:
            logger.warning(f"No input images for {args.input}")
            return
        context = cls.create_context(args)
        for file_name in file_list:
            img = read_image(file_name, format="BGR")  # predictor expects BGR image.
            with torch.no_grad():
                outputs = predictor(img)["instances"]
                cls.execute_on_outputs(context, {"file_name": file_name, "image": img}, outputs)
        cls.postexecute(context)

    @classmethod
    def setup_config(
        cls: type, config_fpath: str, model_fpath: str, args: argparse.Namespace, opts: List[str]
    ):
        cfg = get_cfg()
        add_densepose_config(cfg)
        cfg.merge_from_file(config_fpath)
        cfg.merge_from_list(args.opts)
        if opts:
            cfg.merge_from_list(opts)
        cfg.MODEL.WEIGHTS = model_fpath
        cfg.freeze()
        return cfg

    @classmethod
    def _get_input_file_list(cls: type, input_spec: str):
        if os.path.isdir(input_spec):
            file_list = [
                os.path.join(input_spec, fname)
                for fname in os.listdir(input_spec)
                if os.path.isfile(os.path.join(input_spec, fname))
            ]
        elif os.path.isfile(input_spec):
            file_list = [input_spec]
        else:
            file_list = glob.glob(input_spec)
        return file_list
@register_action
class DumpAction(InferenceAction):
    """
    Dump action that outputs results to a pickle file
    """

    COMMAND: ClassVar[str] = "dump"

    @classmethod
    def add_parser(cls: type, subparsers: argparse._SubParsersAction):
        parser = subparsers.add_parser(cls.COMMAND, help="Dump model outputs to a file.")
        cls.add_arguments(parser)
        parser.set_defaults(func=cls.execute)

    @classmethod
    def add_arguments(cls: type, parser: argparse.ArgumentParser):
        super(DumpAction, cls).add_arguments(parser)
        parser.add_argument(
            "--output",
            metavar="<dump_file>",
            default="results.pkl",
            help="File name to save dump to",
        )

    @classmethod
    def execute_on_outputs(
        cls: type, context: Dict[str, Any], entry: Dict[str, Any], outputs: Instances
    ):
        image_fpath = entry["file_name"]
        logger.info(f"Processing {image_fpath}")
        result = {"file_name": image_fpath}
        if outputs.has("scores"):
            result["scores"] = outputs.get("scores").cpu()
        if outputs.has("pred_boxes"):
            result["pred_boxes_XYXY"] = outputs.get("pred_boxes").tensor.cpu()
            if outputs.has("pred_densepose"):
                boxes_XYWH = BoxMode.convert(
                    result["pred_boxes_XYXY"], BoxMode.XYXY_ABS, BoxMode.XYWH_ABS
                )
                result["pred_densepose"] = outputs.get("pred_densepose").to_result(boxes_XYWH)
        context["results"].append(result)

    @classmethod
    def create_context(cls: type, args: argparse.Namespace):
        context = {"results": [], "out_fname": args.output}
        return context

    @classmethod
    def postexecute(cls: type, context: Dict[str, Any]):
        out_fname = context["out_fname"]
        out_dir = os.path.dirname(out_fname)
        if len(out_dir) > 0 and not os.path.exists(out_dir):
            os.makedirs(out_dir)
        with open(out_fname, "wb") as hFile:
            pickle.dump(context["results"], hFile)
            logger.info(f"Output saved to {out_fname}")
@register_action
class ShowAction(InferenceAction):
    """
    Show action that visualizes selected entries on an image
    """

    COMMAND: ClassVar[str] = "show"
    VISUALIZERS: ClassVar[Dict[str, object]] = {
        "dp_contour": DensePoseResultsContourVisualizer,
        "dp_segm": DensePoseResultsFineSegmentationVisualizer,
        "dp_u": DensePoseResultsUVisualizer,
        "dp_v": DensePoseResultsVVisualizer,
        "bbox": ScoredBoundingBoxVisualizer,
    }

    @classmethod
    def add_parser(cls: type, subparsers: argparse._SubParsersAction):
        parser = subparsers.add_parser(cls.COMMAND, help="Visualize selected entries")
        cls.add_arguments(parser)
        parser.set_defaults(func=cls.execute)

    @classmethod
    def add_arguments(cls: type, parser: argparse.ArgumentParser):
        super(ShowAction, cls).add_arguments(parser)
        parser.add_argument(
            "visualizations",
            metavar="<visualizations>",
            help="Comma separated list of visualizations, possible values: "
            "[{}]".format(",".join(sorted(cls.VISUALIZERS.keys()))),
        )
        parser.add_argument(
            "--min_score",
            metavar="<score>",
            default=0.8,
            type=float,
            help="Minimum detection score to visualize",
        )
        parser.add_argument(
            "--nms_thresh", metavar="<threshold>", default=None, type=float, help="NMS threshold"
        )
        parser.add_argument(
            "--output",
            metavar="<image_file>",
            default="outputres.png",
            help="File name to save output to",
        )

    @classmethod
    def setup_config(
        cls: type, config_fpath: str, model_fpath: str, args: argparse.Namespace, opts: List[str]
    ):
        opts.append("MODEL.ROI_HEADS.SCORE_THRESH_TEST")
        opts.append(str(args.min_score))
        if args.nms_thresh is not None:
            opts.append("MODEL.ROI_HEADS.NMS_THRESH_TEST")
            opts.append(str(args.nms_thresh))
        cfg = super(ShowAction, cls).setup_config(config_fpath, model_fpath, args, opts)
        return cfg

    @classmethod
    def execute_on_outputs(
        cls: type, context: Dict[str, Any], entry: Dict[str, Any], outputs: Instances
    ):
        import cv2
        import numpy as np

        visualizer = context["visualizer"]
        extractor = context["extractor"]
        image_fpath = entry["file_name"]
        logger.info(f"Processing {image_fpath}")
        image = cv2.cvtColor(entry["image"], cv2.COLOR_BGR2GRAY)
        image = np.tile(image[:, :, np.newaxis], [1, 1, 3])
        data = extractor(outputs)
        image_vis = visualizer.visualize(image, data)
        entry_idx = context["entry_idx"] + 1
        out_fname = cls._get_out_fname(entry_idx, context["out_fname"])
        out_dir = os.path.dirname(out_fname)
        if len(out_dir) > 0 and not os.path.exists(out_dir):
            os.makedirs(out_dir)
        cv2.imwrite(out_fname, image_vis)
        logger.info(f"Output saved to {out_fname}")
        context["entry_idx"] += 1

    @classmethod
    def postexecute(cls: type, context: Dict[str, Any]):
        pass

    @classmethod
    def _get_out_fname(cls: type, entry_idx: int, fname_base: str):
        base, ext = os.path.splitext(fname_base)
        return base + ".{0:04d}".format(entry_idx) + ext

    @classmethod
    def create_context(cls: type, args: argparse.Namespace) -> Dict[str, Any]:
        vis_specs = args.visualizations.split(",")
        visualizers = []
        extractors = []
        for vis_spec in vis_specs:
            vis = cls.VISUALIZERS[vis_spec]()
            visualizers.append(vis)
            extractor = create_extractor(vis)
            extractors.append(extractor)
        visualizer = CompoundVisualizer(visualizers)
        extractor = CompoundExtractor(extractors)
        context = {
            "extractor": extractor,
            "visualizer": visualizer,
            "out_fname": args.output,
            "entry_idx": 0,
        }
        return context
def create_argument_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description=DOC,
        formatter_class=lambda prog: argparse.HelpFormatter(prog, max_help_position=120),
    )
    parser.set_defaults(func=lambda _: parser.print_help(sys.stdout))
    subparsers = parser.add_subparsers(title="Actions")
    for _, action in _ACTION_REGISTRY.items():
        action.add_parser(subparsers)
    return parser


def main():
    parser = create_argument_parser()
    args = parser.parse_args()
    verbosity = args.verbosity if hasattr(args, "verbosity") else None
    global logger
    logger = setup_logger(name=LOGGER_NAME)
    logger.setLevel(verbosity_to_level(verbosity))
    args.func(args)


if __name__ == "__main__":
    main()
MODEL:
  META_ARCHITECTURE: "GeneralizedRCNN"
  BACKBONE:
    NAME: "build_resnet_fpn_backbone"
  RESNETS:
    OUT_FEATURES: ["res2", "res3", "res4", "res5"]
  FPN:
    IN_FEATURES: ["res2", "res3", "res4", "res5"]
  ANCHOR_GENERATOR:
    SIZES: [[32], [64], [128], [256], [512]]  # One size for each in feature map
    ASPECT_RATIOS: [[0.5, 1.0, 2.0]]  # Three aspect ratios (same for all in feature maps)
  RPN:
    IN_FEATURES: ["p2", "p3", "p4", "p5", "p6"]
    PRE_NMS_TOPK_TRAIN: 2000  # Per FPN level
    PRE_NMS_TOPK_TEST: 1000  # Per FPN level
    # Detectron1 uses 2000 proposals per-batch,
    # (See "modeling/rpn/rpn_outputs.py" for details of this legacy issue)
    # which is approximately 1000 proposals per-image since the default batch size for FPN is 2.
    POST_NMS_TOPK_TRAIN: 1000
    POST_NMS_TOPK_TEST: 1000
  DENSEPOSE_ON: True
  ROI_HEADS:
    NAME: "DensePoseROIHeads"
    IN_FEATURES: ["p2", "p3", "p4", "p5"]
    NUM_CLASSES: 1
  ROI_BOX_HEAD:
    NAME: "FastRCNNConvFCHead"
    NUM_FC: 2
    POOLER_RESOLUTION: 7
    POOLER_SAMPLING_RATIO: 2
    POOLER_TYPE: "ROIAlign"
  ROI_DENSEPOSE_HEAD:
    NAME: "DensePoseV1ConvXHead"
    POOLER_TYPE: "ROIAlign"
    NUM_COARSE_SEGM_CHANNELS: 2
DATASETS:
  TRAIN: ("densepose_coco_2014_train", "densepose_coco_2014_valminusminival")
  TEST: ("densepose_coco_2014_minival",)
SOLVER:
  IMS_PER_BATCH: 16
  BASE_LR: 0.01
  STEPS: (60000, 80000)
  MAX_ITER: 90000
  WARMUP_FACTOR: 0.1
INPUT:
  MIN_SIZE_TRAIN: (640, 672, 704, 736, 768, 800)