Commit c732df65 authored by limm
push v0.1.3 version commit bd2ea47
parent 5b3792fc
../../.github/CONTRIBUTING.md
Notes
======================================
.. toctree::
:maxdepth: 2
benchmarks
compatibility
contributing
changelog
termcolor
numpy
tqdm
docutils==0.16
Sphinx==3.0.0
recommonmark==0.6.0
sphinx_rtd_theme
mock
matplotlib
termcolor
yacs
tabulate
cloudpickle
Pillow==6.2.2
future
requests
six
git+git://github.com/facebookresearch/fvcore.git
https://download.pytorch.org/whl/cpu/torch-1.5.0%2Bcpu-cp37-cp37m-linux_x86_64.whl
https://download.pytorch.org/whl/cpu/torchvision-0.6.0%2Bcpu-cp37-cp37m-linux_x86_64.whl
# Read the docs:
The latest documentation built from this directory is available at [detectron2.readthedocs.io](https://detectron2.readthedocs.io/).
Documents in this directory are not meant to be read on github.
../../datasets/README.md
# Configs
Detectron2 provides a key-value based config system that can be
used to obtain standard, common behaviors.
Detectron2's config system uses YAML and [yacs](https://github.com/rbgirshick/yacs).
In addition to the [basic operations](../modules/config.html#detectron2.config.CfgNode)
that access and update a config, we provide the following extra functionalities:
1. The config can have `_BASE_: base.yaml` field, which will load a base config first.
Values in the base config will be overwritten in sub-configs, if there are any conflicts.
We provide several base configs for standard model architectures.
2. We provide config versioning, for backward compatibility.
If your config file is versioned with a config line like `VERSION: 2`,
detectron2 will still recognize it even if we change some keys in the future.
"Config" is a very limited abstraction.
We do not expect all features in detectron2 to be available through configs.
If you need something that's not available in the config space,
please write code using detectron2's API.
### Basic Usage
Some basic usage of the `CfgNode` object is shown here. See more in [documentation](../modules/config.html#detectron2.config.CfgNode).
```python
from detectron2.config import get_cfg
cfg = get_cfg() # obtain detectron2's default config
cfg.xxx = yyy # add new configs for your own custom components
cfg.merge_from_file("my_cfg.yaml") # load values from a file
cfg.merge_from_list(["MODEL.WEIGHTS", "weights.pth"]) # can also load values from a list of str
print(cfg.dump()) # print formatted configs
```
Many builtin tools in detectron2 accept command-line config overrides:
Key-value pairs provided in the command line will overwrite the existing values in the config file.
For example, [demo.py](../../demo/demo.py) can be used with
```
./demo.py --config-file config.yaml [--other-options] \
--opts MODEL.WEIGHTS /path/to/weights INPUT.MIN_SIZE_TEST 1000
```
To see a list of available configs in detectron2 and what they mean,
check [Config References](../modules/config.html#config-references).
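For reference, here is a minimal sketch (not the actual `demo.py` implementation) of how a script can apply such command-line overrides with `merge_from_list`; the argument names mirror the example above:
```python
import argparse
from detectron2.config import get_cfg

parser = argparse.ArgumentParser()
parser.add_argument("--config-file", default="config.yaml")
parser.add_argument("--opts", nargs=argparse.REMAINDER, default=[])
args = parser.parse_args()

cfg = get_cfg()
cfg.merge_from_file(args.config_file)  # values from the YAML file
cfg.merge_from_list(args.opts)         # e.g. ["MODEL.WEIGHTS", "/path/to/weights"]
cfg.freeze()
```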
### Best Practice with Configs
1. Treat the configs you write as "code": avoid copying them or duplicating them; use `_BASE_`
to share common parts between configs.
2. Keep the configs you write simple: don't include keys that do not affect the experimental setting.
3. Keep a version number in your configs (or the base config), e.g., `VERSION: 2`,
for backward compatibility.
We print a warning when reading a config without version number.
The official configs do not include version number because they are meant to
be always up-to-date.
# Use Custom Dataloaders
## How the Existing Dataloader Works
Detectron2 contains a builtin data loading pipeline.
It's good to understand how it works, in case you need to write a custom one.
Detectron2 provides two functions
[build_detection_{train,test}_loader](../modules/data.html#detectron2.data.build_detection_train_loader)
that create a default data loader from a given config.
Here is how `build_detection_{train,test}_loader` works:
1. It takes the name of a registered dataset (e.g., "coco_2017_train") and loads a `list[dict]` representing the dataset items
in a lightweight, canonical format. These dataset items are not yet ready to be used by the model (e.g., images are
not loaded into memory, random augmentations have not been applied, etc.).
Details about the dataset format and dataset registration can be found in
[datasets](./datasets.md).
2. Each dict in this list is mapped by a function ("mapper"):
* Users can customize this mapping function by specifying the "mapper" argument in
`build_detection_{train,test}_loader`. The default mapper is [DatasetMapper](../modules/data.html#detectron2.data.DatasetMapper).
* The output format of such function can be arbitrary, as long as it is accepted by the consumer of this data loader (usually the model).
The outputs of the default mapper, after batching, follow the default model input format documented in
[Use Models](./models.html#model-input-format).
* The role of the mapper is to transform the lightweight, canonical representation of a dataset item into a format
that is ready for the model to consume (including, e.g., read images, perform random data augmentation and convert to torch Tensors).
If you would like to perform custom transformations to data, you often want a custom mapper.
3. The outputs of the mapper are batched (simply into a list).
4. This batched data is the output of the data loader. Typically, it's also the input of
`model.forward()`.
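As a quick illustration of the pipeline above, the following sketch (assuming `cfg` is a loaded config and `model` an already-built model) builds the default train loader and feeds one batch to the model:
```python
from detectron2.data import build_detection_train_loader

data_loader = build_detection_train_loader(cfg)  # uses the default DatasetMapper
for batched_inputs in data_loader:
    # batched_inputs is a list[dict], one dict per image, in the model input format
    losses = model(batched_inputs)
    break  # the train loader is an infinite stream, so stop after one batch here
```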
## Write a Custom Dataloader
Using a different "mapper" with `build_detection_{train,test}_loader(mapper=)` works for most use cases
of custom data loading.
For example, if you want to resize all images to a fixed size for Mask R-CNN training, write this:
```python
import copy
import torch

from detectron2.data import build_detection_train_loader
from detectron2.data import transforms as T
from detectron2.data import detection_utils as utils

def mapper(dataset_dict):
    # Implement a mapper, similar to the default DatasetMapper, but with your own customizations
    dataset_dict = copy.deepcopy(dataset_dict)  # it will be modified by code below
    image = utils.read_image(dataset_dict["file_name"], format="BGR")
    image, transforms = T.apply_transform_gens([T.Resize((800, 800))], image)
    dataset_dict["image"] = torch.as_tensor(image.transpose(2, 0, 1).astype("float32"))

    annos = [
        utils.transform_instance_annotations(obj, transforms, image.shape[:2])
        for obj in dataset_dict.pop("annotations")
        if obj.get("iscrowd", 0) == 0
    ]
    instances = utils.annotations_to_instances(annos, image.shape[:2])
    dataset_dict["instances"] = utils.filter_empty_instances(instances)
    return dataset_dict

data_loader = build_detection_train_loader(cfg, mapper=mapper)
# use this dataloader instead of the default
```
Refer to [API documentation of detectron2.data](../modules/data) for details.
If you want to change not only the mapper (e.g., to write different sampling or batching logic),
you can write your own data loader. The data loader is simply a
python iterator that produces [the format](./models.md) your model accepts.
You can implement it using any tools you like.
## Use a Custom Dataloader
If you use [DefaultTrainer](../modules/engine.html#detectron2.engine.defaults.DefaultTrainer),
you can overwrite its `build_{train,test}_loader` method to use your own dataloader.
See the [densepose dataloader](../../projects/DensePose/train_net.py)
for an example.
If you write your own training loop, you can plug in your data loader easily.
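For the `DefaultTrainer` route, a minimal sketch of such a subclass might look like the following (`my_mapper` is a placeholder for a custom mapper such as the one defined earlier):
```python
from detectron2.data import build_detection_train_loader
from detectron2.engine import DefaultTrainer

class TrainerWithCustomLoader(DefaultTrainer):
    @classmethod
    def build_train_loader(cls, cfg):
        # plug the custom mapper into the default loader-building logic
        return build_detection_train_loader(cfg, mapper=my_mapper)
```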
# Use Custom Datasets
Datasets that have builtin support in detectron2 are listed in [datasets](../../datasets).
If you want to use a custom dataset while also reusing detectron2's data loaders,
you will need to
1. __Register__ your dataset (i.e., tell detectron2 how to obtain your dataset).
2. Optionally, __register metadata__ for your dataset.
Next, we explain the above two concepts in detail.
The [Colab tutorial](https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5)
has a live example of how to register and train on a dataset of custom formats.
### Register a Dataset
To let detectron2 know how to obtain a dataset named "my_dataset", you will implement
a function that returns the items in your dataset and then tell detectron2 about this
function:
```python
from detectron2.data import DatasetCatalog

def my_dataset_function():
    ...  # return a list[dict] in the format described below

DatasetCatalog.register("my_dataset", my_dataset_function)
```
Here, the snippet associates a dataset "my_dataset" with a function that returns the data.
The registration stays effective until the process exits.
The function can process data from its original format into either one of the following:
1. Detectron2's standard dataset dict, described below. This will work with many other builtin
features in detectron2, so it's recommended to use it when it's sufficient for your task.
2. Your custom dataset dict. You can also return arbitrary dicts in your own format,
such as adding extra keys for new tasks.
Then you will need to handle them properly downstream as well.
See below for more details.
#### Standard Dataset Dicts
For standard tasks
(instance detection, instance/semantic/panoptic segmentation, keypoint detection),
we load the original dataset into `list[dict]` with a specification similar to COCO's json annotations.
This is our standard representation for a dataset.
Each dict contains information about one image.
The dict may have the following fields,
and the required fields vary based on what the dataloader or the task needs (see more below).
+ `file_name`: the full path to the image file. Will apply rotation and flipping if the image has such exif information.
+ `height`, `width`: integer. The shape of image.
+ `image_id` (str or int): a unique id that identifies this image. Used
during evaluation to identify the images, but a dataset may use it for different purposes.
+ `annotations` (list[dict]): each dict corresponds to annotations of one instance
in this image. Required by instance detection/segmentation or keypoint detection tasks.
Images with empty `annotations` will by default be removed from training,
but can be included using `DATALOADER.FILTER_EMPTY_ANNOTATIONS`.
Each dict contains the following keys, of which `bbox`, `bbox_mode`, and `category_id` are required:
+ `bbox` (list[float]): list of 4 numbers representing the bounding box of the instance.
+ `bbox_mode` (int): the format of bbox.
It must be a member of
[structures.BoxMode](../modules/structures.html#detectron2.structures.BoxMode).
Currently supports: `BoxMode.XYXY_ABS`, `BoxMode.XYWH_ABS`.
+ `category_id` (int): an integer in the range [0, num_categories) representing the category label.
The value num_categories is reserved to represent the "background" category, if applicable.
+ `segmentation` (list[list[float]] or dict): the segmentation mask of the instance.
+ If `list[list[float]]`, it represents a list of polygons, one for each connected component
of the object. Each `list[float]` is one simple polygon in the format of `[x1, y1, ..., xn, yn]`.
The Xs and Ys are either relative coordinates in [0, 1], or absolute coordinates,
depending on whether "bbox_mode" is relative.
+ If `dict`, it represents the per-pixel segmentation mask in COCO's RLE format. The dict should have
keys "size" and "counts". You can convert a uint8 segmentation mask of 0s and 1s into
RLE format by `pycocotools.mask.encode(np.asarray(mask, order="F"))`.
+ `keypoints` (list[float]): in the format of [x1, y1, v1,..., xn, yn, vn].
v[i] means the [visibility](http://cocodataset.org/#format-data) of this keypoint.
`n` must be equal to the number of keypoint categories.
The Xs and Ys are either relative coordinates in [0, 1], or absolute coordinates,
depending on whether "bbox_mode" is relative.
Note that the coordinate annotations in COCO format are integers in range [0, H-1 or W-1].
By default, detectron2 adds 0.5 to absolute keypoint coordinates to convert them from discrete
pixel indices to floating point coordinates.
+ `iscrowd`: 0 (default) or 1. Whether this instance is labeled as COCO's "crowd
region". Don't include this field if you don't know what it means.
+ `sem_seg_file_name`: the full path to the ground truth semantic segmentation file.
Required by semantic segmentation task.
It should be an image whose pixel values are integer labels.
Fast R-CNN (with precomputed proposals) is rarely used today.
To train a Fast R-CNN, the following extra keys are needed:
+ `proposal_boxes` (array): 2D numpy array with shape (K, 4) representing K precomputed proposal boxes for this image.
+ `proposal_objectness_logits` (array): numpy array with shape (K, ), which corresponds to the objectness
logits of proposals in 'proposal_boxes'.
+ `proposal_bbox_mode` (int): the format of the precomputed proposal bbox.
It must be a member of
[structures.BoxMode](../modules/structures.html#detectron2.structures.BoxMode).
Default is `BoxMode.XYXY_ABS`.
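To make the specification above concrete, here is a hypothetical single entry of such a `list[dict]` (paths and values are made up for illustration):
```python
from detectron2.structures import BoxMode

record = {
    "file_name": "/path/to/images/0001.jpg",
    "height": 480,
    "width": 640,
    "image_id": 1,
    "annotations": [
        {
            "bbox": [10.0, 20.0, 200.0, 150.0],
            "bbox_mode": BoxMode.XYWH_ABS,
            "category_id": 0,
            "segmentation": [[10.0, 20.0, 210.0, 20.0, 210.0, 170.0, 10.0, 170.0]],
            "iscrowd": 0,
        }
    ],
}
```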
#### Custom Dataset Dicts for New Tasks
In the `list[dict]` that your dataset function returns, the dictionary can also have arbitrary custom data.
This will be useful for a new task that needs extra information not supported
by the standard dataset dicts. In this case, you need to make sure the downstream code can handle your data
correctly. Usually this requires writing a new `mapper` for the dataloader (see [Use Custom Dataloaders](./data_loading.md)).
When designing a custom format, note that all dicts are stored in memory
(sometimes serialized and with multiple copies).
To save memory, each dict is meant to contain small but sufficient information
about each sample, such as file names and annotations.
Loading full samples typically happens in the data loader.
For attributes shared among the entire dataset, use `Metadata` (see below).
To avoid extra memory, do not save such information repeatedly for each sample.
### "Metadata" for Datasets
Each dataset is associated with some metadata, accessible through
`MetadataCatalog.get(dataset_name).some_metadata`.
Metadata is a key-value mapping that contains information that's shared among
the entire dataset, and usually is used to interpret what's in the dataset, e.g.,
names of classes, colors of classes, root of files, etc.
This information will be useful for augmentation, evaluation, visualization, logging, etc.
The structure of metadata depends on what is needed by the corresponding downstream code.
If you register a new dataset through `DatasetCatalog.register`,
you may also want to add its corresponding metadata through
`MetadataCatalog.get(dataset_name).some_key = some_value`, to enable any features that need the metadata.
You can do it like this (using the metadata key "thing_classes" as an example):
```python
from detectron2.data import MetadataCatalog
MetadataCatalog.get("my_dataset").thing_classes = ["person", "dog"]
```
Here is a list of metadata keys that are used by builtin features in detectron2.
If you add your own dataset without these metadata, some features may be
unavailable to you:
* `thing_classes` (list[str]): Used by all instance detection/segmentation tasks.
A list of names for each instance/thing category.
If you load a COCO format dataset, it will be automatically set by the function `load_coco_json`.
* `thing_colors` (list[tuple(r, g, b)]): Pre-defined color (in [0, 255]) for each thing category.
Used for visualization. If not given, random colors are used.
* `stuff_classes` (list[str]): Used by semantic and panoptic segmentation tasks.
A list of names for each stuff category.
* `stuff_colors` (list[tuple(r, g, b)]): Pre-defined color (in [0, 255]) for each stuff category.
Used for visualization. If not given, random colors are used.
* `keypoint_names` (list[str]): Used by keypoint localization. A list of names for each keypoint.
* `keypoint_flip_map` (list[tuple[str]]): Used by the keypoint localization task. A list of pairs of names,
where each pair are the two keypoints that should be flipped if the image is
flipped horizontally during augmentation.
* `keypoint_connection_rules`: list[tuple(str, str, (r, g, b))]. Each tuple specifies a pair of keypoints
that are connected and the color to use for the line between them when visualized.
Some additional metadata that are specific to the evaluation of certain datasets (e.g. COCO):
* `thing_dataset_id_to_contiguous_id` (dict[int->int]): Used by all instance detection/segmentation tasks in the COCO format.
A mapping from instance class ids in the dataset to contiguous ids in range [0, #class).
Will be automatically set by the function `load_coco_json`.
* `stuff_dataset_id_to_contiguous_id` (dict[int->int]): Used when generating prediction json files for
semantic/panoptic segmentation.
A mapping from semantic segmentation class ids in the dataset
to contiguous ids in [0, num_categories). It is useful for evaluation only.
* `json_file`: The COCO annotation json file. Used by COCO evaluation for COCO-format datasets.
* `panoptic_root`, `panoptic_json`: Used by panoptic evaluation.
* `evaluator_type`: Used by the builtin main training script to select
evaluator. Don't use it in a new training script.
You can just provide the [DatasetEvaluator](../modules/evaluation.html#detectron2.evaluation.DatasetEvaluator)
for your dataset directly in your main script.
NOTE: For background on the concept of "thing" and "stuff", see
[On Seeing Stuff: The Perception of Materials by Humans and Machines](http://persci.mit.edu/pub_pdfs/adelson_spie_01.pdf).
In detectron2, the term "thing" is used for instance-level tasks,
and "stuff" is used for semantic segmentation tasks.
Both are used in panoptic segmentation.
### Register a COCO Format Dataset
If your dataset is already a json file in the COCO format,
the dataset and its associated metadata can be registered easily with:
```python
from detectron2.data.datasets import register_coco_instances
register_coco_instances("my_dataset", {}, "json_annotation.json", "path/to/image/dir")
```
If your dataset is in COCO format but with extra custom per-instance annotations,
the [load_coco_json](../modules/data.html#detectron2.data.datasets.load_coco_json)
function might be useful.
### Update the Config for New Datasets
Once you've registered the dataset, you can use the name of the dataset (e.g., "my_dataset" in
the example above) in `cfg.DATASETS.{TRAIN,TEST}`.
There are other configs you might want to change to train or evaluate on new datasets:
* `MODEL.ROI_HEADS.NUM_CLASSES` and `MODEL.RETINANET.NUM_CLASSES` are the number of thing classes
for R-CNN and RetinaNet models, respectively.
* `MODEL.ROI_KEYPOINT_HEAD.NUM_KEYPOINTS` sets the number of keypoints for Keypoint R-CNN.
You'll also need to set [Keypoint OKS](http://cocodataset.org/#keypoints-eval)
with `TEST.KEYPOINT_OKS_SIGMAS` for evaluation.
* `MODEL.SEM_SEG_HEAD.NUM_CLASSES` sets the number of stuff classes for Semantic FPN & Panoptic FPN.
* If you're training Fast R-CNN (with precomputed proposals), `DATASETS.PROPOSAL_FILES_{TRAIN,TEST}`
need to match the datasets. The format of proposal files is documented
[here](../modules/data.html#detectron2.data.load_proposals_into_dataset).
New models
(e.g. [TensorMask](../../projects/TensorMask),
[PointRend](../../projects/PointRend))
often have similar configs of their own that need to be changed as well.
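As a sketch, updating the config for a hypothetical 2-class dataset registered as "my_dataset_train"/"my_dataset_val" could look like:
```python
cfg.DATASETS.TRAIN = ("my_dataset_train",)
cfg.DATASETS.TEST = ("my_dataset_val",)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2   # for R-CNN models
cfg.MODEL.RETINANET.NUM_CLASSES = 2   # only relevant for RetinaNet models
```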
# Deployment
## Caffe2 Deployment
We currently support converting a detectron2 model to Caffe2 format through ONNX.
The converted Caffe2 model is able to run without detectron2 dependency in either Python or C++.
It has a runtime optimized for CPU & mobile inference, but not for GPU inference.
Caffe2 conversion requires PyTorch ≥ 1.4 and ONNX ≥ 1.6.
### Coverage
It supports the 3 most common meta architectures: `GeneralizedRCNN`, `RetinaNet`, `PanopticFPN`,
and most official models under these 3 meta architectures.
Users' custom extensions under these architectures (added through registration) are supported
as long as they do not contain control flow or operators not available in Caffe2 (e.g. deformable convolution).
For example, custom backbones and heads are often supported out of the box.
### Usage
The conversion APIs are documented at [the API documentation](../modules/export).
We provide a tool, `caffe2_converter.py`, as an example that uses
these APIs to convert a standard model.
To convert an official Mask R-CNN trained on COCO, first
[prepare the COCO dataset](../../datasets/), then pick the model from [Model Zoo](../../MODEL_ZOO.md), and run:
```
cd tools/deploy/ && ./caffe2_converter.py --config-file ../../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
--output ./caffe2_model --run-eval \
MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl \
MODEL.DEVICE cpu
```
Note that:
1. The conversion needs valid sample inputs & weights to trace the model. That's why the script requires the dataset.
You can modify the script to obtain sample inputs in other ways.
2. With the `--run-eval` flag, it will evaluate the converted model to verify its accuracy.
The accuracy is typically slightly different (within 0.1 AP) from PyTorch due to
numerical precision differences between the implementations.
It's recommended to always verify the accuracy in case your custom model is not supported by the
conversion.
The converted model is saved in the specified `caffe2_model/` directory. Two files, `model.pb`
and `model_init.pb`, which contain the network structure and network parameters, are necessary for deployment.
These files can then be loaded in C++ or Python using Caffe2's APIs.
The script also generates a `model.svg` file that contains a visualization of the network.
You can also load `model.pb` into tools such as [netron](https://github.com/lutzroeder/netron) to visualize it.
### Use the model in C++/Python
The model can be loaded in C++. An example [caffe2_mask_rcnn.cpp](../../tools/deploy/) is given,
which performs CPU/GPU inference using `COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x`.
The C++ example needs to be built with:
* PyTorch with caffe2 inside
* gflags, glog, opencv
* protobuf headers that match the version of your caffe2
* MKL headers if caffe2 is built with MKL
The following can compile the example inside [official detectron2 docker](../../docker/):
```
sudo apt update && sudo apt install libgflags-dev libgoogle-glog-dev libopencv-dev
pip install mkl-include
wget https://github.com/protocolbuffers/protobuf/releases/download/v3.6.1/protobuf-cpp-3.6.1.tar.gz
tar xf protobuf-cpp-3.6.1.tar.gz
export CPATH=$(readlink -f ./protobuf-3.6.1/src/):$HOME/.local/include
export CMAKE_PREFIX_PATH=$HOME/.local/lib/python3.6/site-packages/torch/
mkdir build && cd build
cmake -DTORCH_CUDA_ARCH_LIST=$TORCH_CUDA_ARCH_LIST .. && make
# To run:
./caffe2_mask_rcnn --predict_net=./model.pb --init_net=./model_init.pb --input=input.jpg
```
Note that:
* All converted models (the .pb files) take two input tensors:
"data" is an NCHW image, and "im_info" is an Nx3 tensor consisting of (height, width, 1.0) for
each image (the shape of "data" might be larger than that in "im_info" due to padding).
* The converted models do not contain post-processing operations that
transform raw layer outputs into formatted predictions.
The example only produces raw outputs (28x28 masks) from the final
layers that are not post-processed, because in actual deployment, an application often needs
its custom lightweight post-processing (e.g. full-image masks for every detected object is often not necessary).
We also provide a python wrapper around the converted model, in the
[Caffe2Model.\_\_call\_\_](../modules/export.html#detectron2.export.Caffe2Model.__call__) method.
This method has an interface that's identical to the [pytorch versions of models](./models.md),
and it internally applies pre/post-processing code to match the formats.
They can serve as a reference for pre/post-processing in actual deployment.
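For example, a minimal sketch of using this wrapper, assuming the `Caffe2Model.load_protobuf` helper in `detectron2.export` and a directory produced by the converter above:
```python
from detectron2.export import Caffe2Model

model = Caffe2Model.load_protobuf("./caffe2_model")  # directory written by caffe2_converter.py
outputs = model(inputs)  # `inputs` follows the standard pytorch model input format
```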
# Evaluation
Evaluation is a process that takes a number of input/output pairs and aggregates them.
You can always [use the model](./models.md) directly and just parse its inputs/outputs manually to perform
evaluation.
Alternatively, evaluation is implemented in detectron2 using the [DatasetEvaluator](../modules/evaluation.html#detectron2.evaluation.DatasetEvaluator)
interface.
Detectron2 includes a few `DatasetEvaluator`s that compute metrics using standard dataset-specific
APIs (e.g., COCO, LVIS).
You can also implement your own `DatasetEvaluator` that performs some other jobs
using the inputs/outputs pairs.
For example, to count how many instances are detected on the validation set:
```
from detectron2.evaluation import DatasetEvaluator

class Counter(DatasetEvaluator):
    def reset(self):
        self.count = 0

    def process(self, inputs, outputs):
        for output in outputs:
            self.count += len(output["instances"])

    def evaluate(self):
        # save self.count somewhere, or print it, or return it.
        return {"count": self.count}
```
Once you have some `DatasetEvaluator`, you can run it with
[inference_on_dataset](../modules/evaluation.html#detectron2.evaluation.inference_on_dataset).
For example,
```python
val_results = inference_on_dataset(
    model,
    val_data_loader,
    DatasetEvaluators([COCOEvaluator(...), Counter()]))
```
Compared to running the evaluation manually using the model, the benefit of this function is that
you can merge evaluators together using [DatasetEvaluators](../modules/evaluation.html#detectron2.evaluation.DatasetEvaluators).
In this way you can run all evaluations without having to go through the dataset multiple times.
The `inference_on_dataset` function also provides accurate speed benchmarks for the
given model and dataset.
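Putting the pieces together, a sketch of a full evaluation run ("my_dataset_val" is a hypothetical registered dataset, `cfg` a loaded config, and `model` an already-built model) could be:
```python
from detectron2.data import build_detection_test_loader
from detectron2.evaluation import COCOEvaluator, DatasetEvaluators, inference_on_dataset

evaluator = DatasetEvaluators(
    [COCOEvaluator("my_dataset_val", cfg, False, output_dir="./output/"), Counter()]
)
val_loader = build_detection_test_loader(cfg, "my_dataset_val")
results = inference_on_dataset(model, val_loader, evaluator)
```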
# Extend Detectron2's Defaults
__Research is about doing things in new ways__.
This brings a tension in how to create abstractions in code,
which is a challenge for any research engineering project of a significant size:
1. On one hand, it needs to have very thin abstractions to allow for the possibility of doing
everything in new ways. It should be reasonably easy to break existing
abstractions and replace them with new ones.
2. On the other hand, such a project also needs reasonably high-level
abstractions, so that users can easily do things in standard ways,
without worrying too much about the details that only certain researchers care about.
In detectron2, the following types of interfaces address this tension together:
1. Functions and classes that take a config (`cfg`) argument
(sometimes with only a few extra arguments).
Such functions and classes implement
the "standard default" behavior: they read what they need from the
config and do the "standard" thing.
Users only need to load a given config and pass it around, without having to worry about
which arguments are used and what they all mean.
2. Functions and classes that have well-defined explicit arguments.
Each of these is a small building block of the entire system.
They require users' expertise to understand what each argument should be,
and require more effort to stitch together into a larger system.
But they can be stitched together in more flexible ways.
When you need to implement something not supported by the "standard defaults"
included in detectron2, these well-defined components can be reused.
3. (experimental) A few classes are implemented with the
[@configurable](../modules/config.html#detectron2.config.configurable)
decorator - they can be called with either a config, or with explicit arguments.
Their explicit argument interfaces are currently __experimental__ and subject to change.
If you only need the standard behavior, the [Beginner's Tutorial](./getting_started.md)
should suffice. If you need to extend detectron2 to your own needs,
see the following tutorials for more details:
* Detectron2 includes a few standard datasets. To use custom ones, see
[Use Custom Datasets](./datasets.md).
* Detectron2 contains the standard logic that creates a data loader for training/testing from a
dataset, but you can write your own as well. See [Use Custom Data Loaders](./data_loading.md).
* Detectron2 implements many standard detection models, and provides ways for you
to overwrite their behaviors. See [Use Models](./models.md) and [Write Models](./write-models.md).
* Detectron2 provides a default training loop that is good for common training tasks.
You can customize it with hooks, or write your own loop instead. See [training](./training.md).
../../GETTING_STARTED.md
Tutorials
======================================
.. toctree::
:maxdepth: 2
install
getting_started
builtin_datasets
extend
datasets
data_loading
models
write-models
training
evaluation
configs
deployment
../../INSTALL.md
# Use Models
Models (and their sub-models) in detectron2 are built by
functions such as `build_model`, `build_backbone`, `build_roi_heads`:
```python
from detectron2.modeling import build_model
model = build_model(cfg) # returns a torch.nn.Module
```
`build_model` only builds the model structure and fills it with random parameters.
See below for how to load an existing checkpoint to the model,
and how to use the `model` object.
### Load/Save a Checkpoint
```python
from detectron2.checkpoint import DetectionCheckpointer
DetectionCheckpointer(model).load(file_path) # load a file to model
checkpointer = DetectionCheckpointer(model, save_dir="output")
checkpointer.save("model_999") # save to output/model_999.pth
```
Detectron2's checkpointer recognizes models in pytorch's `.pth` format, as well as the `.pkl` files
in our model zoo.
See [API doc](../modules/checkpoint.html#detectron2.checkpoint.DetectionCheckpointer)
for more details about its usage.
The model files can be arbitrarily manipulated using `torch.{load,save}` for `.pth` files or
`pickle.{dump,load}` for `.pkl` files.
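For instance, a sketch of inspecting saved weights directly (the file names here are placeholders):
```python
import pickle
import torch

ckpt = torch.load("output/model_999.pth", map_location="cpu")
print(ckpt.keys())  # the model weights are typically stored under the "model" key

with open("model_from_zoo.pkl", "rb") as f:
    data = pickle.load(f, encoding="latin1")
print(sorted(data["model"].keys())[:5])  # parameter names mapped to numpy arrays
```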
### Use a Model
A model can be called by `outputs = model(inputs)`, where `inputs` is a `list[dict]`.
Each dict corresponds to one image and the required keys
depend on the type of model, and whether the model is in training or evaluation mode.
For example, in order to do inference,
all existing models expect the "image" key, and optionally "height" and "width".
The detailed format of inputs and outputs of existing models are explained below.
When in training mode, all models are required to be used under an `EventStorage`.
The training statistics will be put into the storage:
```python
from detectron2.utils.events import EventStorage
with EventStorage() as storage:
losses = model(inputs)
```
If you only want to do simple inference using an existing model,
[DefaultPredictor](../modules/engine.html#detectron2.engine.defaults.DefaultPredictor)
is a wrapper around the model that provides such basic functionality.
It includes default behavior for model loading and preprocessing,
and operates on a single image rather than batches.
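A minimal sketch of using it (assuming `cfg` points at a trained model, e.g. via `cfg.MODEL.WEIGHTS`):
```python
import cv2
from detectron2.engine.defaults import DefaultPredictor

predictor = DefaultPredictor(cfg)
image = cv2.imread("input.jpg")   # BGR image, matching the default cfg.INPUT.FORMAT
outputs = predictor(image)        # e.g. outputs["instances"] for detection models
```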
### Model Input Format
Users can implement custom models that support any arbitrary input format.
Here we describe the standard input format that all builtin models support in detectron2.
They all take a `list[dict]` as the inputs. Each dict
corresponds to information about one image.
The dict may contain the following keys:
* "image": `Tensor` in (C, H, W) format. The meaning of channels are defined by `cfg.INPUT.FORMAT`.
Image normalization, if any, will be performed inside the model using
`cfg.MODEL.PIXEL_{MEAN,STD}`.
* "instances": an [Instances](../modules/structures.html#detectron2.structures.Instances)
object, with the following fields:
+ "gt_boxes": a [Boxes](../modules/structures.html#detectron2.structures.Boxes) object storing N boxes, one for each instance.
+ "gt_classes": `Tensor` of long type, a vector of N labels, in range [0, num_categories).
+ "gt_masks": a [PolygonMasks](../modules/structures.html#detectron2.structures.PolygonMasks)
or [BitMasks](../modules/structures.html#detectron2.structures.BitMasks) object storing N masks, one for each instance.
+ "gt_keypoints": a [Keypoints](../modules/structures.html#detectron2.structures.Keypoints)
object storing N keypoint sets, one for each instance.
* "proposals": an [Instances](../modules/structures.html#detectron2.structures.Instances)
object used only in Fast R-CNN style models, with the following fields:
+ "proposal_boxes": a [Boxes](../modules/structures.html#detectron2.structures.Boxes) object storing P proposal boxes.
+ "objectness_logits": `Tensor`, a vector of P scores, one for each proposal.
* "height", "width": the **desired** output height and width, which is not necessarily the same
as the height or width of the `image` input field.
For example, the `image` input field might be a resized image,
but you may want the outputs to be in **original** resolution.
If provided, the model will produce output in this resolution,
rather than in the resolution of the `image` as input into the model. This is more efficient and accurate.
* "sem_seg": `Tensor[int]` in (H, W) format. The semantic segmentation ground truth.
Values represent category labels starting from 0.
#### How it connects to the data loader:
The output of the default [DatasetMapper](../modules/data.html#detectron2.data.DatasetMapper) is a dict
that follows the above format.
After the data loader performs batching, it becomes `list[dict]` which the builtin models support.
### Model Output Format
When in training mode, the builtin models output a `dict[str->ScalarTensor]` with all the losses.
When in inference mode, the builtin models output a `list[dict]`, one dict for each image.
Based on the tasks the model is doing, each dict may contain the following fields:
* "instances": [Instances](../modules/structures.html#detectron2.structures.Instances)
object with the following fields:
* "pred_boxes": [Boxes](../modules/structures.html#detectron2.structures.Boxes) object storing N boxes, one for each detected instance.
* "scores": `Tensor`, a vector of N scores.
* "pred_classes": `Tensor`, a vector of N labels in range [0, num_categories).
+ "pred_masks": a `Tensor` of shape (N, H, W), masks for each detected instance.
+ "pred_keypoints": a `Tensor` of shape (N, num_keypoint, 3).
Each row in the last dimension is (x, y, score). Scores are larger than 0.
* "sem_seg": `Tensor` of (num_categories, H, W), the semantic segmentation prediction.
* "proposals": [Instances](../modules/structures.html#detectron2.structures.Instances)
object with the following fields:
* "proposal_boxes": [Boxes](../modules/structures.html#detectron2.structures.Boxes)
object storing N boxes.
* "objectness_logits": a torch vector of N scores.
* "panoptic_seg": A tuple of `(Tensor, list[dict])`. The tensor has shape (H, W), where each element
represents the segment id of the pixel. Each dict describes one segment id and has the following fields:
* "id": the segment id
* "isthing": whether the segment is a thing or stuff
* "category_id": the category id of this segment. It represents the thing
class id when `isthing==True`, and the stuff class id otherwise.
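For example, a sketch of unpacking detection outputs from a model in inference mode (`model` and `inputs` as described above):
```python
import torch

with torch.no_grad():
    outputs = model(inputs)                # list[dict], one dict per input image
instances = outputs[0]["instances"].to("cpu")
boxes = instances.pred_boxes.tensor        # Nx4 tensor of box coordinates
scores = instances.scores                  # N confidence scores
classes = instances.pred_classes           # N predicted class labels
```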
### Partially execute a model:
Sometimes you may want to obtain an intermediate tensor inside a model.
Since there are typically hundreds of intermediate tensors, there isn't an API that provides you
the intermediate result you need.
You have the following options:
1. Write a (sub)model. Following the [tutorial](./write-models.md), you can
rewrite a model component (e.g. a head of a model), such that it
does the same thing as the existing component, but returns the output
you need.
2. Partially execute a model. You can create the model as usual,
but use custom code to execute it instead of its `forward()`. For example,
the following code obtains mask features before mask head.
```python
from detectron2.modeling import build_model
from detectron2.structures import ImageList

images = ImageList.from_tensors(...)  # preprocessed input tensor
model = build_model(cfg)
features = model.backbone(images.tensor)
proposals, _ = model.proposal_generator(images, features)
instances = model.roi_heads._forward_box(features, proposals)
mask_features = [features[f] for f in model.roi_heads.in_features]
mask_features = model.roi_heads.mask_pooler(mask_features, [x.pred_boxes for x in instances])
```
Note that both options require you to read the existing forward code to understand
how to write code to obtain the outputs you need.
# Training
From the previous tutorials, you may now have a custom model and data loader.
You are free to create your own optimizer, and write the training logic: it's
usually easy with PyTorch, and allows researchers to see the entire training
logic more clearly and have full control.
One such example is provided in [tools/plain_train_net.py](../../tools/plain_train_net.py).
We also provide a standardized "trainer" abstraction with a
[minimal hook system](../modules/engine.html#detectron2.engine.HookBase)
that helps simplify the standard types of training.
You can use
[SimpleTrainer().train()](../modules/engine.html#detectron2.engine.SimpleTrainer)
which provides minimal abstraction for single-cost single-optimizer single-data-source training.
The builtin `train_net.py` script uses
[DefaultTrainer().train()](../modules/engine.html#detectron2.engine.defaults.DefaultTrainer),
which includes more standard default behaviors that one might want to opt in to,
including default configurations for learning rate schedule,
logging, evaluation, checkpointing etc.
This also means that it's less likely to support some non-standard behavior
you might want during research.
To customize the training loops, you can:
1. If your customization is similar to what `DefaultTrainer` is already doing,
you can change behavior of `DefaultTrainer` by overwriting [its methods](../modules/engine.html#detectron2.engine.defaults.DefaultTrainer)
in a subclass, like what [tools/train_net.py](../../tools/train_net.py) does.
2. If you need something very novel, you can start from [tools/plain_train_net.py](../../tools/plain_train_net.py) to implement them yourself.
### Logging of Metrics
During training, metrics are saved to a centralized [EventStorage](../modules/utils.html#detectron2.utils.events.EventStorage).
You can use the following code to access it and log metrics to it:
```
from detectron2.utils.events import get_event_storage

# inside the model:
if self.training:
    value = ...  # compute the value from inputs
    storage = get_event_storage()
    storage.put_scalar("some_accuracy", value)
```
Refer to its documentation for more details.
Metrics are then saved to various destinations with [EventWriter](../modules/utils.html#module-detectron2.utils.events).
DefaultTrainer enables a few `EventWriter` with default configurations.
See above for how to customize them.
# Write Models
If you are trying to do something completely new, you may wish to implement
a model entirely from scratch within detectron2. However, in many situations you may
be interested in modifying or extending some components of an existing model.
Therefore, we also provide a registration mechanism that lets you override the
behavior of certain internal components of standard models.
For example, to add a new backbone, import this code in your code:
```python
import torch.nn as nn

from detectron2.modeling import BACKBONE_REGISTRY, Backbone, ShapeSpec

@BACKBONE_REGISTRY.register()
class ToyBackBone(Backbone):
    def __init__(self, cfg, input_shape):
        super().__init__()
        # create your own backbone
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=16, padding=3)

    def forward(self, image):
        return {"conv1": self.conv1(image)}

    def output_shape(self):
        return {"conv1": ShapeSpec(channels=64, stride=16)}
```
Then, you can use `cfg.MODEL.BACKBONE.NAME = 'ToyBackBone'` in your config object.
`build_model(cfg)` will then call your `ToyBackBone` instead.
As another example, to add new abilities to the ROI heads in the Generalized R-CNN meta-architecture,
you can implement a new
[ROIHeads](../modules/modeling.html#detectron2.modeling.ROIHeads) subclass and put it in the `ROI_HEADS_REGISTRY`.
See [densepose in detectron2](../../projects/DensePose)
and [meshrcnn](https://github.com/facebookresearch/meshrcnn)
for examples that implement new ROIHeads to perform new tasks.
And [projects/](../../projects/)
contains more examples that implement different architectures.
A complete list of registries can be found in [API documentation](../modules/modeling.html#model-registries).
You can register components in these registries to customize different parts of a model, or the
entire model.
# DensePose in Detectron2
**Dense Human Pose Estimation In The Wild**
_Rıza Alp Güler, Natalia Neverova, Iasonas Kokkinos_
[[`densepose.org`](https://densepose.org)] [[`arXiv`](https://arxiv.org/abs/1802.00434)] [[`BibTeX`](#CitingDensePose)]
Dense human pose estimation aims at mapping all human pixels of an RGB image to the 3D surface of the human body.
<div align="center">
<img src="https://drive.google.com/uc?export=view&id=1qfSOkpueo1kVZbXOuQJJhyagKjMgepsz" width="700px" />
</div>
In this repository, we provide the code to train and evaluate DensePose-RCNN. We also provide tools to visualize
DensePose annotation and results.
# Quick Start
See [Getting Started](doc/GETTING_STARTED.md).
# Model Zoo and Baselines
We provide a number of baseline results and trained models available for download. See [Model Zoo](doc/MODEL_ZOO.md) for details.
# License
Detectron2 is released under the [Apache 2.0 license](../../LICENSE)
## <a name="CitingDensePose"></a>Citing DensePose
If you use DensePose, please take the references from the following BibTeX entries:
For DensePose with estimated confidences:
```
@InProceedings{Neverova2019DensePoseConfidences,
title = {Correlated Uncertainty for Learning Dense Correspondences from Noisy Labels},
author = {Neverova, Natalia and Novotny, David and Vedaldi, Andrea},
journal = {Advances in Neural Information Processing Systems},
year = {2019},
}
```
For the original DensePose:
```
@InProceedings{Guler2018DensePose,
title={DensePose: Dense Human Pose Estimation In The Wild},
author={R{\i}za Alp G\"uler, Natalia Neverova, Iasonas Kokkinos},
journal={The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2018}
}
```
#!/usr/bin/env python3
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
import argparse
import glob
import logging
import os
import pickle
import sys
from typing import Any, ClassVar, Dict, List
import torch
from detectron2.config import get_cfg
from detectron2.data.detection_utils import read_image
from detectron2.engine.defaults import DefaultPredictor
from detectron2.structures.boxes import BoxMode
from detectron2.structures.instances import Instances
from detectron2.utils.logger import setup_logger
from densepose import add_densepose_config
from densepose.utils.logger import verbosity_to_level
from densepose.vis.base import CompoundVisualizer
from densepose.vis.bounding_box import ScoredBoundingBoxVisualizer
from densepose.vis.densepose import (
    DensePoseResultsContourVisualizer,
    DensePoseResultsFineSegmentationVisualizer,
    DensePoseResultsUVisualizer,
    DensePoseResultsVVisualizer,
)
from densepose.vis.extractor import CompoundExtractor, create_extractor
DOC = """Apply Net - a tool to print / visualize DensePose results
"""
LOGGER_NAME = "apply_net"
logger = logging.getLogger(LOGGER_NAME)
_ACTION_REGISTRY: Dict[str, "Action"] = {}
class Action(object):
    @classmethod
    def add_arguments(cls: type, parser: argparse.ArgumentParser):
        parser.add_argument(
            "-v",
            "--verbosity",
            action="count",
            help="Verbose mode. Multiple -v options increase the verbosity.",
        )


def register_action(cls: type):
    """
    Decorator for action classes to automate action registration
    """
    global _ACTION_REGISTRY
    _ACTION_REGISTRY[cls.COMMAND] = cls
    return cls
class InferenceAction(Action):
    @classmethod
    def add_arguments(cls: type, parser: argparse.ArgumentParser):
        super(InferenceAction, cls).add_arguments(parser)
        parser.add_argument("cfg", metavar="<config>", help="Config file")
        parser.add_argument("model", metavar="<model>", help="Model file")
        parser.add_argument("input", metavar="<input>", help="Input data")
        parser.add_argument(
            "--opts",
            help="Modify config options using the command-line 'KEY VALUE' pairs",
            default=[],
            nargs=argparse.REMAINDER,
        )

    @classmethod
    def execute(cls: type, args: argparse.Namespace):
        logger.info(f"Loading config from {args.cfg}")
        opts = []
        cfg = cls.setup_config(args.cfg, args.model, args, opts)
        logger.info(f"Loading model from {args.model}")
        predictor = DefaultPredictor(cfg)
        logger.info(f"Loading data from {args.input}")
        file_list = cls._get_input_file_list(args.input)
        if len(file_list) == 0:
            logger.warning(f"No input images for {args.input}")
            return
        context = cls.create_context(args)
        for file_name in file_list:
            img = read_image(file_name, format="BGR")  # predictor expects BGR image.
            with torch.no_grad():
                outputs = predictor(img)["instances"]
                cls.execute_on_outputs(context, {"file_name": file_name, "image": img}, outputs)
        cls.postexecute(context)

    @classmethod
    def setup_config(
        cls: type, config_fpath: str, model_fpath: str, args: argparse.Namespace, opts: List[str]
    ):
        cfg = get_cfg()
        add_densepose_config(cfg)
        cfg.merge_from_file(config_fpath)
        cfg.merge_from_list(args.opts)
        if opts:
            cfg.merge_from_list(opts)
        cfg.MODEL.WEIGHTS = model_fpath
        cfg.freeze()
        return cfg

    @classmethod
    def _get_input_file_list(cls: type, input_spec: str):
        if os.path.isdir(input_spec):
            file_list = [
                os.path.join(input_spec, fname)
                for fname in os.listdir(input_spec)
                if os.path.isfile(os.path.join(input_spec, fname))
            ]
        elif os.path.isfile(input_spec):
            file_list = [input_spec]
        else:
            file_list = glob.glob(input_spec)
        return file_list
@register_action
class DumpAction(InferenceAction):
    """
    Dump action that outputs results to a pickle file
    """

    COMMAND: ClassVar[str] = "dump"

    @classmethod
    def add_parser(cls: type, subparsers: argparse._SubParsersAction):
        parser = subparsers.add_parser(cls.COMMAND, help="Dump model outputs to a file.")
        cls.add_arguments(parser)
        parser.set_defaults(func=cls.execute)

    @classmethod
    def add_arguments(cls: type, parser: argparse.ArgumentParser):
        super(DumpAction, cls).add_arguments(parser)
        parser.add_argument(
            "--output",
            metavar="<dump_file>",
            default="results.pkl",
            help="File name to save dump to",
        )

    @classmethod
    def execute_on_outputs(
        cls: type, context: Dict[str, Any], entry: Dict[str, Any], outputs: Instances
    ):
        image_fpath = entry["file_name"]
        logger.info(f"Processing {image_fpath}")
        result = {"file_name": image_fpath}
        if outputs.has("scores"):
            result["scores"] = outputs.get("scores").cpu()
        if outputs.has("pred_boxes"):
            result["pred_boxes_XYXY"] = outputs.get("pred_boxes").tensor.cpu()
            if outputs.has("pred_densepose"):
                boxes_XYWH = BoxMode.convert(
                    result["pred_boxes_XYXY"], BoxMode.XYXY_ABS, BoxMode.XYWH_ABS
                )
                result["pred_densepose"] = outputs.get("pred_densepose").to_result(boxes_XYWH)
        context["results"].append(result)

    @classmethod
    def create_context(cls: type, args: argparse.Namespace):
        context = {"results": [], "out_fname": args.output}
        return context

    @classmethod
    def postexecute(cls: type, context: Dict[str, Any]):
        out_fname = context["out_fname"]
        out_dir = os.path.dirname(out_fname)
        if len(out_dir) > 0 and not os.path.exists(out_dir):
            os.makedirs(out_dir)
        with open(out_fname, "wb") as hFile:
            pickle.dump(context["results"], hFile)
            logger.info(f"Output saved to {out_fname}")
@register_action
class ShowAction(InferenceAction):
    """
    Show action that visualizes selected entries on an image
    """

    COMMAND: ClassVar[str] = "show"
    VISUALIZERS: ClassVar[Dict[str, object]] = {
        "dp_contour": DensePoseResultsContourVisualizer,
        "dp_segm": DensePoseResultsFineSegmentationVisualizer,
        "dp_u": DensePoseResultsUVisualizer,
        "dp_v": DensePoseResultsVVisualizer,
        "bbox": ScoredBoundingBoxVisualizer,
    }

    @classmethod
    def add_parser(cls: type, subparsers: argparse._SubParsersAction):
        parser = subparsers.add_parser(cls.COMMAND, help="Visualize selected entries")
        cls.add_arguments(parser)
        parser.set_defaults(func=cls.execute)

    @classmethod
    def add_arguments(cls: type, parser: argparse.ArgumentParser):
        super(ShowAction, cls).add_arguments(parser)
        parser.add_argument(
            "visualizations",
            metavar="<visualizations>",
            help="Comma separated list of visualizations, possible values: "
            "[{}]".format(",".join(sorted(cls.VISUALIZERS.keys()))),
        )
        parser.add_argument(
            "--min_score",
            metavar="<score>",
            default=0.8,
            type=float,
            help="Minimum detection score to visualize",
        )
        parser.add_argument(
            "--nms_thresh", metavar="<threshold>", default=None, type=float, help="NMS threshold"
        )
        parser.add_argument(
            "--output",
            metavar="<image_file>",
            default="outputres.png",
            help="File name to save output to",
        )

    @classmethod
    def setup_config(
        cls: type, config_fpath: str, model_fpath: str, args: argparse.Namespace, opts: List[str]
    ):
        opts.append("MODEL.ROI_HEADS.SCORE_THRESH_TEST")
        opts.append(str(args.min_score))
        if args.nms_thresh is not None:
            opts.append("MODEL.ROI_HEADS.NMS_THRESH_TEST")
            opts.append(str(args.nms_thresh))
        cfg = super(ShowAction, cls).setup_config(config_fpath, model_fpath, args, opts)
        return cfg

    @classmethod
    def execute_on_outputs(
        cls: type, context: Dict[str, Any], entry: Dict[str, Any], outputs: Instances
    ):
        import cv2
        import numpy as np

        visualizer = context["visualizer"]
        extractor = context["extractor"]
        image_fpath = entry["file_name"]
        logger.info(f"Processing {image_fpath}")
        image = cv2.cvtColor(entry["image"], cv2.COLOR_BGR2GRAY)
        image = np.tile(image[:, :, np.newaxis], [1, 1, 3])
        data = extractor(outputs)
        image_vis = visualizer.visualize(image, data)
        entry_idx = context["entry_idx"] + 1
        out_fname = cls._get_out_fname(entry_idx, context["out_fname"])
        out_dir = os.path.dirname(out_fname)
        if len(out_dir) > 0 and not os.path.exists(out_dir):
            os.makedirs(out_dir)
        cv2.imwrite(out_fname, image_vis)
        logger.info(f"Output saved to {out_fname}")
        context["entry_idx"] += 1

    @classmethod
    def postexecute(cls: type, context: Dict[str, Any]):
        pass

    @classmethod
    def _get_out_fname(cls: type, entry_idx: int, fname_base: str):
        base, ext = os.path.splitext(fname_base)
        return base + ".{0:04d}".format(entry_idx) + ext

    @classmethod
    def create_context(cls: type, args: argparse.Namespace) -> Dict[str, Any]:
        vis_specs = args.visualizations.split(",")
        visualizers = []
        extractors = []
        for vis_spec in vis_specs:
            vis = cls.VISUALIZERS[vis_spec]()
            visualizers.append(vis)
            extractor = create_extractor(vis)
            extractors.append(extractor)
        visualizer = CompoundVisualizer(visualizers)
        extractor = CompoundExtractor(extractors)
        context = {
            "extractor": extractor,
            "visualizer": visualizer,
            "out_fname": args.output,
            "entry_idx": 0,
        }
        return context
def create_argument_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description=DOC,
        formatter_class=lambda prog: argparse.HelpFormatter(prog, max_help_position=120),
    )
    parser.set_defaults(func=lambda _: parser.print_help(sys.stdout))
    subparsers = parser.add_subparsers(title="Actions")
    for _, action in _ACTION_REGISTRY.items():
        action.add_parser(subparsers)
    return parser


def main():
    parser = create_argument_parser()
    args = parser.parse_args()
    verbosity = args.verbosity if hasattr(args, "verbosity") else None
    global logger
    logger = setup_logger(name=LOGGER_NAME)
    logger.setLevel(verbosity_to_level(verbosity))
    args.func(args)


if __name__ == "__main__":
    main()
MODEL:
  META_ARCHITECTURE: "GeneralizedRCNN"
  BACKBONE:
    NAME: "build_resnet_fpn_backbone"
  RESNETS:
    OUT_FEATURES: ["res2", "res3", "res4", "res5"]
  FPN:
    IN_FEATURES: ["res2", "res3", "res4", "res5"]
  ANCHOR_GENERATOR:
    SIZES: [[32], [64], [128], [256], [512]]  # One size for each in feature map
    ASPECT_RATIOS: [[0.5, 1.0, 2.0]]  # Three aspect ratios (same for all in feature maps)
  RPN:
    IN_FEATURES: ["p2", "p3", "p4", "p5", "p6"]
    PRE_NMS_TOPK_TRAIN: 2000  # Per FPN level
    PRE_NMS_TOPK_TEST: 1000  # Per FPN level
    # Detectron1 uses 2000 proposals per-batch,
    # (See "modeling/rpn/rpn_outputs.py" for details of this legacy issue)
    # which is approximately 1000 proposals per-image since the default batch size for FPN is 2.
    POST_NMS_TOPK_TRAIN: 1000
    POST_NMS_TOPK_TEST: 1000
  DENSEPOSE_ON: True
  ROI_HEADS:
    NAME: "DensePoseROIHeads"
    IN_FEATURES: ["p2", "p3", "p4", "p5"]
    NUM_CLASSES: 1
  ROI_BOX_HEAD:
    NAME: "FastRCNNConvFCHead"
    NUM_FC: 2
    POOLER_RESOLUTION: 7
    POOLER_SAMPLING_RATIO: 2
    POOLER_TYPE: "ROIAlign"
  ROI_DENSEPOSE_HEAD:
    NAME: "DensePoseV1ConvXHead"
    POOLER_TYPE: "ROIAlign"
    NUM_COARSE_SEGM_CHANNELS: 2
DATASETS:
  TRAIN: ("densepose_coco_2014_train", "densepose_coco_2014_valminusminival")
  TEST: ("densepose_coco_2014_minival",)
SOLVER:
  IMS_PER_BATCH: 16
  BASE_LR: 0.01
  STEPS: (60000, 80000)
  MAX_ITER: 90000
  WARMUP_FACTOR: 0.1
INPUT:
  MIN_SIZE_TRAIN: (640, 672, 704, 736, 768, 800)