Unverified Commit 8518d053 authored by pkulzc, committed by GitHub

Open source MnasFPN and minor fixes to OD API (#8484)

310447280  by lzc:

    Internal change

310420845  by Zhichao Lu:

    Open source the internal Context RCNN code.

--
310362339  by Zhichao Lu:

    Internal change

310259448  by lzc:

    Update required TF version for OD API.

--
310252159  by Zhichao Lu:

    Port patch_ops_test to TF1/TF2 as well as TPUs.

--
310247180  by Zhichao Lu:

    Ignore keypoint heatmap loss in the regions/bounding boxes with target keypoint
    class but no valid keypoint annotations.

--
310178294  by Zhichao Lu:

    Opensource MnasFPN
    https://arxiv.org/abs/1912.01106

--
310094222  by lzc:

    Internal changes.

--
310085250  by lzc:

    Internal Change.

--
310016447  by huizhongc:

    Remove unrecognized classes from labeled_classes.

--
310009470  by rathodv:

    Mark batcher.py as TF1 only.

--
310001984  by rathodv:

    Update core/preprocessor.py to be compatible with TF1/TF2.

--
309455035  by Zhichao Lu:

    Makes the freezable_batch_norm_test run w/ v2 behavior.

    The main change is that with v2 behavior, updates happen right away when running batchnorm in training mode. So we need to restore the weights between batchnorm calls to make sure the numerical checks all start from the same place.
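    A minimal sketch of the behavior described above (illustrative only, not the test's code), assuming standard tf.keras layers: the moving statistics change as soon as the layer runs with training=True, so weights are restored between calls.

    ```python
    import tensorflow as tf

    bn = tf.keras.layers.BatchNormalization()
    x = tf.random.normal([4, 8])
    _ = bn(x, training=False)        # build the layer and its variables
    initial_weights = bn.get_weights()

    _ = bn(x, training=True)         # moving mean/variance update immediately under v2
    bn.set_weights(initial_weights)  # restore so the next numerical check starts fresh
    ```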

--
309425881  by Zhichao Lu:

    Make TF1/TF2 optimizer builder tests explicit.

--
309408646  by Zhichao Lu:

    Make dataset builder tests TF1 and TF2 compatible.

--
309246305  by Zhichao Lu:

    Added the functionality of combining the person keypoints and object detection
    annotations in the binary that converts the COCO raw data to TfRecord.

--
309125076  by Zhichao Lu:

    Convert target_assigner_utils to TF1/TF2.

--
308966359  by huizhongc:

    Support SSD training with partially labeled groundtruth.

--
308937159  by rathodv:

    Update core/target_assigner.py to be compatible with TF1/TF2.

--
308774302  by Zhichao Lu:

    Internal

--
308732860  by rathodv:

    Make core/prefetcher.py  compatible with TF1 only.

--
308726984  by rathodv:

    Update core/multiclass_nms_test.py to be TF1/TF2 compatible.

--
308714718  by rathodv:

    Update core/region_similarity_calculator_test.py to be TF1/TF2 compatible.

--
308707960  by rathodv:

    Update core/minibatch_sampler_test.py to be TF1/TF2 compatible.

--
308700595  by rathodv:

    Update core/losses_test.py to be TF1/TF2 compatible and remove losses_test_v2.py

--
308361472  by rathodv:

    Update core/matcher_test.py to be TF1/TF2 compatible.

--
308335846  by Zhichao Lu:

    Updated the COCO evaluation logic and populated the groundtruth area
    information through. This change matches the groundtruth format expected by
    the COCO keypoint evaluation.

--
308256924  by rathodv:

    Update core/keypoints_ops_test.py to be TF1/TF2 compatible.

--
308256826  by rathodv:

    Update class_agnostic_nms_test.py to be TF1/TF2 compatible.

--
308256112  by rathodv:

    Update box_list_ops_test.py to be TF1/TF2 compatible.

--
308159360  by Zhichao Lu:

    Internal change

308145008  by Zhichao Lu:

    Added 'image/class/confidence' field in the TFExample decoder.

--
307651875  by rathodv:

    Refactor core/box_list.py to support TF1/TF2.

--
307651798  by rathodv:

    Modify box_coder.py base class to work with TF1/TF2.

--
307651652  by rathodv:

    Refactor core/balanced_positive_negative_sampler.py to support TF1/TF2.

--
307651571  by rathodv:

    Modify BoxCoders tests to use test_case:execute method to allow testing with TF1.X and TF2.X

--
307651480  by rathodv:

    Modify Matcher tests to use test_case:execute method to allow testing with TF1.X and TF2.X

--
307651409  by rathodv:

    Modify AnchorGenerator tests to use test_case:execute method to allow testing with TF1.X and TF2.X

--
307651314  by rathodv:

    Refactor model_builder to support TF1 or TF2 models based on TensorFlow version.

--
307092053  by Zhichao Lu:

    Use manager to save checkpoint.
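    A hedged sketch of the pattern (the model and directory below are illustrative, not from this change), using tf.train.CheckpointManager instead of calling Checkpoint.save() directly:

    ```python
    import tensorflow as tf

    model = tf.keras.Sequential([tf.keras.layers.Dense(4)])  # placeholder model
    optimizer = tf.keras.optimizers.SGD()
    ckpt = tf.train.Checkpoint(model=model, optimizer=optimizer)
    manager = tf.train.CheckpointManager(ckpt, directory='/tmp/od_ckpt',
                                         max_to_keep=3)
    save_path = manager.save()  # the manager also rotates older checkpoints
    ```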

--
307071352  by ronnyvotel:

    Fixing keypoint visibilities. Now by default, the visibility is marked True if the keypoint is labeled (regardless of whether it is visible or not).
    Also, if visibilities are not present in the dataset, they will be created based on whether the keypoint coordinates are finite (vis = True) or NaN (vis = False).
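    A minimal sketch of that rule (illustrative only; the helper name below is hypothetical and not the utility added by this change):

    ```python
    import tensorflow as tf

    def visibilities_from_keypoints(keypoints):
      """keypoints: [num_instances, num_keypoints, 2] float tensor of (y, x)."""
      # Finite coordinates -> visible (True); NaN coordinates -> not visible (False).
      return tf.logical_not(tf.reduce_any(tf.math.is_nan(keypoints), axis=2))
    ```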

--
307069557  by Zhichao Lu:

    Internal change to add few fields related to postprocessing parameters in
    center_net.proto and populate those parameters to the keypoint postprocessing
    functions.

--
307012091  by Zhichao Lu:

    Make Adam Optimizer's epsilon proto configurable.

    Potential issue: tf.compat.v1's AdamOptimizer has a default epsilon of 1e-08 ([doc-link](https://www.tensorflow.org/api_docs/python/tf/compat/v1/train/AdamOptimizer)), whereas tf.keras's Adam has a default epsilon of 1e-07 ([doc-link](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Adam)).
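    A small sketch of the mismatch (defaults taken from the linked docs); exposing epsilon in the proto lets both builders agree on a value:

    ```python
    import tensorflow as tf

    v1_adam = tf.compat.v1.train.AdamOptimizer()   # epsilon defaults to 1e-08
    v2_adam = tf.keras.optimizers.Adam()           # epsilon defaults to 1e-07
    # With the proto field configurable, both can be built with the same value, e.g.:
    v2_adam_matched = tf.keras.optimizers.Adam(epsilon=1e-08)
    ```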

--
306858598  by Zhichao Lu:

    Internal changes to update the CenterNet model:
    1) Modified eval job loss computation to avoid averaging over batches with zero loss.
    2) Updated the CenterNet keypoint heatmap target assigner to apply box size to the heatmap Gaussian standard deviation.
    3) Updated the CenterNet meta arch keypoint loss computation to apply weights outside of the loss function.

--
306731223  by jonathanhuang:

    Internal change.

--
306549183  by rathodv:

    Internal Update.

--
306542930  by rathodv:

    Internal Update

--
306322697  by rathodv:

    Internal.

--
305345036  by Zhichao Lu:

    Adding COCO Camera Traps Json to tf.Example beam code

--
304104869  by lzc:

    Internal changes.

--
304068971  by jonathanhuang:

    Internal change.

--
304050469  by Zhichao Lu:

    Internal change.

--
303880642  by huizhongc:

    Support parsing partially labeled groundtruth.

--
303841743  by Zhichao Lu:

    Deprecate nms_on_host in SSDMetaArch.

--
303803204  by rathodv:

    Internal change.

--
303793895  by jonathanhuang:

    Internal change.

--
303467631  by rathodv:

    Py3 update for detection inference test.

--
303444542  by rathodv:

    Py3 update to metrics module

--
303421960  by rathodv:

    Update json_utils to python3.

--
302787583  by ronnyvotel:

    Coco results generator for submission to the coco test server.

--
302719091  by Zhichao Lu:

    Internal change to add the ResNet50 image feature extractor for CenterNet model.

--
302116230  by Zhichao Lu:

    Added the functions to overlay the heatmaps with images in visualization util
    library.

--
301888316  by Zhichao Lu:

    Fix checkpoint_filepath not defined error.

--
301840312  by ronnyvotel:

    Adding keypoint_scores to visualizations.

--
301683475  by ronnyvotel:

    Introducing the ability to preprocess `keypoint_visibilities`.

    Some data augmentation ops such as random crop can filter instances and keypoints. It's important to also filter keypoint visibilities, so that the groundtruth tensors are always in alignment.

--
301532344  by Zhichao Lu:

    Don't use tf.divide since "Quantization not yet supported for op: DIV"

--
301480348  by ronnyvotel:

    Introducing keypoint evaluation into model lib v2.
    Also, making some fixes to coco keypoint evaluation.

--
301454018  by Zhichao Lu:

    Added the image summary to visualize the train/eval input images and eval's
    prediction/groundtruth side-by-side image.

--
301317527  by Zhichao Lu:

    Updated the random_absolute_pad_image function in the preprocessor library to
    support the keypoints argument.

--
301300324  by Zhichao Lu:

    Apply name change(experimental_run_v2 -> run) for all callers in Tensorflow.

--
301297115  by ronnyvotel:

    Utility function for setting keypoint visibilities based on keypoint coordinates.

--
301248885  by Zhichao Lu:

    Allow MultiWorkerMirroredStrategy (MWMS) use by adding checkpoint handling with temporary directories in model_lib_v2. Added the missing WeakKeyDictionary cfer_fn_cache field in CollectiveAllReduceStrategyExtended.

--
301224559  by Zhichao Lu:

    1) Fixes model_lib to also use keypoints while preparing model groundtruth.
    2) Tests model_lib with newly added keypoint metrics config.

--
300836556  by Zhichao Lu:

    Internal changes to add keypoint estimation parameters in CenterNet proto.

--
300795208  by Zhichao Lu:

    Updated the eval_util library to populate the keypoint groundtruth to
    eval_dict.

--
299474766  by Zhichao Lu:

    Modifies eval_util to create Keypoint Evaluator objects when configured in eval config.

--
299453920  by Zhichao Lu:

    Add swish activation as a hyperparams option.

--
299240093  by ronnyvotel:

    Keypoint postprocessing for CenterNetMetaArch.

--
299176395  by Zhichao Lu:

    Internal change.

--
299135608  by Zhichao Lu:

    Internal changes to refactor the CenterNet model in preparation for keypoint estimation tasks.

--
298915482  by Zhichao Lu:

    Make dataset_builder aware of input_context for distributed training.

--
298713595  by Zhichao Lu:

    Handling data with negative size boxes.

--
298695964  by Zhichao Lu:

    Expose change_coordinate_frame as a config parameter; fix multiclass_scores optional field.

--
298492150  by Zhichao Lu:

    Rename optimizer_builder_test_v2.py -> optimizer_builder_v2_test.py

--
298476471  by Zhichao Lu:

    Internal changes to support CenterNet keypoint estimation.

--
298365851  by ronnyvotel:

    Fixing a bug where groundtruth_keypoint_weights were being padded with a dynamic dimension.

--
297843700  by Zhichao Lu:

    Internal change.

--
297706988  by lzc:

    Internal change.

--
297705287  by ronnyvotel:

    Creating the "snapping" behavior in CenterNet, where regressed keypoints are refined with updated candidate keypoints from a heatmap.

--
297700447  by Zhichao Lu:

    Improve checkpoint checking logic with TF2 loop.

--
297686094  by Zhichao Lu:

    Convert "import tensorflow as tf" to "import tensorflow.compat.v1".

--
297670468  by lzc:

    Internal change.

--
297241327  by Zhichao Lu:

    Convert "import tensorflow as tf" to "import tensorflow.compat.v1".

--
297205959  by Zhichao Lu:

    Internal changes to refactor the CenterNet object detection target assigner into a separate library.

--
297143806  by Zhichao Lu:

    Convert "import tensorflow as tf" to "import tensorflow.compat.v1".

--
297129625  by Zhichao Lu:

    Explicitly replace "import tensorflow" with "tensorflow.compat.v1" for TF2.x migration

--
297117070  by Zhichao Lu:

    Explicitly replace "import tensorflow" with "tensorflow.compat.v1" for TF2.x migration

--
297030190  by Zhichao Lu:

    Add configuration options for visualizing keypoint edges

--
296359649  by Zhichao Lu:

    Support DepthwiseConv2dNative (used in separable conv) in the weight equalization loss.

--
296290582  by Zhichao Lu:

    Internal change.

--
296093857  by Zhichao Lu:

    Internal changes to add general target assigner utilities.

--
295975116  by Zhichao Lu:

    Fix visualize_boxes_and_labels_on_image_array to show max_boxes_to_draw correctly.

--
295819711  by Zhichao Lu:

    Adds a flag to visualize_boxes_and_labels_on_image_array to skip the drawing of axis aligned bounding boxes.

--
295811929  by Zhichao Lu:

    Keypoint support in random_square_crop_by_scale.

--
295788458  by rathodv:

    Remove unused checkpoint to reduce repo size on github

--
295787184  by Zhichao Lu:

    Enable visualization of edges between keypoints

--
295763508  by Zhichao Lu:

    [Context RCNN] Add an option to enable/disable feature cropping in the post
    process step in the meta architecture.

--
295605344  by Zhichao Lu:

    internal change.

--
294926050  by ronnyvotel:

    Adding per-keypoint groundtruth weights. These weights are intended to be used as multipliers in a keypoint loss function.

    Groundtruth keypoint weights are constructed as follows:
    - Initialize the weight for each keypoint type based on user-specified weights in the input_reader proto
    - Mask out (i.e. make zero) all keypoint weights that are not visible.
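    A hedged sketch of that construction (illustrative only; the helper below is hypothetical, not the API's implementation):

    ```python
    import tensorflow as tf

    def groundtruth_keypoint_weights(visibilities, per_type_weights):
      """visibilities: [num_instances, num_keypoints] bool; per_type_weights: [num_keypoints]."""
      per_type = tf.cast(per_type_weights, tf.float32)
      weights = tf.ones(tf.shape(visibilities), tf.float32) * per_type  # broadcast per keypoint type
      return weights * tf.cast(visibilities, tf.float32)                # zero out invisible keypoints
    ```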

--
294829061  by lzc:

    Internal change.

--
294566503  by Zhichao Lu:

    Changed internal CenterNet Model configuration.

--
294346662  by ronnyvotel:

    Using NaN values in keypoint coordinates that are not visible.

--
294333339  by Zhichao Lu:

    Change experimental_distribute_dataset -> experimental_distribute_dataset_from_function

--
293928752  by Zhichao Lu:

    Internal change

--
293909384  by Zhichao Lu:

    Add capabilities to train 1024x1024 CenterNet models.

--
293637554  by ronnyvotel:

    Adding keypoint visibilities to TfExampleDecoder.

--
293501558  by lzc:

    Internal change.

--
293252851  by Zhichao Lu:

    Change tf.gfile.GFile to tf.io.gfile.GFile.

--
292730217  by Zhichao Lu:

    Internal change.

--
292456563  by lzc:

    Internal changes.

--
292355612  by Zhichao Lu:

    Use tf.gather and tf.scatter_nd instead of matrix ops.

--
292245265  by rathodv:

    Internal

--
291989323  by richardmunoz:

    Refactor out building a DataDecoder from building a tf.data.Dataset.

--
291950147  by Zhichao Lu:

    Flip bounding boxes in arbitrary shaped tensors.

--
291401052  by huizhongc:

    Fix multiscale grid anchor generator to allow fully convolutional inference. When exporting a model with identity_resizer as the image_resizer, there is an incorrect box offset in the detection results. We add the anchor offset to address this problem.

--
291298871  by Zhichao Lu:

    Py3 compatibility changes.

--
290957957  by Zhichao Lu:

    Hourglass feature extractor for CenterNet.

--
290564372  by Zhichao Lu:

    Internal change.

--
290155278  by rathodv:

    Remove Dataset Explorer.

--
290155153  by Zhichao Lu:

    Internal change

--
290122054  by Zhichao Lu:

    Unify the format in the faster_rcnn.proto

--
290116084  by Zhichao Lu:

    Deprecate tensorflow.contrib.

--
290100672  by Zhichao Lu:

    Update MobilenetV3 SSD candidates

--
289926392  by Zhichao Lu:

    Internal change

--
289553440  by Zhichao Lu:

    [Object Detection API] Fix the comments about the dimension of the rpn_box_encodings from 4-D to 3-D.

--
288994128  by lzc:

    Internal changes.

--
288942194  by lzc:

    Internal change.

--
288746124  by Zhichao Lu:

    Configurable channel mean/std. dev in CenterNet feature extractors.

--
288552509  by rathodv:

    Internal.

--
288541285  by rathodv:

    Internal update.

--
288396396  by Zhichao Lu:

    Make object detection import contrib explicitly

--
288255791  by rathodv:

    Internal

--
288078600  by Zhichao Lu:

    Fix model_lib_v2 test

--
287952244  by rathodv:

    Internal

--
287921774  by Zhichao Lu:

    internal change

--
287906173  by Zhichao Lu:

    internal change

--
287889407  by jonathanhuang:

    PY3 compatibility

--
287889042  by rathodv:

    Internal

--
287876178  by Zhichao Lu:

    Internal change.

--
287770490  by Zhichao Lu:

    Add CenterNet proto and builder

--
287694213  by Zhichao Lu:

    Support for running multiple steps per tf.function call.
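    A hedged sketch of the general pattern (not the model_lib_v2 code itself): trace one tf.function that runs several train steps per call to amortize per-call overhead.

    ```python
    import tensorflow as tf

    def make_multi_step_fn(train_step_fn, steps_per_call):
      @tf.function
      def run_steps(iterator):
        for _ in tf.range(steps_per_call):
          train_step_fn(next(iterator))  # one optimizer step per loop iteration
      return run_steps
    ```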

--
287377183  by jonathanhuang:

    PY3 compatibility

--
287371344  by rathodv:

    Support loading keypoint labels and ids.

--
287368213  by rathodv:

    Add protos supporting keypoint evaluation.

--
286673200  by rathodv:

    dataset_tools PY3 migration

--
286635106  by Zhichao Lu:

    Update code for upcoming tf.contrib removal

--
286479439  by Zhichao Lu:

    Internal change

--
286311711  by Zhichao Lu:

    Skeleton of context model within TFODAPI

--
286005546  by Zhichao Lu:

    Fix Faster-RCNN training when using keep_aspect_ratio_resizer with pad_to_max_dimension

--
285906400  by derekjchow:

    Internal change

--
285822795  by Zhichao Lu:

    Add CenterNet meta arch target assigners.

--
285447238  by Zhichao Lu:

    Internal changes.

--
285016927  by Zhichao Lu:

    Make _dummy_computation a tf.function. This fixes breakage caused by
    cl/284256438

--
284827274  by Zhichao Lu:

    Convert to python 3.

--
284645593  by rathodv:

    Internal change

--
284639893  by rathodv:

    Add missing documentation for keypoints in eval_util.py.

--
284323712  by Zhichao Lu:

    Internal changes.

--
284295290  by Zhichao Lu:

    Updating input config proto and dataset builder to include context fields

    Updating standard_fields and tf_example_decoder to include context features

--
284226821  by derekjchow:

    Update exporter.

--
284211030  by Zhichao Lu:

    API changes in CenterNet informed by the experiments with the hourglass network.

--
284190451  by Zhichao Lu:

    Add support for CenterNet losses in protos and builders.

--
284093961  by lzc:

    Internal changes.

--
284028174  by Zhichao Lu:

    Internal change

--
284014719  by derekjchow:

    Do not pad top_down feature maps unnecessarily.

--
284005765  by Zhichao Lu:

    Add new pad_to_multiple_resizer

--
283858233  by Zhichao Lu:

    Make target assigner work when under tf.function.

--
283836611  by Zhichao Lu:

    Make config getters more general.

--
283808990  by Zhichao Lu:

    Internal change

--
283754588  by Zhichao Lu:

    Internal changes.

--
282460301  by Zhichao Lu:

    Add ability to restore v2 style checkpoints.

--
281605842  by lzc:

    Add option to disable loss computation in OD API eval job.

--
280298212  by Zhichao Lu:

    Add backwards compatible change

--
280237857  by Zhichao Lu:

    internal change

--

PiperOrigin-RevId: 310447280
parent ac5fff19
# Lint as: python2, python3
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""SSD MobilenetV2 NAS-FPN Feature Extractor."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import collections
import functools
from six.moves import range
import tensorflow as tf
from tensorflow.contrib import slim as contrib_slim
from object_detection.meta_architectures import ssd_meta_arch
from object_detection.utils import ops
from object_detection.utils import shape_utils
from nets.mobilenet import mobilenet
from nets.mobilenet import mobilenet_v2
slim = contrib_slim
Block = collections.namedtuple(
    'Block', ['inputs', 'output_level', 'kernel_size', 'expansion_size'])

_MNASFPN_CELL_CONFIG = [
    Block(inputs=(1, 2), output_level=4, kernel_size=3, expansion_size=256),
    Block(inputs=(0, 4), output_level=3, kernel_size=3, expansion_size=128),
    Block(inputs=(5, 4), output_level=4, kernel_size=3, expansion_size=128),
    Block(inputs=(4, 3), output_level=5, kernel_size=5, expansion_size=128),
    Block(inputs=(4, 3), output_level=6, kernel_size=3, expansion_size=96),
]

MNASFPN_DEF = dict(
    feature_levels=[3, 4, 5, 6],
    spec=[_MNASFPN_CELL_CONFIG] * 4,
)
def _maybe_pad(feature, use_explicit_padding, kernel_size=3):
  return ops.fixed_padding(feature,
                           kernel_size) if use_explicit_padding else feature


# Wrapper around mobilenet.depth_multiplier
def _apply_multiplier(d, multiplier, min_depth):
  p = {'num_outputs': d}
  mobilenet.depth_multiplier(
      p, multiplier=multiplier, divisible_by=8, min_depth=min_depth)
  return p['num_outputs']
def _apply_size_dependent_ordering(input_feature, feature_level, block_level,
                                   expansion_size, use_explicit_padding,
                                   use_native_resize_op):
  """Applies Size-Dependent-Ordering when resizing feature maps.

  See https://arxiv.org/abs/1912.01106

  Args:
    input_feature: input feature map to be resized.
    feature_level: the level of the input feature.
    block_level: the desired output level for the block.
    expansion_size: the expansion size for the block.
    use_explicit_padding: Whether to use explicit padding.
    use_native_resize_op: Whether to use native resize op.

  Returns:
    A transformed feature at the desired resolution and expansion size.
  """
  padding = 'VALID' if use_explicit_padding else 'SAME'
  if feature_level >= block_level:  # Perform 1x1 then upsampling.
    node = slim.conv2d(
        input_feature,
        expansion_size, [1, 1],
        activation_fn=None,
        normalizer_fn=slim.batch_norm,
        padding=padding,
        scope='Conv1x1')
    if feature_level == block_level:
      return node
    scale = 2**(feature_level - block_level)
    if use_native_resize_op:
      input_shape = shape_utils.combined_static_and_dynamic_shape(node)
      node = tf.image.resize_nearest_neighbor(
          node, [input_shape[1] * scale, input_shape[2] * scale])
    else:
      node = ops.nearest_neighbor_upsampling(node, scale=scale)
  else:  # Perform downsampling then 1x1.
    stride = 2**(block_level - feature_level)
    node = slim.max_pool2d(
        _maybe_pad(input_feature, use_explicit_padding), [3, 3],
        stride=[stride, stride],
        padding=padding,
        scope='Downsample')
    node = slim.conv2d(
        node,
        expansion_size, [1, 1],
        activation_fn=None,
        normalizer_fn=slim.batch_norm,
        padding=padding,
        scope='Conv1x1')
  return node
def _mnasfpn_cell(feature_maps,
                  feature_levels,
                  cell_spec,
                  output_channel=48,
                  use_explicit_padding=False,
                  use_native_resize_op=False,
                  multiplier_func=None):
  """Create a MnasFPN cell.

  Args:
    feature_maps: input feature maps.
    feature_levels: levels of the feature maps.
    cell_spec: A list of Block configs.
    output_channel: Number of features for the input, output and intermediate
      feature maps.
    use_explicit_padding: Whether to use explicit padding.
    use_native_resize_op: Whether to use native resize op.
    multiplier_func: Depth-multiplier function. If None, use identity function.

  Returns:
    A transformed list of feature maps at the same resolutions as the inputs.
  """
  # This is the level where multipliers are realized.
  if multiplier_func is None:
    multiplier_func = lambda x: x
  num_outputs = len(feature_maps)
  cell_features = list(feature_maps)
  cell_levels = list(feature_levels)
  padding = 'VALID' if use_explicit_padding else 'SAME'
  for bi, block in enumerate(cell_spec):
    with tf.variable_scope('block_{}'.format(bi)):
      block_level = block.output_level
      intermediate_feature = None
      for i, inp in enumerate(block.inputs):
        with tf.variable_scope('input_{}'.format(i)):
          input_level = cell_levels[inp]
          node = _apply_size_dependent_ordering(
              cell_features[inp], input_level, block_level,
              multiplier_func(block.expansion_size), use_explicit_padding,
              use_native_resize_op)
        # Add features incrementally to avoid producing AddN, which doesn't
        # play well with TfLite.
        if intermediate_feature is None:
          intermediate_feature = node
        else:
          intermediate_feature += node
      node = tf.nn.relu6(intermediate_feature)
      node = slim.separable_conv2d(
          _maybe_pad(node, use_explicit_padding, block.kernel_size),
          multiplier_func(output_channel),
          block.kernel_size,
          activation_fn=None,
          normalizer_fn=slim.batch_norm,
          padding=padding,
          scope='SepConv')
      cell_features.append(node)
      cell_levels.append(block_level)

  # Cell-wide residuals.
  out_idx = range(len(cell_features) - num_outputs, len(cell_features))
  for in_i, out_i in enumerate(out_idx):
    if cell_features[out_i].shape.as_list(
    ) == cell_features[in_i].shape.as_list():
      cell_features[out_i] += cell_features[in_i]

  return cell_features[-num_outputs:]
def mnasfpn(feature_maps,
            head_def,
            output_channel=48,
            use_explicit_padding=False,
            use_native_resize_op=False,
            multiplier_func=None):
  """Create the MnasFPN head given head_def."""
  features = feature_maps
  for ci, cell_spec in enumerate(head_def['spec']):
    with tf.variable_scope('cell_{}'.format(ci)):
      features = _mnasfpn_cell(features, head_def['feature_levels'], cell_spec,
                               output_channel, use_explicit_padding,
                               use_native_resize_op, multiplier_func)
  return features


def training_scope(l2_weight_decay=1e-4, is_training=None):
  """Arg scope for training MnasFPN."""
  with slim.arg_scope(
      [slim.conv2d],
      weights_initializer=tf.initializers.he_normal(),
      weights_regularizer=slim.l2_regularizer(l2_weight_decay)), \
      slim.arg_scope(
          [slim.separable_conv2d],
          weights_initializer=tf.initializers.truncated_normal(
              stddev=0.536),  # He_normal for 3x3 depthwise kernel.
          weights_regularizer=slim.l2_regularizer(l2_weight_decay)), \
      slim.arg_scope([slim.batch_norm],
                     is_training=is_training,
                     epsilon=0.01,
                     decay=0.99,
                     center=True,
                     scale=True) as s:
    return s
class SSDMobileNetV2MnasFPNFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
  """SSD Feature Extractor using MobilenetV2 MnasFPN features."""

  def __init__(self,
               is_training,
               depth_multiplier,
               min_depth,
               pad_to_multiple,
               conv_hyperparams_fn,
               fpn_min_level=3,
               fpn_max_level=6,
               additional_layer_depth=48,
               head_def=None,
               reuse_weights=None,
               use_explicit_padding=False,
               use_depthwise=False,
               use_native_resize_op=False,
               override_base_feature_extractor_hyperparams=False,
               data_format='channels_last'):
    """SSD MnasFPN feature extractor based on Mobilenet v2 architecture.

    See https://arxiv.org/abs/1912.01106

    Args:
      is_training: whether the network is in training mode.
      depth_multiplier: float depth multiplier for feature extractor.
      min_depth: minimum feature extractor depth.
      pad_to_multiple: the nearest multiple to zero pad the input height and
        width dimensions to.
      conv_hyperparams_fn: A function to construct tf slim arg_scope for conv2d
        and separable_conv2d ops in the layers that are added on top of the
        base feature extractor.
      fpn_min_level: the highest resolution feature map to use in MnasFPN.
        Currently the only valid value is 3.
      fpn_max_level: the smallest resolution feature map to construct or use in
        MnasFPN. Currently the only valid value is 6.
      additional_layer_depth: additional feature map layer channel depth for
        NAS-FPN.
      head_def: A dictionary specifying the MnasFPN head architecture. Default
        uses MNASFPN_DEF.
      reuse_weights: whether to reuse variables. Default is None.
      use_explicit_padding: Whether to use explicit padding when extracting
        features. Default is False.
      use_depthwise: Whether to use depthwise convolutions. Default is False.
      use_native_resize_op: Whether to use native resize op. Default is False.
      override_base_feature_extractor_hyperparams: Whether to override
        hyperparameters of the base feature extractor with the one from
        `conv_hyperparams_fn`.
      data_format: The ordering of the dimensions in the inputs. The valid
        values are {'channels_first', 'channels_last'}.
    """
    super(SSDMobileNetV2MnasFPNFeatureExtractor, self).__init__(
        is_training=is_training,
        depth_multiplier=depth_multiplier,
        min_depth=min_depth,
        pad_to_multiple=pad_to_multiple,
        conv_hyperparams_fn=conv_hyperparams_fn,
        reuse_weights=reuse_weights,
        use_explicit_padding=use_explicit_padding,
        use_depthwise=use_depthwise,
        override_base_feature_extractor_hyperparams=(
            override_base_feature_extractor_hyperparams))
    if fpn_min_level != 3 or fpn_max_level != 6:
      raise ValueError('Min and max levels of MnasFPN must be 3 and 6 for now.')
    self._fpn_min_level = fpn_min_level
    self._fpn_max_level = fpn_max_level
    self._fpn_layer_depth = additional_layer_depth
    self._head_def = head_def if head_def else MNASFPN_DEF
    self._data_format = data_format
    self._use_native_resize_op = use_native_resize_op

  def preprocess(self, resized_inputs):
    """SSD preprocessing.

    Maps pixel values to the range [-1, 1].

    Args:
      resized_inputs: a [batch, height, width, channels] float tensor
        representing a batch of images.

    Returns:
      preprocessed_inputs: a [batch, height, width, channels] float tensor
        representing a batch of images.
    """
    return (2.0 / 255.0) * resized_inputs - 1.0
  def _verify_config(self, inputs):
    """Verifies the MnasFPN config against its inputs."""
    num_inputs = len(inputs)
    assert len(self._head_def['feature_levels']) == num_inputs

    base_width = inputs[0].shape.as_list(
    )[1] * 2**self._head_def['feature_levels'][0]
    for i in range(1, num_inputs):
      width = inputs[i].shape.as_list()[1]
      level = self._head_def['feature_levels'][i]
      expected_width = base_width // 2**level
      if width != expected_width:
        raise ValueError(
            'Resolution of input {} does not match its level {}.'.format(
                i, level))

    for cell_spec in self._head_def['spec']:
      # The last K nodes in a cell are the inputs to the next cell. Assert that
      # their feature maps are at the right level.
      for i in range(num_inputs):
        if cell_spec[-num_inputs +
                     i].output_level != self._head_def['feature_levels'][i]:
          raise ValueError(
              'Mismatch between node level {} and desired output level {}.'
              .format(cell_spec[-num_inputs + i].output_level,
                      self._head_def['feature_levels'][i]))
      # Assert that each block only uses preceding blocks.
      for bi, block_spec in enumerate(cell_spec):
        for inp in block_spec.inputs:
          if inp >= bi + num_inputs:
            raise ValueError(
                'Block {} is trying to access uncreated block {}.'.format(
                    bi, inp))
  def extract_features(self, preprocessed_inputs):
    """Extract features from preprocessed inputs.

    Args:
      preprocessed_inputs: a [batch, height, width, channels] float tensor
        representing a batch of images.

    Returns:
      feature_maps: a list of tensors where the ith tensor has shape
        [batch, height_i, width_i, depth_i]
    """
    preprocessed_inputs = shape_utils.check_min_image_dim(
        33, preprocessed_inputs)
    with tf.variable_scope('MobilenetV2', reuse=self._reuse_weights) as scope:
      with slim.arg_scope(
          mobilenet_v2.training_scope(is_training=None, bn_decay=0.99)), \
          slim.arg_scope(
              [mobilenet.depth_multiplier], min_depth=self._min_depth):
        with slim.arg_scope(
            training_scope(l2_weight_decay=4e-5,
                           is_training=self._is_training)):
          _, image_features = mobilenet_v2.mobilenet_base(
              ops.pad_to_multiple(preprocessed_inputs, self._pad_to_multiple),
              final_endpoint='layer_18',
              depth_multiplier=self._depth_multiplier,
              use_explicit_padding=self._use_explicit_padding,
              scope=scope)

    multiplier_func = functools.partial(
        _apply_multiplier,
        multiplier=self._depth_multiplier,
        min_depth=self._min_depth)
    with tf.variable_scope('MnasFPN', reuse=self._reuse_weights):
      with slim.arg_scope(
          training_scope(l2_weight_decay=1e-4, is_training=self._is_training)):
        # Create C6 by downsampling C5.
        c6 = slim.max_pool2d(
            _maybe_pad(image_features['layer_18'], self._use_explicit_padding),
            [3, 3],
            stride=[2, 2],
            padding='VALID' if self._use_explicit_padding else 'SAME',
            scope='C6_downsample')
        c6 = slim.conv2d(
            c6,
            multiplier_func(self._fpn_layer_depth),
            [1, 1],
            activation_fn=tf.identity,
            normalizer_fn=slim.batch_norm,
            weights_regularizer=None,  # this 1x1 has no kernel regularizer.
            padding='VALID',
            scope='C6_Conv1x1')
        image_features['C6'] = tf.identity(c6)  # Needed for quantization.
        for k in sorted(image_features.keys()):
          tf.logging.error('{}: {}'.format(k, image_features[k]))
        mnasfpn_inputs = [
            image_features['layer_7'],  # C3
            image_features['layer_14'],  # C4
            image_features['layer_18'],  # C5
            image_features['C6']  # C6
        ]
        self._verify_config(mnasfpn_inputs)
        feature_maps = mnasfpn(
            mnasfpn_inputs,
            head_def=self._head_def,
            output_channel=self._fpn_layer_depth,
            use_explicit_padding=self._use_explicit_padding,
            use_native_resize_op=self._use_native_resize_op,
            multiplier_func=multiplier_func)
    return feature_maps
# Lint as: python2, python3
# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for ssd_mobilenet_v2_nas_fpn_feature_extractor."""
import numpy as np
import tensorflow as tf
from tensorflow.contrib import slim as contrib_slim
from object_detection.models import ssd_feature_extractor_test
from object_detection.models import ssd_mobilenet_v2_mnasfpn_feature_extractor as mnasfpn_feature_extractor
slim = contrib_slim
class SsdMobilenetV2MnasFPNFeatureExtractorTest(
    ssd_feature_extractor_test.SsdFeatureExtractorTestBase):

  def _create_feature_extractor(self,
                                depth_multiplier,
                                pad_to_multiple,
                                use_explicit_padding=False):
    min_depth = 16
    is_training = True
    fpn_num_filters = 48
    return mnasfpn_feature_extractor.SSDMobileNetV2MnasFPNFeatureExtractor(
        is_training,
        depth_multiplier,
        min_depth,
        pad_to_multiple,
        self.conv_hyperparams_fn,
        additional_layer_depth=fpn_num_filters,
        use_explicit_padding=use_explicit_padding)

  def test_extract_features_returns_correct_shapes_320_256(self):
    image_height = 320
    image_width = 256
    depth_multiplier = 1.0
    pad_to_multiple = 1
    expected_feature_map_shape = [(2, 40, 32, 48), (2, 20, 16, 48),
                                  (2, 10, 8, 48), (2, 5, 4, 48)]
    self.check_extract_features_returns_correct_shape(
        2, image_height, image_width, depth_multiplier, pad_to_multiple,
        expected_feature_map_shape, use_explicit_padding=False)
    self.check_extract_features_returns_correct_shape(
        2, image_height, image_width, depth_multiplier, pad_to_multiple,
        expected_feature_map_shape, use_explicit_padding=True)

  def test_extract_features_returns_correct_shapes_enforcing_min_depth(self):
    image_height = 256
    image_width = 256
    depth_multiplier = 0.5**12
    pad_to_multiple = 1
    expected_feature_map_shape = [(2, 32, 32, 16), (2, 16, 16, 16),
                                  (2, 8, 8, 16), (2, 4, 4, 16)]
    self.check_extract_features_returns_correct_shape(
        2, image_height, image_width, depth_multiplier, pad_to_multiple,
        expected_feature_map_shape, use_explicit_padding=False)
    self.check_extract_features_returns_correct_shape(
        2, image_height, image_width, depth_multiplier, pad_to_multiple,
        expected_feature_map_shape, use_explicit_padding=True)

  def test_preprocess_returns_correct_value_range(self):
    image_height = 320
    image_width = 320
    depth_multiplier = 1
    pad_to_multiple = 1
    test_image = np.random.rand(2, image_height, image_width, 3)
    feature_extractor = self._create_feature_extractor(depth_multiplier,
                                                       pad_to_multiple)
    preprocessed_image = feature_extractor.preprocess(test_image)
    self.assertTrue(np.all(np.less_equal(np.abs(preprocessed_image), 1.0)))


if __name__ == '__main__':
  tf.test.main()
......@@ -157,7 +157,7 @@ class SSDMobileNetV3FeatureExtractorBase(ssd_meta_arch.SSDFeatureExtractor):
insert_1x1_conv=True,
image_features=image_features)
return feature_maps.values()
return list(feature_maps.values())
class SSDMobileNetV3LargeFeatureExtractor(SSDMobileNetV3FeatureExtractorBase):
......
# Lint as: python2, python3
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -154,7 +155,7 @@ class SSDPNASNetFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
insert_1x1_conv=True,
image_features=image_features)
return feature_maps.values()
return list(feature_maps.values())
def restore_from_classification_checkpoint_fn(self, feature_extractor_scope):
"""Returns a map of variables to load from a foreign checkpoint.
......
# Lint as: python2, python3
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -17,6 +18,11 @@
See https://arxiv.org/abs/1708.02002 for details.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from six.moves import range
import tensorflow as tf
from tensorflow.contrib import slim as contrib_slim
......
# Lint as: python2, python3
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -13,10 +14,14 @@
# limitations under the License.
# ==============================================================================
"""Tests for ssd resnet v1 FPN feature extractors."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import abc
import itertools
from absl.testing import parameterized
import numpy as np
from six.moves import zip
import tensorflow as tf
from object_detection.models import ssd_feature_extractor_test
......@@ -112,8 +117,8 @@ class SSDResnetFPNFeatureExtractorTestBase(
image_tensor = np.random.rand(2, image_height, image_width,
3).astype(np.float32)
feature_maps = self.execute(graph_fn, [image_tensor])
for feature_map, expected_shape in itertools.izip(
feature_maps, expected_feature_map_shape):
for feature_map, expected_shape in zip(feature_maps,
expected_feature_map_shape):
self.assertAllEqual(feature_map.shape, expected_shape)
def test_extract_features_returns_correct_shapes_with_pad_to_multiple(
......
# Lint as: python2, python3
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -15,6 +16,12 @@
"""SSD Keras-based ResnetV1 FPN Feature Extractor."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from six.moves import range
from six.moves import zip
import tensorflow as tf
from object_detection.meta_architectures import ssd_meta_arch
......@@ -121,7 +128,7 @@ class SSDResNetV1FpnKerasFeatureExtractor(
self._resnet_v1_base_model = resnet_v1_base_model
self._resnet_v1_base_model_name = resnet_v1_base_model_name
self._resnet_block_names = ['block1', 'block2', 'block3', 'block4']
self._resnet_v1 = None
self.classification_backbone = None
self._fpn_features_generator = None
self._coarse_feature_layers = []
......@@ -139,7 +146,7 @@ class SSDResNetV1FpnKerasFeatureExtractor(
output_layers = _RESNET_MODEL_OUTPUT_LAYERS[self._resnet_v1_base_model_name]
outputs = [full_resnet_v1_model.get_layer(output_layer_name).output
for output_layer_name in output_layers]
self._resnet_v1 = tf.keras.Model(
self.classification_backbone = tf.keras.Model(
inputs=full_resnet_v1_model.inputs,
outputs=outputs)
# pylint:disable=g-long-lambda
......@@ -214,13 +221,14 @@ class SSDResNetV1FpnKerasFeatureExtractor(
preprocessed_inputs = shape_utils.check_min_image_dim(
129, preprocessed_inputs)
image_features = self._resnet_v1(
image_features = self.classification_backbone(
ops.pad_to_multiple(preprocessed_inputs, self._pad_to_multiple))
feature_block_list = []
for level in range(self._fpn_min_level, self._base_fpn_max_level + 1):
feature_block_list.append('block{}'.format(level - 1))
feature_block_map = dict(zip(self._resnet_block_names, image_features))
feature_block_map = dict(
list(zip(self._resnet_block_names, image_features)))
fpn_input_image_features = [
(feature_block, feature_block_map[feature_block])
for feature_block in feature_block_list]
......@@ -238,6 +246,17 @@ class SSDResNetV1FpnKerasFeatureExtractor(
feature_maps.append(last_feature_map)
return feature_maps
def restore_from_classification_checkpoint_fn(self, feature_extractor_scope):
"""Returns a map for restoring from an (object-based) checkpoint.
Args:
feature_extractor_scope: A scope name for the feature extractor (unused).
Returns:
A dict mapping keys to Keras models
"""
return {'feature_extractor': self.classification_backbone}
class SSDResNet50V1FpnKerasFeatureExtractor(
SSDResNetV1FpnKerasFeatureExtractor):
......
# Lint as: python2, python3
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -162,7 +163,7 @@ class _SSDResnetPpnFeatureExtractor(ssd_meta_arch.SSDFeatureExtractor):
image_features={
'image_features': self._filter_features(activations)['block3']
})
return feature_maps.values()
return list(feature_maps.values())
class SSDResnet50V1PpnFeatureExtractor(_SSDResnetPpnFeatureExtractor):
......
# Lint as: python2, python3
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -14,13 +15,19 @@
# ==============================================================================
"""Convolutional Box Predictors with and without weight sharing."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import functools
from six.moves import range
from six.moves import zip
import tensorflow as tf
from tensorflow.contrib import slim as contrib_slim
from object_detection.core import box_predictor
from object_detection.utils import shape_utils
from object_detection.utils import static_shape
slim = tf.contrib.slim
slim = contrib_slim
BOX_ENCODINGS = box_predictor.BOX_ENCODINGS
CLASS_PREDICTIONS_WITH_BACKGROUND = (
......
# Lint as: python2, python3
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -15,8 +16,14 @@
"""Tests for object_detection.predictors.convolutional_box_predictor."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from absl.testing import parameterized
import numpy as np
from six.moves import range
from six.moves import zip
import tensorflow as tf
from google.protobuf import text_format
......
# Lint as: python2, python3
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -14,8 +15,13 @@
# ==============================================================================
"""Convolutional Box Predictors with and without weight sharing."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import collections
from six.moves import range
import tensorflow as tf
from object_detection.core import box_predictor
......@@ -400,7 +406,7 @@ class WeightSharedConvolutionalBoxPredictor(box_predictor.KerasBoxPredictor):
self._head_scope_conv_layers[tower_name_scope] = conv_layers
return base_tower_layers
for feature_index, input_shape in enumerate(input_shapes):
for feature_index in range(len(input_shapes)):
# Additional projection layers should not be shared as input channels
# (and thus weight shapes) are different
inserted_layer_counter, projection_layers = (
......
......@@ -107,6 +107,7 @@ class MaskRCNNBoxHead(head.Head):
box_encodings = slim.fully_connected(
flattened_roi_pooled_features,
number_of_boxes * self._box_code_size,
reuse=tf.AUTO_REUSE,
activation_fn=None,
scope='BoxEncodingPredictor')
box_encodings = tf.reshape(box_encodings,
......
......@@ -98,6 +98,7 @@ class MaskRCNNClassHead(head.Head):
class_predictions_with_background = slim.fully_connected(
flattened_roi_pooled_features,
self._num_class_slots,
reuse=tf.AUTO_REUSE,
activation_fn=None,
scope=self._scope)
class_predictions_with_background = tf.reshape(
......
# Lint as: python2, python3
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -19,7 +20,12 @@ Contains Mask prediction head classes for different meta architectures.
All the mask prediction heads have a predict function that receives the
`features` as the first argument and returns `mask_predictions`.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import math
from six.moves import range
import tensorflow as tf
from object_detection.predictors.heads import head
......@@ -255,9 +261,9 @@ class MaskRCNNMaskHead(head.KerasHead):
if self._convolve_then_upsample:
# Replace Transposed Convolution with a Nearest Neighbor upsampling step
# followed by 3x3 convolution.
height_scale = self._mask_height / shape_utils.get_dim_as_int(
height_scale = self._mask_height // shape_utils.get_dim_as_int(
input_shapes[1])
width_scale = self._mask_width / shape_utils.get_dim_as_int(
width_scale = self._mask_width // shape_utils.get_dim_as_int(
input_shapes[2])
# pylint: disable=g-long-lambda
self._mask_predictor_layers.append(tf.keras.layers.Lambda(
......
# Lint as: python2, python3
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -22,6 +23,11 @@ Keypoints could be used to represent the human body joint locations as in
Mask RCNN paper. Or they could be used to represent different part locations of
objects.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from six.moves import range
import tensorflow as tf
from tensorflow.contrib import slim as contrib_slim
......
# Lint as: python2, python3
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
......@@ -19,7 +20,12 @@ Contains Mask prediction head classes for different meta architectures.
All the mask prediction heads have a predict function that receives the
`features` as the first argument and returns `mask_predictions`.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import math
from six.moves import range
import tensorflow as tf
from tensorflow.contrib import slim as contrib_slim
......@@ -155,8 +161,8 @@ class MaskRCNNMaskHead(head.Head):
if self._convolve_then_upsample:
# Replace Transposed Convolution with a Nearest Neighbor upsampling step
# followed by 3x3 convolution.
height_scale = self._mask_height / features.shape[1].value
width_scale = self._mask_width / features.shape[2].value
height_scale = self._mask_height // features.shape[1].value
width_scale = self._mask_width // features.shape[2].value
features = ops.nearest_neighbor_upsampling(
features, height_scale=height_scale, width_scale=width_scale)
features = slim.conv2d(
......
......@@ -14,11 +14,11 @@
# ==============================================================================
"""Mask R-CNN Box Predictor."""
import tensorflow as tf
from tensorflow.contrib import slim as contrib_slim
from object_detection.core import box_predictor
slim = tf.contrib.slim
slim = contrib_slim
BOX_ENCODINGS = box_predictor.BOX_ENCODINGS
CLASS_PREDICTIONS_WITH_BACKGROUND = (
......
......@@ -15,10 +15,11 @@
"""RFCN Box Predictor."""
import tensorflow as tf
from tensorflow.contrib import slim as contrib_slim
from object_detection.core import box_predictor
from object_detection.utils import ops
slim = tf.contrib.slim
slim = contrib_slim
BOX_ENCODINGS = box_predictor.BOX_ENCODINGS
CLASS_PREDICTIONS_WITH_BACKGROUND = (
......
......@@ -3,7 +3,7 @@ syntax = "proto2";
package object_detection.protos;
// Message for configuring DetectionModel evaluation jobs (eval.py).
// Next id - 30
// Next id - 33
message EvalConfig {
optional uint32 batch_size = 25 [default = 1];
// Number of visualization images to generate.
......@@ -31,6 +31,10 @@ message EvalConfig {
// Type of metrics to use for evaluation.
repeated string metrics_set = 8;
// Type of metrics to use for evaluation. Unlike `metrics_set` above, this
// field allows configuring evaluation metric through config files.
repeated ParameterizedMetric parameterized_metric = 31;
// Path to export detections to COCO compatible JSON format.
optional string export_path = 9 [default =''];
......@@ -45,7 +49,7 @@ message EvalConfig {
// Whether to evaluate instance masks.
// Note that since there is no evaluation code currently for instance
// segmenation this option is unused.
// segmentation this option is unused.
optional bool eval_instance_masks = 12 [default = false];
// Minimum score threshold for a detected object box to be visualized
......@@ -90,5 +94,59 @@ message EvalConfig {
// When this flag is set, images are not resized during evaluation.
// When this flag is not set (default case), image are resized according
// to the image_resizer config in the model during evaluation.
optional bool force_no_resize = 29 [default=false];
optional bool force_no_resize = 29 [default = false];
// Whether to use a dummy loss in eval so model.loss() is not executed.
optional bool use_dummy_loss_in_eval = 30 [default = false];
// Specifies which keypoints should be connected by an edge, which may improve
// visualization. An example would be human pose estimation where certain
// joints can be connected.
repeated KeypointEdge keypoint_edge = 32;
}
// A message to configure parameterized evaluation metric.
message ParameterizedMetric {
oneof parameterized_metric {
CocoKeypointMetrics coco_keypoint_metrics = 1;
}
}
// A message to evaluate COCO keypoint metrics for a specific class.
message CocoKeypointMetrics {
// Identifies the class of object to which keypoints belong. By default this
// should use the class's "display_name" in the label map.
optional string class_label = 1;
// Keypoint specific standard deviations for COCO keypoint metrics, which
// controls how OKS is computed.
// See http://cocodataset.org/#keypoints-eval for details.
// If your keypoints are similar to the COCO keypoints use the precomputed
// standard deviations below:
// "nose": 0.026
// "left_eye": 0.025
// "right_eye": 0.025
// "left_ear": 0.035
// "right_ear": 0.035
// "left_shoulder": 0.079
// "right_shoulder": 0.079
// "left_elbow": 0.072
// "right_elbow": 0.072
// "left_wrist": 0.062
// "right_wrist": 0.062
// "left_hip": 0.107
// "right_hip": 0.107
// "left_knee": 0.087
// "right_knee": 0.087
// "left_ankle": 0.089
// "right_ankle": 0.089
map<string, float> keypoint_label_to_sigmas = 2;
}
// Defines an edge that should be drawn between two keypoints.
message KeypointEdge {
// Index of the keypoint where the edge starts from. Index starts at 0.
optional int32 start = 1;
// Index of the keypoint where the edge ends. Index starts at 0.
optional int32 end = 2;
}
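A hedged usage example of the new EvalConfig fields above (the class label, sigma, and edge indices are illustrative, not mandated by the proto):

```python
from google.protobuf import text_format
from object_detection.protos import eval_pb2

eval_config = text_format.Parse("""
  parameterized_metric {
    coco_keypoint_metrics {
      class_label: "person"
      keypoint_label_to_sigmas { key: "nose" value: 0.026 }
    }
  }
  keypoint_edge { start: 0 end: 1 }
""", eval_pb2.EvalConfig())
```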
......@@ -18,9 +18,8 @@ import "object_detection/protos/post_processing.proto";
// `first_stage_` and `second_stage_` to indicate the stage to which each
// parameter pertains when relevant.
message FasterRcnn {
// Whether to construct only the Region Proposal Network (RPN).
optional int32 number_of_stages = 1 [default=2];
optional int32 number_of_stages = 1 [default = 2];
// Number of classes to predict.
optional int32 num_classes = 3;
......@@ -31,7 +30,6 @@ message FasterRcnn {
// Feature extractor config.
optional FasterRcnnFeatureExtractor feature_extractor = 5;
// (First stage) region proposal network (RPN) parameters.
// Anchor generator to compute RPN anchors.
......@@ -39,40 +37,39 @@ message FasterRcnn {
// Atrous rate for the convolution op applied to the
// `first_stage_features_to_crop` tensor to obtain box predictions.
optional int32 first_stage_atrous_rate = 7 [default=1];
optional int32 first_stage_atrous_rate = 7 [default = 1];
// Hyperparameters for the convolutional RPN box predictor.
optional Hyperparams first_stage_box_predictor_conv_hyperparams = 8;
// Kernel size to use for the convolution op just prior to RPN box
// predictions.
optional int32 first_stage_box_predictor_kernel_size = 9 [default=3];
optional int32 first_stage_box_predictor_kernel_size = 9 [default = 3];
// Output depth for the convolution op just prior to RPN box predictions.
optional int32 first_stage_box_predictor_depth = 10 [default=512];
optional int32 first_stage_box_predictor_depth = 10 [default = 512];
// The batch size to use for computing the first stage objectness and
// location losses.
optional int32 first_stage_minibatch_size = 11 [default=256];
optional int32 first_stage_minibatch_size = 11 [default = 256];
// Fraction of positive examples per image for the RPN.
optional float first_stage_positive_balance_fraction = 12 [default=0.5];
optional float first_stage_positive_balance_fraction = 12 [default = 0.5];
// Non max suppression score threshold applied to first stage RPN proposals.
optional float first_stage_nms_score_threshold = 13 [default=0.0];
optional float first_stage_nms_score_threshold = 13 [default = 0.0];
// Non max suppression IOU threshold applied to first stage RPN proposals.
optional float first_stage_nms_iou_threshold = 14 [default=0.7];
optional float first_stage_nms_iou_threshold = 14 [default = 0.7];
// Maximum number of RPN proposals retained after first stage postprocessing.
optional int32 first_stage_max_proposals = 15 [default=300];
optional int32 first_stage_max_proposals = 15 [default = 300];
// First stage RPN localization loss weight.
optional float first_stage_localization_loss_weight = 16 [default=1.0];
optional float first_stage_localization_loss_weight = 16 [default = 1.0];
// First stage RPN objectness loss weight.
optional float first_stage_objectness_loss_weight = 17 [default=1.0];
optional float first_stage_objectness_loss_weight = 17 [default = 1.0];
// Per-region cropping parameters.
// Note that if a R-FCN model is constructed the per region cropping
......@@ -89,7 +86,6 @@ message FasterRcnn {
// Stride of the max pool op on the cropped feature map during ROI pooling.
optional int32 maxpool_stride = 20;
// (Second stage) box classifier parameters
// Hyperparameters for the second stage box predictor. If box predictor type
......@@ -100,10 +96,10 @@ message FasterRcnn {
// The batch size per image used for computing the classification and refined
// location loss of the box classifier.
// Note that this field is ignored if `hard_example_miner` is configured.
optional int32 second_stage_batch_size = 22 [default=64];
optional int32 second_stage_batch_size = 22 [default = 64];
// Fraction of positive examples to use per image for the box classifier.
optional float second_stage_balance_fraction = 23 [default=0.25];
optional float second_stage_balance_fraction = 23 [default = 0.25];
// Post processing to apply on the second stage box classifier predictions.
// Note: the `score_converter` provided to the FasterRCNNMetaArch constructor
......@@ -111,15 +107,15 @@ message FasterRcnn {
optional PostProcessing second_stage_post_processing = 24;
// Second stage refined localization loss weight.
optional float second_stage_localization_loss_weight = 25 [default=1.0];
optional float second_stage_localization_loss_weight = 25 [default = 1.0];
// Second stage classification loss weight
optional float second_stage_classification_loss_weight = 26 [default=1.0];
optional float second_stage_classification_loss_weight = 26 [default = 1.0];
// Second stage instance mask loss weight. Note that this is only applicable
// when `MaskRCNNBoxPredictor` is selected for second stage and configured to
// predict instance masks.
optional float second_stage_mask_prediction_loss_weight = 27 [default=1.0];
optional float second_stage_mask_prediction_loss_weight = 27 [default = 1.0];
// If not left to default, applies hard example mining only to classification
// and localization loss..
......@@ -178,6 +174,30 @@ message FasterRcnn {
// Whether to use tf.image.combined_non_max_suppression.
optional bool use_combined_nms_in_first_stage = 40 [default = false];
// Whether to output final box feature. If true, it will crop the feature map
// in the postprocess() method based on the final predictions.
optional bool output_final_box_features = 42 [default = false];
// Configs for context model.
optional Context context_config = 41;
}
message Context {
// Configuration proto for Context .
// Next id: 4
// The maximum number of contextual features per-image, used for padding
optional int32 max_num_context_features = 1 [default = 8500];
// The bottleneck feature dimension of the attention block.
optional int32 attention_bottleneck_dimension = 2 [default = 2048];
// The attention temperature.
optional float attention_temperature = 3 [default = 0.01];
// The context feature length.
optional int32 context_feature_length = 4 [default = 2057];
}
message FasterRcnnFeatureExtractor {
......@@ -186,10 +206,10 @@ message FasterRcnnFeatureExtractor {
optional string type = 1;
// Output stride of extracted RPN feature map.
optional int32 first_stage_features_stride = 2 [default=16];
optional int32 first_stage_features_stride = 2 [default = 16];
// Whether to update batch norm parameters during training or not.
// When training with a relative large batch size (e.g. 8), it could be
// desirable to enable batch norm update.
optional bool batch_norm_trainable = 3 [default=false];
optional bool batch_norm_trainable = 3 [default = false];
}