Commits · c41aedff097bf5d4c936e42bd5743a63e30c7426 · ModelZoo / ResNet50_tensorflow

17 Jul, 2020 1 commit
- moving fpn message to fpn.proto · c41aedff
  syiming authored Jul 13, 2020
  
17 Jun, 2020 1 commit
- Refactor tests for Object Detection API. (#8688) · 420a7253
  pkulzc authored Jun 17, 2020
```
Internal changes

--

PiperOrigin-RevId: 316837667
```
12 May, 2020 1 commit

Open source MnasFPN and minor fixes to OD API (#8484) · 8518d053

pkulzc authored May 12, 2020

310447280  by lzc:

    Internal change

310420845  by Zhichao Lu:

    Open source the internal Context RCNN code.

--
310362339  by Zhichao Lu:

    Internal change

310259448  by lzc:

    Update required TF version for OD API.

--
310252159  by Zhichao Lu:

    Port patch_ops_test to TF1/TF2 as TPUs.

--
310247180  by Zhichao Lu:

    Ignore keypoint heatmap loss in the regions/bounding boxes with target keypoint
    class but no valid keypoint annotations.

--
310178294  by Zhichao Lu:

    Opensource MnasFPN
    https://arxiv.org/abs/1912.01106

--
310094222  by lzc:

    Internal changes.

--
310085250  by lzc:

    Internal Change.

--
310016447  by huizhongc:

    Remove unrecognized classes from labeled_classes.

--
310009470  by rathodv:

    Mark batcher.py as TF1 only.

--
310001984  by rathodv:

    Update core/preprocessor.py to be compatible with TF1/TF2.

--
309455035  by Zhi...


17 Oct, 2019 1 commit

Release MobileNet V3 models and SSDLite models with MobileNet V3 backbone. (#7678) · 0ba83cf0

pkulzc authored Oct 17, 2019

* Merged commit includes the following changes:
275131829  by Sergio Guadarrama:

    Updates mobilenet/README.md to be GitHub-compatible, adds a V2+ reference to the mobilenet_v1.md file, and fixes invalid markdown.

--
274908068  by Sergio Guadarrama:

    Opensource MobilenetV3 detection models.

--
274697808  by Sergio Guadarrama:

    Fixed cases where tf.TensorShape was constructed with float dimensions

    This is a prerequisite for making TensorShape and Dimension more strict
    about the types of their arguments.

--
273577462  by Sergio Guadarrama:

    Fixing `conv_defs['defaults']` override issue.

--
272801298  by Sergio Guadarrama:

    Adds links to trained models for Mobilenet V3 and adds a minimalistic version of mobilenet-v3 to the definitions.

--
268928503  by Sergio Guadarrama:

    Mobilenet v2 with group normalization.

--
263492735  by Sergio Guadarrama:

    Internal change

260037126  by Sergio Guadarrama:

    Adds an option of using a custom depthwise operation in `expanded_conv`.

--
259997001  by Sergio Guadarrama:

    Explicitly mark Python binaries/tests with python_version = "PY2".

--
252697685  by Sergio Guadarrama:

    Internal change

251918746  by Sergio Guadarrama:

    Internal change

251909704  by Sergio Guadarrama:

    Mobilenet V3 backbone implementation.

--
247510236  by Sergio Guadarrama:

    Internal change

246196802  by Sergio Guadarrama:

    Internal change

246014539  by Sergio Guadarrama:

    Internal change

245891435  by Sergio Guadarrama:

    Internal change

245834925  by Sergio Guadarrama:

    n/a

--

PiperOrigin-RevId: 275131829

* Merged commit includes the following changes:
274959989  by Zhichao Lu:

    Update detection model zoo with MobilenetV3 SSD candidates.

--
274908068  by Zhichao Lu:

    Opensource MobilenetV3 detection models.

--
274695889  by richardmunoz:

    RandomPatchGaussian preprocessing step

    This step can be used during model training to randomly apply gaussian noise to a random image patch. Example addition to an Object Detection API pipeline config:

    train_config {
      ...
      data_augmentation_options {
        random_patch_gaussian {
          random_coef: 0.5
          min_patch_size: 1
          max_patch_size: 250
          min_gaussian_stddev: 0.0
          max_gaussian_stddev: 1.0
        }
      }
      ...
    }
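A minimal sketch of what the step described above might do, written in plain Python rather than TensorFlow. The function name, the list-of-rows image representation, the square patch, and the clipping to [0, 1] are illustrative assumptions, as is reading `random_coef` as the chance of keeping the original image; this is not the Object Detection API implementation.

```python
import random


def random_patch_gaussian(image, min_patch_size=1, max_patch_size=250,
                          min_gaussian_stddev=0.0, max_gaussian_stddev=1.0,
                          random_coef=0.5, rng=None):
    """Adds Gaussian noise to one random square patch of a 2-D image.

    `image` is a list of rows of floats in [0, 1]; a new image is returned.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    # Assumption: random_coef is the chance of returning the image unchanged.
    if rng.random() < random_coef:
        return out
    # Pick a patch size, then shrink it if the image is smaller than the patch.
    size = min(rng.randint(min_patch_size, max_patch_size), height, width)
    top = rng.randint(0, height - size)
    left = rng.randint(0, width - size)
    stddev = rng.uniform(min_gaussian_stddev, max_gaussian_stddev)
    for y in range(top, top + size):
        for x in range(left, left + size):
            noisy = out[y][x] + rng.gauss(0.0, stddev)
            out[y][x] = min(1.0, max(0.0, noisy))  # keep values in [0, 1]
    return out
```

Only pixels inside the sampled patch change; the rest of the image is copied through untouched.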

--
274257872  by lzc:

    Internal change.

--
274114689  by Zhichao Lu:

    Pass native_resize flag to other FPN variants.

--
274112308  by lzc:

    Internal change.

--
274090763  by richardmunoz:

    Util function for getting a patch mask on an image for use with the Object Detection API

--
274069806  by Zhichao Lu:

    Adding functions which will help compute predictions and losses for CenterNet.

--
273860828  by lzc:

    Internal change.

--
273380069  by richardmunoz:

    RandomImageDownscaleToTargetPixels preprocessing step

    This step can be used during model training to randomly downscale an image to a random target number of pixels. If the image does not contain more than the target number of pixels, then downscaling is skipped. Example addition to an Object Detection API pipeline config:

    train_config {
      ...
      data_augmentation_options {
        random_downscale_to_target_pixels {
          random_coef: 0.5
          min_target_pixels: 300000
          max_target_pixels: 500000
        }
      }
      ...
    }
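The size computation behind this step can be sketched as follows. This is an illustration under stated assumptions, not the library code: the function name is hypothetical, the aspect ratio is assumed to be preserved, and the `random_coef` gate is left out for brevity.

```python
import math
import random


def downscaled_size(height, width, min_target_pixels=300000,
                    max_target_pixels=500000, rng=None):
    """Returns the (height, width) an image would be downscaled to."""
    rng = rng or random.Random()
    target = rng.randint(min_target_pixels, max_target_pixels)
    # Skip downscaling when the image is already at or below the target.
    if height * width <= target:
        return height, width
    # Scale both sides equally so the area shrinks to roughly `target`.
    scale = math.sqrt(target / (height * width))
    return max(1, int(height * scale)), max(1, int(width * scale))
```

Rounding down keeps the resulting pixel count at or below the sampled target.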

--
272987602  by Zhichao Lu:

    Avoid -inf when empty box list is passed.

--
272525836  by Zhichao Lu:

    Cleanup repeated resizing code in meta archs.

--
272458667  by richardmunoz:

    RandomJpegQuality preprocessing step

    This step can be used during model training to randomly encode the image into a jpeg with a random quality level. Example addition to an Object Detection API pipeline config:

    train_config {
      ...
      data_augmentation_options {
        random_jpeg_quality {
          random_coef: 0.5
          min_jpeg_quality: 80
          max_jpeg_quality: 100
        }
      }
      ...
    }

--
271412717  by Zhichao Lu:

    Enables TPU training with the V2 eager + tf.function Object Detection training loops.

--
270744153  by Zhichao Lu:

    Adding the offset and size target assigners for CenterNet.

--
269916081  by Zhichao Lu:

    Include basic installation in Object Detection API tutorial.
    Also:
     - Use TF2.0
     - Use saved_model

--
269376056  by Zhichao Lu:

    Fix to variable loading in RetinaNet w/ custom loops. (makes the code rely on the exact name scopes that are generated a little bit less)

--
269256251  by lzc:

    Add use_partitioned_nms field to config and update post_processing_builder to honor that flag when building nms function.

--
268865295  by Zhichao Lu:

    Adding functionality for importing and merging back internal state of the metric.

--
268640984  by Zhichao Lu:

    Fix computation of gaussian sigma value to create CenterNet heatmap target.

--
267475576  by Zhichao Lu:

    Fix for exporter trying to export non-existent exponential moving averages.

--
267286768  by Zhichao Lu:

    Update mixed-precision policy.

--
266166879  by Zhichao Lu:

    Internal change

265860884  by Zhichao Lu:

    Apply floor function to center coordinates when creating heatmap for CenterNet target.

--
265702749  by Zhichao Lu:

    Internal change

--
264241949  by ronnyvotel:

    Updating Faster R-CNN 'final_anchors' to be in normalized coordinates.

--
264175192  by lzc:

    Update model_fn to only read hparams if it is not None.

--
264159328  by Zhichao Lu:

    Modify nearest neighbor upsampling to eliminate a multiply operation. For quantized models, the multiply operation gets unnecessarily quantized and reduces accuracy (simple stacking would work in place of the broadcast op which doesn't require quantization). Also removes an unnecessary reshape op.

--
263668306  by Zhichao Lu:

    Add the option to use dynamic map_fn for batch NMS

--
263031163  by Zhichao Lu:

    Mark outside compilation for NMS as optional.

--
263024916  by Zhichao Lu:

    Add an ExperimentalModel meta arch for experimenting with new model types.

--
262655894  by Zhichao Lu:

    Add the center heatmap target assigner for CenterNet

--
262431036  by Zhichao Lu:

    Adding add_eval_dict to allow for evaluation on model_v2

--
262035351  by ronnyvotel:

    Removing any non-Tensor predictions from the third stage of Mask R-CNN.

--
261953416  by Zhichao Lu:

    Internal change.

--
261834966  by Zhichao Lu:

    Fix the NMS OOM issue on TPU by forcing NMS to run outside of TPU.

--
261775941  by Zhichao Lu:

    Make Keras InputLayer compatible with both TF 1.x and TF 2.0.

--
261775633  by Zhichao Lu:

    Visualize additional channels with ground-truth bounding boxes.

--
261768117  by lzc:

    Internal change.

--
261766773  by ronnyvotel:

    Exposing `return_raw_detections_during_predict` in Faster R-CNN Proto.

--
260975089  by ronnyvotel:

    Moving calculation of batched prediction tensor names after all tensors in prediction dictionary are created.

--
259816913  by ronnyvotel:

    Adding raw detection boxes and feature map indices to SSD

--
259791955  by Zhichao Lu:

    Added a flag to control the use of partitioned_non_max_suppression.

--
259580475  by Zhichao Lu:

    Tweak quantization-aware training re-writer to support NasFpn model architecture.

--
259579943  by rathodv:

    Add a meta target assigner proto and builders in OD API.

--
259577741  by Zhichao Lu:

    Internal change.

--
259366315  by lzc:

    Internal change.

--
259344310  by ronnyvotel:

    Updating faster rcnn so that raw_detection_boxes from predict() are in normalized coordinates.

--
259338670  by Zhichao Lu:

    Add support for use_native_resize_op to more feature extractors. Use dynamic shapes when static shapes are not available.

--
259083543  by ronnyvotel:

    Updating/fixing documentation.

--
259078937  by rathodv:

    Add prediction fields for tensors returned from detection_model.predict.

--
259044601  by Zhichao Lu:

    Add protocol buffer and builders for temperature scaling calibration.

--
259036770  by lzc:

    Internal changes.

--
259006223  by ronnyvotel:

    Adding detection anchor indices to Faster R-CNN Config. This is useful when one wishes to associate final detections and the anchors (or pre-nms boxes) from which they originated.

--
258872501  by Zhichao Lu:

    Run the training pipeline of ssd + resnet_v1_50 + fpn with a checkpoint.

--
258840686  by ronnyvotel:

    Adding standard outputs to DetectionModel.predict(). This CL only updates Faster R-CNN. Other meta architectures will be updated in future CLs.

--
258672969  by lzc:

    Internal change.

--
258649494  by lzc:

    Internal changes.

--
258630321  by ronnyvotel:

    Fixing documentation in shape_utils.flatten_dimensions().

--
258468145  by Zhichao Lu:

    Add additional output tensors parameter to Postprocess op.

--
258099219  by Zhichao Lu:

    Internal changes

--

PiperOrigin-RevId: 274959989


09 Oct, 2019 1 commit

Add Combined NMS (#6138) · 3980d2a1

Pooya Davoodi authored Oct 09, 2019

* Updating python API to use CombinedNonMaxSuppression TF operator

1. Adds a unit test to test post_processing python API
2. Currently sets clip_window to None as the kernel uses the default
   clip_window of [0,0,1,1]
3. Added use_static_shapes to the API. In old API if
   use_static_shapes is true, then it pads/clips outputs to max_total_size, if
   specified. If not specified, it pads to num_classes*max_size_per_class.
   If use_static_shapes is false, it always pads/clips to max_total_size.
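The padding rule in item 3 can be written as a small function. The name `nms_output_size` is hypothetical; the sketch only restates the rule given in the message.

```python
def nms_output_size(use_static_shapes, num_classes, max_size_per_class,
                    max_total_size=None):
    """Returns the number of boxes the old API pads or clips its output to."""
    if use_static_shapes and max_total_size is None:
        # No explicit cap: every class may contribute its full quota.
        return num_classes * max_size_per_class
    # In every other case the explicit cap decides the output size.
    return max_total_size
```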

Update unit test to account for clipped bounding boxes

Changed the name to CombinedNonMaxSuppression based on feedback from Google

Added additional parameters to combinedNMS python function. They are currently
unused and required for networks like FasterRCNN and MaskRCNN

* Delete selected_indices from API

Because it was removed from CombinedNMS recently in the PR.

* Improve doc of function combined_non_max_suppression

* Enable CombinedNonMaxSuppression for first_stage_nms

* fix bug

* Ensure agnostic_nms is not used with combined_nms

Remove redundant arguments from combined_nms

* Fix pylint

* Add checks for unsupported args

* Fix pylint

* Move combined_non_max_suppression to batch_multiclass_non_max_suppression

Also rename combined_nms to use_combined_nms

* Delete combined_nms for first_stage_nms because it does not work

* Revert "Delete combined_nms for first_stage_nms because it does not work"

This reverts commit 2a3cc5145f17cee630a67ddedd20e90c2920fa9f.

* Use nmsed_additional_fields.get to avoid error

* Merge combined_non_max_suppression with main nms function

* Rename combined_nms for first stage nms

* Improve docs

* Use assertListEqual for numpy arrays

* Fix pylint errors

* End comments with period


02 Nov, 2018 1 commit

Minor fixes for object detection (#5613) · 31ae57eb

pkulzc authored Nov 02, 2018

* Internal change.

PiperOrigin-RevId: 213914693

* Add original_image_spatial_shape tensor in input dictionary to store shape of the original input image

PiperOrigin-RevId: 214018767

* Remove "groundtruth_confidences" from decoders; use "groundtruth_weights" to indicate label confidence.

This also solves a bug that only surfaced now - random crop routines in core/preprocessor.py did not correctly handle "groundtruth_weight" tensors returned by the decoders.

PiperOrigin-RevId: 214091843

* Update CocoMaskEvaluator to allow for a batch of image info, rather than a single image.

PiperOrigin-RevId: 214295305

* Adding the option to be able to summarize gradients.

PiperOrigin-RevId: 214310875

* Adds FasterRCNN inference on CPU

1. Adds a flag use_static_shapes_for_eval to restrict to the ops that guarantee static shapes.
2. No filtering of overlapping anchors while clipping the anchors when use_static_shapes_for_eval is set to True.
3. Adds test for faster_rcnn_meta_arch for predict and postprocess in inference mode for first and second stages.

PiperOrigin-RevId: 214329565

* Fix model_lib eval_spec_names assignment (integer->string).

PiperOrigin-RevId: 214335461

* Refactor Mask HEAD to optionally upsample after applying convolutions on ROI crops.

PiperOrigin-RevId: 214338440

* Uses final_exporter_name as exporter_name for the first eval spec for backward compatibility.

PiperOrigin-RevId: 214522032

* Add reshaped `mask_predictions` tensor to the prediction dictionary in `_predict_third_stage` method to allow computing mask loss in eval job.

PiperOrigin-RevId: 214620716

* Add support for fully conv training to fpn.

PiperOrigin-RevId: 214626274

* Fix the preprocess() function in Resnet v1 to make it work for any number of input channels.

Note: If the #channels != 3, this will simply skip the mean subtraction in preprocess() function.
PiperOrigin-RevId: 214635428
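A rough per-pixel illustration of the behavior in the note above. The function name and the mean values are assumptions for the sketch, not code taken from the feature extractor.

```python
# Assumption: per-channel RGB means commonly paired with ResNet v1 checkpoints.
CHANNEL_MEANS = (123.68, 116.779, 103.939)


def preprocess_pixel(pixel):
    """Subtracts channel means from an RGB pixel; other depths pass through."""
    if len(pixel) != len(CHANNEL_MEANS):
        # Not a 3-channel input: skip mean subtraction entirely.
        return list(pixel)
    return [value - mean for value, mean in zip(pixel, CHANNEL_MEANS)]
```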

* Wrap result_dict_for_single_example in eval_util to run for batched examples.

PiperOrigin-RevId: 214678514

* Adds PNASNet-based (ImageNet model) feature extractor for SSD.

PiperOrigin-RevId: 214988331

* Update documentation

PiperOrigin-RevId: 215243502

* Correct index used to compute number of groundtruth/detection boxes in COCOMaskEvaluator.

Due to incorrect indexing in cl/214295305, only the first detection mask and first groundtruth mask for a given image are fed to the COCO Mask evaluation library. Since groundtruth masks are arranged in no particular order, the first and highest scoring detection mask (detection masks are ordered by score) won't match the first and only groundtruth retained in all cases. This is, I think, why mask evaluation metrics do not get better than ~11 mAP. Note that this code path is only active when using the model_main.py binary for evaluation.

This change fixes the indices and modifies an existing test case to cover it.

PiperOrigin-RevId: 215275936

* Fixing grayscale_image_resizer to accept mask as input.

PiperOrigin-RevId: 215345836

* Add an option not to clip groundtruth boxes during preprocessing. Clipping boxes adversely affects training for partially occluded or large objects, especially for fully conv models. Clipping already occurs during postprocessing, and should not occur during training.

PiperOrigin-RevId: 215613379

* Always return recalls and precisions with length equal to the number of classes.

The previous behavior of ObjectDetectionEvaluation was somewhat dangerous: when no groundtruth boxes were present, the lists of per-class precisions and recalls were simply truncated. Unless you were aware of this phenomenon (and consulted the `num_gt_instances_per_class` vector) it was difficult to associate each metric with each class.
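One way to keep per-class metrics aligned with class ids is to emit a placeholder for classes without groundtruth instead of dropping them. The sketch below assumes NaN as that placeholder and uses hypothetical names; the message does not say which value the evaluator actually uses.

```python
def per_class_recall(num_classes, true_positives, num_gt_per_class):
    """Returns one recall per class id, in class order.

    `true_positives` and `num_gt_per_class` map class id -> count.
    """
    recalls = []
    for class_id in range(num_classes):
        num_gt = num_gt_per_class.get(class_id, 0)
        if num_gt == 0:
            # Keep the slot so that index i always refers to class i.
            recalls.append(float("nan"))
        else:
            recalls.append(true_positives.get(class_id, 0) / num_gt)
    return recalls
```

With a fixed-length result, `recalls[i]` can be read directly as the metric for class `i`.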

PiperOrigin-RevId: 215633711

* Expose the box feature node in SSD.

PiperOrigin-RevId: 215653316

* Fix ssd mobilenet v2 _CONV_DEFS overwriting issue.

PiperOrigin-RevId: 215654160

* More documentation updates

PiperOrigin-RevId: 215656580

* Add pooling + residual option in multi_resolution_feature_maps. It adds an average pooling and a residual layer between feature maps with matching depth. Designed to be used with WeightSharedBoxPredictor.

PiperOrigin-RevId: 215665619

* Only call create_modified_mobilenet_config on init if use_depthwise is true.

PiperOrigin-RevId: 215784290

* Only call create_modified_mobilenet_config on init if use_depthwise is true.

PiperOrigin-RevId: 215837524

* Don't prune keypoints if clip_boxes is false.

PiperOrigin-RevId: 216187642

* Makes sure "key" field exists in the result dictionary.

PiperOrigin-RevId: 216456543

* Add add_background_class parameter to allow disabling the inclusion of a background class.

PiperOrigin-RevId: 216567612

* Update expected_classification_loss_under_sampling to better account for expected sampling.

PiperOrigin-RevId: 216712287

* Let the evaluation receive an evaluation class in its constructor.

PiperOrigin-RevId: 216769374

* This CL adds model building & training support for end-to-end Keras-based SSD models. If a Keras feature extractor's name is specified in the model config (e.g. 'ssd_mobilenet_v2_keras'), the model will use that feature extractor and a corresponding Keras-based box predictor.

This CL makes sure regularization losses & batch norm updates work correctly when training models that have Keras-based components. It also updates the default hyperparameter settings of the keras-based mobilenetV2 (when not overriding hyperparams) to more closely match the legacy Slim training scope.

PiperOrigin-RevId: 216938707

* Adding the ability in the coco evaluator to indicate whether an image has been annotated. For a non-annotated image, detections and groundtruth are not supplied.

PiperOrigin-RevId: 217316342

* Release the 8k minival dataset ids for MSCOCO, used in Huang et al. "Speed/accuracy trade-offs for modern convolutional object detectors" (https://arxiv.org/abs/1611.10012)

PiperOrigin-RevId: 217549353

* Exposes weighted_sigmoid_focal loss for faster rcnn classifier

PiperOrigin-RevId: 217601740

* Add detection_features to output nodes. The shape of the feature is [batch_size, max_detections, depth].

PiperOrigin-RevId: 217629905

* FPN uses a custom NN resize op for TPU-compatibility. Replace this op with the Tensorflow version at export time for TFLite-compatibility.

PiperOrigin-RevId: 217721184

* Compute `num_groundtruth_boxes` in inputs.transform_input_data_fn after data augmentation instead of decoders.

PiperOrigin-RevId: 217733432

* 1. Stop gradients from flowing into groundtruth masks with zero paddings.
2. Normalize pixelwise cross entropy loss across the whole batch.

PiperOrigin-RevId: 217735114

* Optimize Input pipeline for Mask R-CNN on TPU with bfloat16: improve the step time from:
1663.6 ms -> 1184.2 ms, about 28.8% improvement.

PiperOrigin-RevId: 217748833

* Fixes to export a TPU compatible model

Adds nodes to each of the output tensors. Also increments the value of class labels by 1.

PiperOrigin-RevId: 217856760

* API changes:
 - change the interface of target assigner to return per-class weights.
 - change the interface of classification loss to take per-class weights.

PiperOrigin-RevId: 217968393

* Add an option to override pipeline config in export_saved_model using command line arg

PiperOrigin-RevId: 218429292

* Include Quantized trained MobileNet V2 SSD and FaceSsd in model zoo.

PiperOrigin-RevId: 218530947

* Write final config to disk in `train` mode only.

PiperOrigin-RevId: 218735512


21 Sep, 2018 1 commit

Release iNaturalist Species-trained models, refactor of evaluation, box... · 99256cf4

pkulzc authored Sep 21, 2018

Release iNaturalist Species-trained models, refactor of evaluation, box predictor for object detection. (#5289)

* Merged commit includes the following changes:
212389173  by Zhichao Lu:

    1. Replace tf.boolean_mask with tf.where

--
212282646  by Zhichao Lu:

    1. Fix a typo in model_builder.py and add a test to cover it.

--
212142989  by Zhichao Lu:

    Only resize masks in meta architecture if they have not already been resized in the input pipeline.

--
212136935  by Zhichao Lu:

    Choose matmul or native crop_and_resize in the model builder instead of faster r-cnn meta architecture.

--
211907984  by Zhichao Lu:

    Make eval input reader a repeated field and update config util to handle this field.

--
211858098  by Zhichao Lu:

    Change the implementation of merge_boxes_with_multiple_labels.

--
211843915  by Zhichao Lu:

    Add Mobilenet v2 + FPN support.

--
211655076  by Zhichao Lu:

    Bug fix for generic keys in config overrides

    In generic configuration overrides, we had a duplicate entry for train_input_config and we were missing the eval_input_config and eval_config.

    This change also introduces testing for all config overrides.

--
211157501  by Zhichao Lu:

    Make the locally-modified conv defs a copy.

    So that it doesn't modify MobileNet conv defs globally for other code that
    transitively imports this package.

--
211112813  by Zhichao Lu:

    Refactoring visualization tools for Estimator's eval_metric_ops. This will make it easier for future models to take advantage of a single interface and mechanics.

--
211109571  by Zhichao Lu:

    A test decorator.

--
210747685  by Zhichao Lu:

    For FPN, when use_depthwise is set to true, use slightly modified mobilenet v1 config.

--
210723882  by Zhichao Lu:

    Integrating the losses mask into the meta architectures. When providing groundtruth, one can optionally specify annotation information (i.e. which images are labeled vs. unlabeled). For any image that is unlabeled, there is no loss accumulation.

--
210673675  by Zhichao Lu:

    Internal change.

--
210546590  by Zhichao Lu:

    Internal change.

--
210529752  by Zhichao Lu:

    Support batched inputs with ops.matmul_crop_and_resize.

    With this change the new inputs are images of shape [batch, height, width, depth] and boxes of shape [batch, num_boxes, 4]. The output tensor is of the shape [batch, num_boxes, crop_height, crop_width, depth].

--
210485912  by Zhichao Lu:

    Fix TensorFlow version check in object_detection_tutorial.ipynb

--
210484076  by Zhichao Lu:

    Reduce TPU memory required for single image matmul_crop_and_resize.

    Using tf.einsum eliminates intermediate tensors, tiling and expansion. For an image of size [40, 40, 1024] and boxes of shape [300, 4], HBM memory usage goes down from 3.52G to 1.67G.

--
210468361  by Zhichao Lu:

    Remove PositiveAnchorLossCDF/NegativeAnchorLossCDF to resolve "Main thread is not in main loop error" issue in local training.

--
210100253  by Zhichao Lu:

    Pooling pyramid feature maps: add option to replace max pool with convolution layers.

--
209995842  by Zhichao Lu:

    Fix a bug which prevents variable sharing in Faster RCNN.

--
209965526  by Zhichao Lu:

    Add support for enabling export_to_tpu through the estimator.

--
209946440  by Zhichao Lu:

    Replace deprecated tf.train.Supervisor with tf.train.MonitoredSession. MonitoredSession also takes away the hassle of starting queue runners.

--
209888003  by Zhichao Lu:

    Implement function to handle data where source_id is not set.

    If the field source_id is found to be the empty string for any image during runtime, it will be replaced with a random string. This avoids hash collisions on datasets where many examples do not have source_id set. Those hash collisions have unintended side effects and may lead to bugs in the detection pipeline.
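A minimal sketch of the idea, assuming a UUID is an acceptable stand-in for the "random string"; the helper name is hypothetical and the actual implementation may generate the string differently.

```python
import uuid


def fill_missing_source_id(source_id):
    """Returns source_id, or a fresh random string when it is empty."""
    if source_id:
        return source_id
    # Every empty id would otherwise hash to the same value.
    return uuid.uuid4().hex
```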

--
209842134  by Zhichao Lu:

    Converting loss mask into multiplier, rather than using it as a boolean mask (which changes tensor shape). This is necessary, since other utilities (e.g. hard example miner) require a loss matrix with the same dimensions as the original prediction tensor.
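The shape argument can be seen in a small plain-Python sketch. Both function names are hypothetical and nested lists stand in for tensors.

```python
def multiply_loss_mask(losses, is_annotated):
    """Zeroes the losses of unannotated images; the shape is unchanged.

    `losses` is [batch][num_anchors]; `is_annotated` is [batch] of bools.
    """
    return [[loss * (1.0 if keep else 0.0) for loss in row]
            for row, keep in zip(losses, is_annotated)]


def boolean_loss_mask(losses, is_annotated):
    """Drops the rows of unannotated images; the batch dimension shrinks."""
    return [row for row, keep in zip(losses, is_annotated) if keep]
```

The multiplier form keeps one row per image, so downstream code that indexes by batch position still lines up with the prediction tensor.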

--
209768066  by Zhichao Lu:

    Adding ability to remove loss computation from specific images in a batch, via an optional boolean mask.

--
209722556  by Zhichao Lu:

    Remove dead code.

    (_USE_C_API was flipped to True by default in TensorFlow 1.8)

--
209701861  by Zhichao Lu:

    This CL cleans-up some tf.Example creation snippets, by reusing the convenient tf.train.Feature building functions in dataset_util.

--
209697893  by Zhichao Lu:

    Do not overwrite num_epoch for eval input. This leads to errors in some cases.

--
209694652  by Zhichao Lu:

    Sample boxes by jittering around the currently given boxes.

--
209550300  by Zhichao Lu:

    `create_category_index_from_labelmap()` function now accepts `use_display_name` parameter.
    Also added create_categories_from_labelmap function for convenience

--
209490273  by Zhichao Lu:

    Check result_dict type before accessing image_id via key.

--
209442529  by Zhichao Lu:

    Introducing the capability to sample examples for evaluation. This makes it easy to specify one full epoch of evaluation, or a subset (e.g. sample 1 of every N examples).
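Sampling 1 of every N examples can be sketched with the standard library. The function name is an assumption and the real input pipeline operates on datasets rather than Python iterables.

```python
from itertools import islice


def sample_one_of_n(examples, n):
    """Yields every n-th example, starting with the first."""
    return islice(examples, 0, None, n)
```

With `n=1` the full epoch is evaluated; larger values of `n` thin the evaluation set evenly.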

--
208941150  by Zhichao Lu:

    Adding the capability of exporting the results in json format.

--
208888798  by Zhichao Lu:

    Fixes wrong dictionary key for num_det_boxes_per_image.

--
208873549  by Zhichao Lu:

    Reduce the number of HLO ops created by matmul_crop_and_resize.

    Do not unroll along the channels dimension. Instead, transpose the input image dimensions, apply tf.matmul and transpose back.

    The number of HLO instructions for 1024 channels reduces from 12368 to 110.

--
208844315  by Zhichao Lu:

    Add an option to use tf.image.non_max_suppression_padded in SSD post-process

--
208731380  by Zhichao Lu:

    Add field in box_predictor config to enable mask prediction and update builders accordingly.

--
208699405  by Zhichao Lu:

    This CL creates a keras-based multi-resolution feature map extractor.

--
208557208  by Zhichao Lu:

    Add TPU tests for Faster R-CNN Meta arch.

    * Tests that two_stage_predict and total_loss tests run successfully on TPU.
    * Small mods to multiclass_non_max_suppression to preserve static shapes.

--
208499278  by Zhichao Lu:

    This CL makes sure the Keras convolutional box predictor & head layers apply activation layers *after* normalization (as opposed to before).

--
208391694  by Zhichao Lu:

    Updating visualization tool to produce multiple evaluation images.

--
208275961  by Zhichao Lu:

    This CL adds a Keras version of the Convolutional Box Predictor, as well as more general infrastructure for making Keras Prediction heads & Keras box predictors.

--
208275585  by Zhichao Lu:

    This CL enables the Keras layer hyperparameter object to build a dedicated activation layer, and to disable activation by default in the op layer construction kwargs.

    This is necessary because in most cases the normalization layer must be applied before the activation layer. So, in Keras models we must set the convolution activation in a dedicated layer after normalization is applied, rather than setting it in the convolution layer construction args.

--
208263792  by Zhichao Lu:

    Add a new SSD mask meta arch that can predict masks for SSD models.
    Changes include:
     - overwrite loss function to add mask loss computation.
     - update ssd_meta_arch to handle masks if predicted in predict and postprocessing.

--
208000218  by Zhichao Lu:

    Make FasterRCNN choose static shape operations only in training mode.

--
207997797  by Zhichao Lu:

    Add static boolean_mask op to box_list_ops.py and use that in faster_rcnn_meta_arch.py to support use_static_shapes option.

--
207993460  by Zhichao Lu:

    Include FGVC detection models in model zoo.

--
207971213  by Zhichao Lu:

    remove the restriction to run tf.nn.top_k op on CPU

--
207961187  by Zhichao Lu:

    Build the first stage NMS function in the model builder and pass it to FasterRCNN meta arch.

--
207960608  by Zhichao Lu:

    Internal Change.

--
207927015  by Zhichao Lu:

    Have an option to use the TPU compatible NMS op cl/206673787, in the batch_multiclass_non_max_suppression function. On setting pad_to_max_output_size to true, the output nmsed boxes are padded to be of length max_size_per_class.

    This can be used in first stage Region Proposal Network in FasterRCNN model by setting the first_stage_nms_pad_to_max_proposals field to true in config proto.

--
207809668  by Zhichao Lu:

    Add option to use depthwise separable conv instead of conv2d in FPN and WeightSharedBoxPredictor. More specifically, there are two related configs:
    - SsdFeatureExtractor.use_depthwise
    - WeightSharedConvolutionalBoxPredictor.use_depthwise

--
207808651  by Zhichao Lu:

    Fix the static balanced positive negative sampler's TPU tests

--
207798658  by Zhichao Lu:

    Fixes a post-refactoring bug where the pre-prediction convolution layers in the convolutional box predictor are ignored.

--
207796470  by Zhichao Lu:

    Make slim endpoints visible in FasterRCNNMetaArch.

--
207787053  by Zhichao Lu:

    Refactor ssd_meta_arch so that the target assigner instance is passed into the SSDMetaArch constructor rather than constructed inside.

--

PiperOrigin-RevId: 212389173

* Fix detection model zoo typo.

* Modify tf example decoder to handle label maps with either `display_name` or `name` fields seamlessly.

Currently, tf example decoder uses only `name` field to look up ids for class text field present in the data. This change uses both `display_name` and `name` fields in the label map to fetch ids for class text.

PiperOrigin-RevId: 212672223

* Modify create_coco_tf_record tool to write out class text instead of class labels.

PiperOrigin-RevId: 212679112

* Fix detection model zoo typo.

PiperOrigin-RevId: 212715692

* Adding the following two optional flags to WeightSharedConvolutionalBoxHead:
1) In the box head, apply clipping to box encodings.
2) In the class head, apply sigmoid to class predictions at inference time.

PiperOrigin-RevId: 212723242

* Support class confidences in merge boxes with multiple labels.

PiperOrigin-RevId: 212884998

* Creates multiple eval specs for object detection.

PiperOrigin-RevId: 212894556

* Set batch_norm on last layer in Mask Head to None.

PiperOrigin-RevId: 213030087

* Enable bfloat16 training for object detection models.

PiperOrigin-RevId: 213053547

* Skip padding op when unnecessary.

PiperOrigin-RevId: 213065869

* Modify `Matchers` to use groundtruth weights before performing matching.

The groundtruth weights tensor is used to indicate padding in the groundtruth box tensor. It is handled in `TargetAssigner` by creating appropriate classification and regression target weights based on the groundtruth box each anchor matches to. However, options such as `force_match_all_rows` in `ArgmaxMatcher` force certain anchors to match to groundtruth boxes that are just padding, thereby reducing the number of anchors that could otherwise match to real groundtruth boxes.

For single-stage models like SSD the effect of this is negligible, as there are two orders of magnitude more anchors than padded groundtruth boxes. But for Faster R-CNN and Mask R-CNN, where there are only 300 anchors in the second stage, a significant number of these match to groundtruth padding, which reduces the number of anchors regressing to real groundtruth boxes and severely degrades performance.

Therefore, this change introduces an additional boolean argument `valid_rows` to `Matcher.match` methods, and the implementations now ignore such padded groundtruth boxes during matching.
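The effect can be reproduced with a simplified argmax matcher in plain Python. This is a sketch under stated assumptions, not the `ArgmaxMatcher` implementation: the function name, the threshold, and the nested-list similarity matrix (rows are groundtruth boxes, columns are anchors) are illustrative.

```python
def argmax_match(similarity, valid_rows, matched_threshold=0.5,
                 force_match_all_rows=True):
    """Returns, per anchor, the matched groundtruth row or -1."""
    num_anchors = len(similarity[0]) if similarity else 0
    rows = [r for r, valid in enumerate(valid_rows) if valid]
    matches = [-1] * num_anchors
    if not rows:
        return matches
    # Each anchor takes its best valid row if the overlap is high enough.
    for anchor in range(num_anchors):
        best = max(rows, key=lambda r: similarity[r][anchor])
        if similarity[best][anchor] >= matched_threshold:
            matches[anchor] = best
    # Each valid row then claims its best anchor, even below the threshold.
    if force_match_all_rows:
        for r in rows:
            best_anchor = max(range(num_anchors),
                              key=lambda a: similarity[r][a])
            matches[best_anchor] = r
    return matches
```

Treating an all-zero padded row as valid lets it claim an anchor that had already matched a real box; passing `valid_rows` prevents that.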

PiperOrigin-RevId: 213345395

* Add release note for iNaturalist Species trained models.

PiperOrigin-RevId: 213347179

* Fix the bug of uninitialized gt_is_crowd_list variable.

PiperOrigin-RevId: 213364858

* ...text exposed to open source public git repo...

PiperOrigin-RevId: 213554260

99256cf4

01 Aug, 2018 1 commit

Refactor object detection box predictors and fix some issues with model_main. (#4965) · 02a9969e

pkulzc authored Aug 01, 2018

* Merged commit includes the following changes:
206852642 by Zhichao Lu:

Build the balanced_positive_negative_sampler in the model builder for FasterRCNN. Also adds an option to use the static implementation of the sampler.

--
206803260 by Zhichao Lu:

Fixes a misplaced argument in resnet fpn feature extractor.

--
206682736 by Zhichao Lu:

This CL modifies the SSD meta architecture to support both Slim-based and Keras-based box predictors, and begins preparation for Keras box predictor support in the other meta architectures.

Concretely, this CL adds a new `KerasBoxPredictor` base class and makes the meta architectures appropriately call whichever box predictors they are using.

We can switch the non-ssd meta architectures to fully support Keras box predictors once the Keras Convolutional Box Predictor CL is submitted.

--
206669634 by Zhichao Lu:

Adds an alternate method for balanced positive negative sampler using static shapes.

--
206643278 by Zhichao Lu:

This CL adds a Keras layer hyperparameter configuration object to the hyperparams_builder.

It automatically converts from Slim layer hyperparameter configs to Keras layer hyperparameters. Namely, it:
- Builds Keras initializers/regularizers instead of Slim ones
- sets weights_regularizer/initializer to kernel_regularizer/initializer
- converts batchnorm decay to momentum
- converts Slim l2 regularizer weights to the equivalent Keras l2 weights

This will be used in the conversion of object detection feature extractors & box predictors to newer Tensorflow APIs.
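
A rough sketch of the conversions listed above, under stated assumptions: the dictionary keys and the function `slim_to_keras_params` are hypothetical and are not the hyperparams_builder API. The factor of 0.5 relies on Slim's l2 regularizer computing `scale * sum(w**2) / 2`, while the Keras l2 regularizer computes `l2 * sum(w**2)`.

```python
def slim_to_keras_params(slim_params):
    """Map hypothetical Slim-style settings to Keras-style settings."""
    return {
        # weights_* arguments are renamed to kernel_* arguments.
        "kernel_initializer": slim_params["weights_initializer"],
        # Halve the weight so the Keras penalty equals the Slim penalty.
        "kernel_regularizer_l2": 0.5 * slim_params["weights_regularizer_l2"],
        # The batch norm moving-average decay is passed as Keras momentum.
        "batch_norm_momentum": slim_params["batch_norm_decay"],
    }


def slim_l2_penalty(weights, scale):
    # Slim convention: scale * sum(w^2) / 2
    return scale * sum(w * w for w in weights) / 2.0


def keras_l2_penalty(weights, l2):
    # Keras convention: l2 * sum(w^2)
    return l2 * sum(w * w for w in weights)


slim_params = {
    "weights_initializer": "truncated_normal",
    "weights_regularizer_l2": 0.004,
    "batch_norm_decay": 0.997,
}
keras_params = slim_to_keras_params(slim_params)

weights = [1.0, -2.0, 3.0]
slim_penalty = slim_l2_penalty(weights, slim_params["weights_regularizer_l2"])
keras_penalty = keras_l2_penalty(weights, keras_params["kernel_regularizer_l2"])
```

With the halved weight, both conventions give the same penalty for the same weights.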

--
206611681 by Zhichao Lu:

Internal changes.

--
206591619 by Zhichao Lu:

Clip to the expected padded static shape when the input tensors are larger than that shape.

--
206517644 by Zhichao Lu:

Make MultiscaleGridAnchorGenerator more consistent with MultipleGridAnchorGenerator.

--
206415624 by Zhichao Lu:

Make the hardcoded feature pyramid network (FPN) levels configurable for both SSD
Resnet and SSD Mobilenet.

--
206398204 by Zhichao Lu:

This CL modifies the SSD meta architecture to support both Slim-based and Keras-based feature extractors.

This allows us to begin the conversion of object detection to newer Tensorflow APIs.

--
206213448 by Zhichao Lu:

Adding a method to compute the expected classification loss by background/foreground weighting.

--
206204232 by Zhichao Lu:

Adding the keypoint head to the Mask RCNN pipeline.

--
206200352 by Zhichao Lu:

- Create Faster R-CNN target assigner in the model builder. This allows configuring matchers in Target assigner to use TPU compatible ops (tf.gather in this case) without any change in meta architecture.
- As a positive side effect of the refactoring, we can now re-use a single target assigner for all of the second-stage heads in Faster R-CNN.

--
206178206 by Zhichao Lu:

Force ssd feature extractor builder to use keyword arguments so values won't be passed to wrong arguments.

--
206168297 by Zhichao Lu:

Updating exporter to use freeze_graph.freeze_graph_with_def_protos rather than a homegrown version.

--
206080748 by Zhichao Lu:

Merge external contributions.

--
206074460 by Zhichao Lu:

Update to preprocessor to apply temperature and softmax to the multiclass scores on read.
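
As an illustration only, the sketch below assumes the common convention of dividing the scores by the temperature before applying softmax. The commit message does not spell out the exact form used in the preprocessor, and `softmax_with_temperature` is a hypothetical name.

```python
import math


def softmax_with_temperature(scores, temperature=1.0):
    """Softmax over scores / temperature (assumed convention)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    max_scaled = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - max_scaled) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(scores, temperature=0.5)  # more peaked
flat = softmax_with_temperature(scores, temperature=5.0)   # closer to uniform
```

Under this convention, a temperature below 1 sharpens the distribution and a temperature above 1 flattens it.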

--
205960802 by Zhichao Lu:

Fixing a bug in hierarchical label expansion script.

--
205944686 by Zhichao Lu:

Update exporter to support exporting quantized model.

--
205912529 by Zhichao Lu:

Add a two stage matcher to allow for thresholding by one criterion and then argmaxing on the other.
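
One way to read this two-stage idea is sketched below. The function `two_stage_match`, its arguments, and the choice of IoU and a second score as the two criteria are assumptions for illustration, not the matcher added by this CL.

```python
def two_stage_match(threshold_scores, argmax_scores, threshold):
    """For each anchor (column), pick the groundtruth row with the highest
    argmax_scores value among rows whose threshold_scores value passes the
    threshold. Returns -1 for anchors with no passing row.
    """
    num_rows = len(threshold_scores)
    num_cols = len(threshold_scores[0]) if num_rows else 0
    matches = []
    for col in range(num_cols):
        # Stage 1: keep only rows that pass the threshold on the first criterion.
        candidates = [
            r for r in range(num_rows) if threshold_scores[r][col] >= threshold
        ]
        # Stage 2: argmax over the second criterion among the survivors.
        matches.append(
            max(candidates, key=lambda r: argmax_scores[r][col], default=-1)
        )
    return matches


iou = [[0.6, 0.2], [0.7, 0.1]]    # first criterion (hypothetical)
score = [[0.9, 0.5], [0.3, 0.8]]  # second criterion (hypothetical)
matches = two_stage_match(iou, score, threshold=0.5)
```

Here anchor 0 is matched to row 0 even though row 1 has the higher IoU, because both rows pass the threshold and row 0 wins on the second criterion. Anchor 1 has no row above the threshold and stays unmatched.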

--
205909017 by Zhichao Lu:

Add test for grayscale image_resizer

--
205892801 by Zhichao Lu:

Add flag to decide whether to apply batch norm to conv layers of weight shared box predictor.

--
205824449 by Zhichao Lu:

Make sure that by default the Mask R-CNN box predictor predicts 2 stages.

--
205730139 by Zhichao Lu:

Updating warning message to be more explicit about variable size mismatch.

--
205696992 by Zhichao Lu:

Remove utils/ops.py's dependency on core/box_list_ops.py. This will allow re-using TPU compatible ops from utils/ops.py in core/box_list_ops.py.

--
205696867 by Zhichao Lu:

Refactor the Mask R-CNN predictor so that each head is in a separate file.
This CL lets us add new heads to Mask R-CNN more easily in the future.

--
205492073 by Zhichao Lu:

Refactor R-FCN box predictor to be TPU compliant.

- Change utils/ops.py:position_sensitive_crop_regions to operate on single image and set of boxes without `box_ind`
- Add a batch version that operates on batches of images and batches of boxes.
- Refactor R-FCN box predictor to use the batched version of position sensitive crop regions.

--
205453567 by Zhichao Lu:

Fix a bug that prevented exporting the inference graph when the write_inference_graph flag is True.

--
205316039 by Zhichao Lu:

Changing input tensor name.

--
205256307 by Zhichao Lu:

Fix model zoo links for quantized model.

--
205164432 by Zhichao Lu:

Fixes eval error when label map contains non-ascii characters.

--
205129842 by Zhichao Lu:

Adds an option to clip the anchors to the window size without filtering the overlapped boxes in Faster-RCNN.

--
205094863 by Zhichao Lu:

Update the label map util to allow the option of adding a background class and filling in gaps in the label map. Useful for multiclass scores, which require a complete label map with an explicit background label.
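
A small sketch of what completing a label map could look like. The function `complete_label_map`, the list-of-dicts representation, and the `placeholder_N` names are hypothetical and are not the label map util's actual interface.

```python
def complete_label_map(categories, add_background=True):
    """Return categories with contiguous ids, filling gaps with placeholders.

    If add_background is True, id 0 is reserved for an explicit background
    class (assumption made for this sketch).
    """
    names = {c["id"]: c["name"] for c in categories}
    if add_background:
        names.setdefault(0, "background")
    if not names:
        return []
    return [
        {"id": i, "name": names.get(i, "placeholder_%d" % i)}
        for i in range(min(names), max(names) + 1)
    ]


categories = [{"id": 1, "name": "cat"}, {"id": 3, "name": "dog"}]
full = complete_label_map(categories)
```

The result covers ids 0 through 3, so a per-class score vector of length 4 lines up with the label map entry by entry.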

--
204989032 by Zhichao Lu:

Add tf.prof support to exporter.

--
204825267 by Zhichao Lu:

Modify mask rcnn box predictor tests for TPU compatibility.

--
204778749 by Zhichao Lu:

Remove score filtering from postprocessing.py and rely on filtering logic in tf.image.non_max_suppression

--
204775818 by Zhichao Lu:

Python3 fixes for object_detection.

--
204745920 by Zhichao Lu:

Object Detection Dataset visualization tool (documentation).

--
204686993 by Zhichao Lu:

Internal changes.

--
204559667 by Zhichao Lu:

Refactor box_predictor.py into multiple files.
The abstract base class remains in object_detection/core; the other classes have each moved to a separate file in object_detection/predictors.

--
204552847 by Zhichao Lu:

Update blog post link.

--
204508028 by Zhichao Lu:

Bump down the batch size to 1024 to be a bit more tolerant to OOM and double the number of iterations. This job still converges to 20.5 mAP in 3 hours.

PiperOrigin-RevId: 206852642

* Add original post-processing back.

02a9969e

02 Jul, 2018 1 commit

Open Images Challenge 2018 tools, minor fixes and refactors. (#4661) · 32e7d660

pkulzc authored Jul 02, 2018

* Merged commit includes the following changes:
202804536 by Zhichao Lu:

Return tf.data.Dataset from input_fn that goes into the estimator and use PER_HOST_V2 option for tpu input pipeline config.

This change shaves off 100ms per step resulting in 25 minutes of total reduced training time for ssd mobilenet v1 (15k steps to convergence).
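
The 25-minute figure follows directly from the two numbers quoted above, as this short check shows (the variable names are illustrative):

```python
steps_to_convergence = 15000  # from the commit message
saved_ms_per_step = 100       # from the commit message

# 15000 steps * 100 ms = 1,500,000 ms = 1500 s = 25 min
saved_minutes = steps_to_convergence * saved_ms_per_step / 1000 / 60
print(saved_minutes)
```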

--
202769340 by Zhichao Lu:

Adding as_matrix() transformation for image-level labels.

--
202768721 by Zhichao Lu:

Challenge evaluation protocol modification: adding labelmaps creation.

--
202750966 by Zhichao Lu:

Add the explicit names to two output nodes.

--
202732783 by Zhichao Lu:

Enforcing that batch size is 1 for evaluation, and no original images are retained during evaluation when use_tpu=False (to avoid dynamic shapes).

--
202425430 by Zhichao Lu:

Refactor input pipeline to improve performance.

--
202406389 by Zhichao Lu:

Only check the validity of `warmup_learning_rate` if it will be used.

--
202330450 by Zhichao Lu:

Adding the description of the flag input_image_label_annotations_csv to add
image-level labels to tf.Example.

--
202029012 by Zhichao Lu:

Enabling displaying relationship name in the final metrics output.

--
202024010 by Zhichao Lu:

Update to the public README.

--
201999677 by Zhichao Lu:

Fixing the way negative labels are handled in VRD evaluation.

--
201962313 by Zhichao Lu:

Fix a bug in resize_to_range.

--
201808488 by Zhichao Lu:

Update ssd_inception_v2_pets.config to use right filename of pets dataset tf records.

--
201779225 by Zhichao Lu:

Update object detection API installation doc

--
201766518 by Zhichao Lu:

Add shell script to create pycocotools package for CMLE.

--
201722377 by Zhichao Lu:

Removes verified_labels field and uses groundtruth_image_classes field instead.

--
201616819 by Zhichao Lu:

Disable eval_on_tpu since eval_metrics is not set up to execute on TPU.
Do not use run_config.task_type to switch tpu mode for EVAL,
since that won't work in unit test.
Expand unit test to verify that the same instantiation of the Estimator can independently disable eval on TPU whereas training is enabled on TPU.

--
201524716 by Zhichao Lu:

Disable exporting the model to TPU, since inference is not compatible with TPU.
Add GOOGLE_INTERNAL support in object detection copy.bara.sky

--
201453347 by Zhichao Lu:

Fixing bug when evaluating the quantized model.

--
200795826 by Zhichao Lu:

Fixing parsing bug: image-level labels are parsed as tuples instead of numpy
array.

--
200746134 by Zhichao Lu:

Adding image_class_text and image_class_label fields into tf_example_decoder.py

--
200743003 by Zhichao Lu:

Changes to model_main.py and model_tpu_main to enable training and continuous eval.

--
200736324 by Zhichao Lu:

Replace deprecated squeeze_dims argument.

--
200730072 by Zhichao Lu:

Make detections only during predict and eval mode while creating model function

--
200729699 by Zhichao Lu:

Minor correction to internal documentation (definition of Huber loss)

--
200727142 by Zhichao Lu:

Add command line parsing as a set of flags using argparse and add header to the
resulting file.

--
200726169 by Zhichao Lu:

A tutorial on running evaluation for the Open Images Challenge 2018.

--
200665093 by Zhichao Lu:

Cleanup on variables_helper_test.py.

--
200652145 by Zhichao Lu:

Add an option to write (non-frozen) graph when exporting inference graph.

--
200573810 by Zhichao Lu:

Update ssd_mobilenet_v1_coco and ssd_inception_v2_coco download links to point to a newer version.

--
200498014 by Zhichao Lu:

Add test for groundtruth mask resizing.

--
200453245 by Zhichao Lu:

Cleaning up exporting_models.md along with exporting scripts

--
200311747 by Zhichao Lu:

Resize groundtruth mask to match the size of the original image.

--
200287269 by Zhichao Lu:

Add an option to use a custom MatMul-based crop_and_resize op as an alternative to the TF op in Faster-RCNN.

--
200127859 by Zhichao Lu:

Updating the instructions to run locally with new binary. Also updating pets configs since file path naming has changed.

--
200127044 by Zhichao Lu:

A simpler evaluation util to compute Open Images Challenge
2018 metric (object detection track).

--
200124019 by Zhichao Lu:

Freshening up configuring_jobs.md

--
200086825 by Zhichao Lu:

Make merge_multiple_label_boxes work for ssd model.

--
199843258 by Zhichao Lu:

Allows inconsistent feature channels to be compatible with WeightSharedConvolutionalBoxPredictor.

--
199676082 by Zhichao Lu:

Enable an override for `InputReader.shuffle` for object detection pipelines.

--
199599212 by Zhichao Lu:

Markdown fixes.

--
199535432 by Zhichao Lu:

Pass num_additional_channels to tf.example decoder in predict_input_fn.

--
199399439 by Zhichao Lu:

Adding `num_additional_channels` field to specify how many additional channels to use in the model.

PiperOrigin-RevId: 202804536

* Add original model builder and docs back.

32e7d660

03 Apr, 2018 1 commit
- Provide option to perform in-place batch norm updates for ssd feature extractors. · d2c5bfac
  Zhichao Lu authored Mar 27, 2018
```
PiperOrigin-RevId: 190688309
```
  d2c5bfac
01 Feb, 2018 1 commit

Merged commit includes the following changes: · 7a9934df

Zhichao Lu authored Jan 31, 2018

184048729  by Zhichao Lu:

    Modify target_assigner so that it creates regression targets taking keypoints into account.

--
184027183  by Zhichao Lu:

    Resnet V1 FPN based feature extractors for SSD meta architecture in Object Detection V2 API.

--
184004730  by Zhichao Lu:

    Expose a lever to override the configured mask_type.

--
183933113  by Zhichao Lu:

    Weight shared convolutional box predictor as described in https://arxiv.org/abs/1708.02002

--
183929669  by Zhichao Lu:

    Expanding box list operations for future data augmentations.

--
183916792  by Zhichao Lu:

    Fix unrecognized assertion function in tests.

--
183906851  by Zhichao Lu:

    - Change ssd meta architecture to use regression weights to compute loss normalizer.

--
183871003  by Zhichao Lu:

    Fix config_util_test wrong dependency.

--
183782120  by Zhichao Lu:

    Add __init__ file to third_party directories.

--
183779109  by Zhichao Lu:

    Setup regular version s...

7a9934df

27 Oct, 2017 1 commit
- update protos. · 9adf0242
  Vivek Rathod authored Oct 27, 2017
  
  9adf0242
21 Sep, 2017 1 commit
- Move the research models into a research subfolder (#2430) · f87a58cd
  Neal Wu authored Sep 21, 2017
  
  f87a58cd
15 Jun, 2017 1 commit

Add Tensorflow Object Detection API. (#1561) · a4944a57

derekjchow authored Jun 14, 2017

For details see our paper:
"Speed/accuracy trade-offs for modern convolutional object detectors."
Huang J, Rathod V, Sun C, Zhu M, Korattikara A, Fathi A, Fischer I,
Wojna Z, Song Y, Guadarrama S, Murphy K, CVPR 2017
https://arxiv.org/abs/1611.10012

a4944a57