Commits · b968a6ce96789e1ffc2100b522a3bfaea45f26be · ModelZoo / ResNet50_tensorflow

13 Nov, 2019 1 commit

Merged commit includes the following changes: (#7800) · b968a6ce

Mark Sandler authored Nov 13, 2019

280142968  by Zhichao Lu:

    Opensource MobilenetEdgeTPU + ssdlite into third-party object detection APIs on EdgeTPU.

--
280134001  by Zhichao Lu:

    Adds MobilenetEdgeTpu + ssdlite into internal object detection APIs on EdgeTPU.

--
278941778  by Zhichao Lu:

    Add support for fixed input shapes for 'encoded_image_string_tensor' and 'tf_example' inputs.

--
278933274  by Zhichao Lu:

    Add a foolproof check to avoid using the 1x1 depthwise conv op.

--
278762192  by Zhichao Lu:

    Ensure correct number of iterations after training resumes.

--
278746440  by Zhichao Lu:

    Internal change.

--
278006953  by Zhichao Lu:

    Internal changes to tf.contrib symbols

--
278006330  by Zhichao Lu:

    Internal changes to tf.contrib symbols

--
277593959  by Zhichao Lu:

      Make the ssd_feature_extractor_test.py PY3 compatible. The "six.zip" will use "itertools.izip" in Python 2 and "zip" in Python 3.

--
277344551  by Zhichao Lu:

    Internal change.

--
277154953  by Zhichao Lu:

    Conditionally use keras based optimizers so that check-pointing works correctly.
    This change also enables summaries on TPU which were previously not enabled
    due to a bug.

--
277087572  by Zhichao Lu:

    Fix resizing boxes when using keep_aspect_ratio_resizer with padding.

--
275898543  by Zhichao Lu:

    Support label_map_proto as input in label_map_util.

--
275347137  by Zhichao Lu:

    Add force_no_resize flag in eval.proto which replaces
    the resize config with identity resizer. This is useful
    when we want to test at the original image resolution.

--

PiperOrigin-RevId: 280142968

b968a6ce

31 May, 2019 1 commit

Merged commit includes the following changes: (#6932) · 9bbf8015

pkulzc authored May 30, 2019

250447559 by Zhichao Lu:

Update expected files format for Instance Segmentation challenge:
- add fields ImageWidth, ImageHeight and store the values per prediction
- as mask, store only encoded image and assume its size is ImageWidth x ImageHeight

--
250402780 by rathodv:

Fix failing Mask R-CNN TPU convergence test.

Cast second stage prediction tensors from bfloat16 to float32 to prevent errors in third target assignment (Mask Prediction). Concat with different types bfloat16 and float32 isn't allowed.

--
250300240 by Zhichao Lu:

Add Open Images Challenge 2019 object detection and instance segmentation
support into Estimator framework.

--
249944839 by rathodv:

Modify exporter.py to add multiclass score nodes in exported inference graphs.

--
249935201 by rathodv:

Modify postprocess methods to preserve multiclass scores after non max suppression.

--
249878079 by Zhichao Lu:

This CL slightly refactors some Object Detection helper functions for data creation, evaluation, and groundtruth providing.

This will allow the eager+function custom loops to share code with the existing estimator training loops.

Concretely we make the following changes:
1. In input creation we separate dataset-creation into top-level helpers, and allow it to optionally accept a pre-constructed model directly instead of always creating a model from the config just for feature preprocessing.

2. In coco evaluation we split the update_op creation into its own function, which the custom loops will call directly.

3. In model_lib we move groundtruth providing/ datastructure munging into a helper function

4. For now we put an escape hatch in `_summarize_target_assignment` when executing in tf v2.0 behavior because the summary apis used only work w/ tf 1.x

--
249673507 by rathodv:

Use explicit casts instead of tf.to_float and tf.to_int32 to avoid warnings.

--
249656006 by Zhichao Lu:

Add named "raw_keypoint_locations" node that corresponds with the "raw_box_locations" node.

--
249651674 by rathodv:

Keep proposal boxes in float format. MatMulCropAndResize can handle the type even when the features themselves are bfloat16s.

--
249568633 by rathodv:

Support q > 1 in class agnostic NMS.
Break post_processing_test.py into 3 separate files to avoid linter errors.

--
249535530 by rathodv:

Update some deprecated arguments to tf ops.

--
249368223 by rathodv:

Modify MatMulCropAndResize to use MultiLevelRoIAlign method and move the tests to spatial_transform_ops.py module.

This cl establishes that CropAndResize and RoIAlign are equivalent and only differ in the sampling point grid within the boxes. CropAndResize uses a uniform size x size point grid such that the corner points exactly overlap box corners, while RoiAlign divides boxes into size x size cells and uses their centers as sampling points. In this cl, we switch MatMulCropAndResize to use the MultiLevelRoIAlign implementation with `align_corner` option as MultiLevelRoIAlign implementation is more memory efficient on TPU when compared to the original MatMulCropAndResize.
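
The difference between the two sampling grids can be illustrated in one dimension. This is a minimal sketch in plain Python, not the TPU implementation; the function names `crop_and_resize_points` and `roi_align_points` are hypothetical, and the single-point case of the first function (sampling the box center) is an assumption made for the example.

```python
def crop_and_resize_points(lo, hi, size):
    """Uniform grid of `size` points whose end points sit on the box edges."""
    if size == 1:
        # Degenerate case: assume a single sample at the box center.
        return [(lo + hi) / 2.0]
    step = (hi - lo) / (size - 1)
    return [lo + i * step for i in range(size)]


def roi_align_points(lo, hi, size):
    """Centers of `size` equal cells that tile the box."""
    cell = (hi - lo) / size
    return [lo + (i + 0.5) * cell for i in range(size)]


# Same box edge [0, 6] and same output size, two different grids.
corner_grid = crop_and_resize_points(0.0, 6.0, 3)
center_grid = roi_align_points(0.0, 6.0, 3)
print(corner_grid)  # [0.0, 3.0, 6.0]
print(center_grid)  # [1.0, 3.0, 5.0]
```

Applying either function along both the y and x axes of a box gives the full size x size grid; only the placement of the points differs.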

--
249337338 by chowdhery:

Add class-agnostic non-max-suppression in post_processing
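
For readers unfamiliar with the term, here is a minimal sketch of class-agnostic non-max suppression in plain Python. It is not the code added to post_processing; `iou` and `class_agnostic_nms` are hypothetical names, boxes are assumed to be `[ymin, xmin, ymax, xmax]`, and each box is assumed to carry a single score (for example its best class score), so boxes suppress each other regardless of class.

```python
def iou(a, b):
    """Intersection over union of two [ymin, xmin, ymax, xmax] boxes."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def class_agnostic_nms(boxes, scores, iou_threshold, max_output_size):
    """Greedy NMS over all boxes at once, ignoring class labels."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if len(keep) == max_output_size:
            break
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.9], [2.0, 2.0, 3.0, 3.0]]
scores = [0.9, 0.8, 0.7]
kept = class_agnostic_nms(boxes, scores, iou_threshold=0.5, max_output_size=10)
print(kept)  # [0, 2]
```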

--
249139196 by Zhichao Lu:

Fix positional argument bug in export_tflite_ssd_graph

--
249120219 by Zhichao Lu:

Add evaluator for computing precision limited to a given recall range.

--
249030593 by Zhichao Lu:

Evaluation util to run segmentation and detection challenge evaluation.

--
248554358 by Zhichao Lu:

This change contains the auxiliary changes required for TF 2.0 style training with eager+functions+dist strat loops, but not the loops themselves.

It includes:
- Updates to shape usage to support both tensorshape v1 and tensorshape v2
- A fix to FreezableBatchNorm to not override the `training` arg in call when `None` was passed to the constructor (Not an issue in the estimator loops but it was in the custom loops)
- Puts some constants in init_scope so they work in eager + functions
- Makes learning rate schedules return a callable in eager mode (required so they update when the global_step changes)
- Makes DetectionModel a tf.module so it tracks variables (e.g. ones nested in layers)
- Removes some references to `op.name` for some losses and replaces it w/ explicit names
- A small part of the change to allow the coco evaluation metrics to work in eager mode
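
The point about learning rate schedules returning a callable can be sketched without TensorFlow. In eager execution a plain value is computed once and never changes, while a callable re-reads the step each time it is invoked. The `Step` class and `make_schedule` function below are hypothetical stand-ins for the global step variable and the schedule builders, and the staircase decay formula is only an example.

```python
class Step:
    """Hypothetical stand-in for a mutable global_step variable."""

    def __init__(self):
        self.value = 0


def make_schedule(global_step, base_lr, decay_factor, decay_steps):
    """Return a zero-argument callable instead of a precomputed number."""

    def learning_rate():
        # Evaluated on every call, so it tracks the current step.
        return base_lr * decay_factor ** (global_step.value // decay_steps)

    return learning_rate


step = Step()
lr = make_schedule(step, base_lr=1.0, decay_factor=0.5, decay_steps=10)
first = lr()
step.value = 10
second = lr()
print(first, second)  # 1.0 0.5
```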

--
248271226 by rathodv:

Add MultiLevel RoIAlign op.

--
248229103 by rathodv:

Add functions to 1. pad feature maps 2. ravel 5-D indices

--
248206769 by rathodv:

Add utilities needed to introduce RoI Align op.

--
248177733 by pengchong:

Internal changes

--
247742582 by Zhichao Lu:

Open Images Challenge 2019 instance segmentation metric: part 2

--
247525401 by Zhichao Lu:

Update comments on max_class_per_detection.

--
247520753 by rathodv:

Add multilevel crop and resize operation that builds on top of matmul_crop_and_resize.

--
247391600 by Zhichao Lu:

Open Images Challenge 2019 instance segmentation metric

--
247325813 by chowdhery:

Quantized MobileNet v2 SSD FPNLite config with depth multiplier 0.75

PiperOrigin-RevId: 250447559

9bbf8015

30 Nov, 2018 1 commit

Merged commit includes the following changes: · a1337e01

Zhichao Lu authored Nov 27, 2018

223075771 by lzc:

Bring in external fixes.

--
222919755 by ronnyvotel:

Bug fix in faster r-cnn model builder. Was previously using `inplace_batchnorm_update` for `reuse_weights`.

--
222885680 by Zhichao Lu:

Use the result_dict_for_batched_example in models_lib
Also fixes the visualization size when eval is on GPU

--
222883648 by Zhichao Lu:

Fix _unmatched_class_label for the _add_background_class == False case in ssd_meta_arch.py.

--
222836663 by Zhichao Lu:

Adding support for visualizing grayscale images. Without this change, the images are black-red instead of grayscale.

--
222501978 by Zhichao Lu:

Fix a bug that caused convert_to_grayscale flag not to be respected.

--
222432846 by richardmunoz:

Fix mapping of groundtruth_confidences from shape [num_boxes] to [num_boxes, num_classes] when the input contains the groundtruth_confidences field.
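
A minimal sketch of the shape change, in plain Python. The commit does not say how the per-box values are spread across classes; the version below assumes each box's confidence is written at the index of its class and zeros elsewhere. `expand_confidences` is a hypothetical name.

```python
def expand_confidences(classes, confidences, num_classes):
    """Map [num_boxes] confidences to a [num_boxes, num_classes] matrix.

    Assumption: the confidence of box i is placed in column classes[i].
    """
    rows = []
    for cls, conf in zip(classes, confidences):
        row = [0.0] * num_classes
        row[cls] = conf
        rows.append(row)
    return rows


expanded = expand_confidences(classes=[2, 0], confidences=[0.7, 1.0], num_classes=3)
print(expanded)  # [[0.0, 0.0, 0.7], [1.0, 0.0, 0.0]]
```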

--
221725755 by richardmunoz:

Internal change.

--
221458536 by Zhichao Lu:

Fix saver defer build bug in object detection train codepath.

--
221391590 by Zhichao Lu:

Add support for group normalization in the object detection API. Just adding MobileNet-v1 SSD currently. This may serve as a road map for other models that wish to support group normalization as an option.

--
221367993 by Zhichao Lu:

Bug fixes (1) Make RandomPadImage work, (2) Fix keep_checkpoint_every_n_hours.

--
221266403 by rathodv:

Use detection boxes as proposals to compute correct mask loss in eval jobs.

--
220845934 by lzc:

Internal change.

--
220778850 by Zhichao Lu:

Incorporating existing metrics into Estimator framework.
Should restore:
-oid_challenge_detection_metrics
-pascal_voc_detection_metrics
-weighted_pascal_voc_detection_metrics
-pascal_voc_instance_segmentation_metrics
-weighted_pascal_voc_instance_segmentation_metrics
-oid_V2_detection_metrics

--
220370391 by alirezafathi:

Adding precision and recall to the metrics.

--
220321268 by Zhichao Lu:

Allow the option of setting max_examples_to_draw to zero.

--
220193337 by Zhichao Lu:

This CL fixes a bug where the Keras convolutional box predictor was applying heads in the non-deterministic dict order. The consequence of this bug was that variables were created in non-deterministic orders. This in turn led different workers in a multi-gpu training setup to have slightly different graphs which had variables assigned to mismatched parameter servers. As a result, roughly half of all workers were unable to initialize and did no work, and training time was slowed down approximately 2x.
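
The general remedy for this class of bug is to iterate over the heads in a fixed order. The sketch below is plain Python, not the box predictor code; it assumes sorting by key is an acceptable fixed order, which the commit message does not confirm. `apply_heads` and the head names are illustrative only.

```python
def apply_heads(heads, features):
    """Apply prediction heads in sorted-key order so creation order is stable."""
    outputs = {}
    creation_order = []
    for name in sorted(heads):
        outputs[name] = heads[name](features)
        creation_order.append(name)
    return outputs, creation_order


# Two workers that built the same dict with different insertion orders.
heads_a = {"box_encodings": lambda x: x * 2, "class_predictions": lambda x: x + 1}
heads_b = {"class_predictions": lambda x: x + 1, "box_encodings": lambda x: x * 2}

outputs_a, order_a = apply_heads(heads_a, 3)
outputs_b, order_b = apply_heads(heads_b, 3)
print(order_a == order_b)  # True
```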

--
220136508 by huizhongc:

Add weight equalization loss to SSD meta arch.

--
220125875 by pengchong:

Rename label_scores to label_weights

--
219730108 by Zhichao Lu:

Add description of detection_keypoints in postprocessed_tensors to docstring.

--
219577519 by pengchong:

Support parsing the class confidences and training using them.

--
219547611 by lzc:

Stop using static shapes in GPU eval jobs.

--
219536476 by Zhichao Lu:

Migrate TensorFlow Lite out of tensorflow/contrib

This change moves //tensorflow/contrib/lite to //tensorflow/lite in preparation
for TensorFlow 2.0's deprecation of contrib/. If you refer to TF Lite build
targets or headers, you will need to update them manually. If you use TF Lite
from the TensorFlow python package, "tf.contrib.lite" now points to "tf.lite".
Please update your imports as soon as possible.

For more details, see https://groups.google.com/a/tensorflow.org/forum/#!topic/tflite/iIIXOTOFvwQ

@angersson and @aselle are conducting this migration. Please contact them if
you have any further questions.

--
219190083 by Zhichao Lu:

Add a second expected_loss_weights function using an alternative expectation calculation compared to previous. Integrate this op into ssd_meta_arch and losses builder. Affects files that use losses_builder.build to handle the returning of an additional element.

--
218924451 by pengchong:

Add a new way to assign training targets using groundtruth confidences.

--
218760524 by chowdhery:

Modify export script to add option for regular NMS in TFLite post-processing op.

PiperOrigin-RevId: 223075771

a1337e01

02 Nov, 2018 1 commit

Minor fixes for object detection (#5613) · 31ae57eb

pkulzc authored Nov 02, 2018

* Internal change.

PiperOrigin-RevId: 213914693

* Add original_image_spatial_shape tensor in input dictionary to store shape of the original input image

PiperOrigin-RevId: 214018767

* Remove "groundtruth_confidences" from decoders use "groundtruth_weights" to indicate label confidence.

This also solves a bug that only surfaced now - random crop routines in core/preprocessor.py did not correctly handle "groundtruth_weight" tensors returned by the decoders.

PiperOrigin-RevId: 214091843

* Update CocoMaskEvaluator to allow for a batch of image info, rather than a single image.

PiperOrigin-RevId: 214295305

* Adding the option to be able to summarize gradients.

PiperOrigin-RevId: 214310875

* Adds FasterRCNN inference on CPU

1. Adds a flag use_static_shapes_for_eval to restrict to the ops that guarantees static shape.
2. No filtering of overlapping anchors while clipping the anchors when use_static_shapes_for_eval is set to True.
3. Adds test for faster_rcnn_meta_arch for predict and postprocess in inference mode for first and second stages.

PiperOrigin-RevId: 214329565

* Fix model_lib eval_spec_names assignment (integer->string).

PiperOrigin-RevId: 214335461

* Refactor Mask HEAD to optionally upsample after applying convolutions on ROI crops.

PiperOrigin-RevId: 214338440

* Uses final_exporter_name as exporter_name for the first eval spec for backward compatibility.

PiperOrigin-RevId: 214522032

* Add reshaped `mask_predictions` tensor to the prediction dictionary in `_predict_third_stage` method to allow computing mask loss in eval job.

PiperOrigin-RevId: 214620716

* Add support for fully conv training to fpn.

PiperOrigin-RevId: 214626274

* Fix the preprocess() function in Resnet v1 to make it work for any number of input channels.

Note: If the #channels != 3, this will simply skip the mean subtraction in preprocess() function.
PiperOrigin-RevId: 214635428
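
A minimal sketch of the behavior described in the note, operating on a single pixel in plain Python. The per-channel mean values are an assumption (the common ImageNet RGB means), and `preprocess_pixel` is a hypothetical name rather than the feature extractor's method.

```python
# Assumed RGB channel means; not taken from the commit message.
CHANNEL_MEANS = (123.68, 116.779, 103.939)


def preprocess_pixel(pixel):
    """Subtract channel means only when the pixel has exactly 3 channels."""
    if len(pixel) != 3:
        # Any other channel count passes through unchanged.
        return list(pixel)
    return [value - mean for value, mean in zip(pixel, CHANNEL_MEANS)]


print(preprocess_pixel([10.0]))                      # [10.0]
print(preprocess_pixel([123.68, 116.779, 103.939]))  # [0.0, 0.0, 0.0]
```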

* Wrap result_dict_for_single_example in eval_util to run for batched examples.

PiperOrigin-RevId: 214678514

* Adds PNASNet-based (ImageNet model) feature extractor for SSD.

PiperOrigin-RevId: 214988331

* Update documentation

PiperOrigin-RevId: 215243502

* Correct index used to compute number of groundtruth/detection boxes in COCOMaskEvaluator.

Due to incorrect indexing in cl/214295305, only the first detection mask and first groundtruth mask for a given image are fed to the COCO Mask evaluation library. Since groundtruth masks are arranged in no particular order, the first and highest scoring detection mask (detection masks are ordered by score) won't match the first and only groundtruth retained in all cases. This is I think why mask evaluation metrics do not get better than ~11 mAP. Note that this code path is only active when using model_main.py binary for evaluation.

This change fixes the indices and modifies an existing test case to cover it.

PiperOrigin-RevId: 215275936

* Fixing grayscale_image_resizer to accept mask as input.

PiperOrigin-RevId: 215345836

* Add an option not to clip groundtruth boxes during preprocessing. Clipping boxes adversely affects training for partially occluded or large objects, especially for fully conv models. Clipping already occurs during postprocessing, and should not occur during training.

PiperOrigin-RevId: 215613379

* Always return recalls and precisions with length equal to the number of classes.

The previous behavior of ObjectDetectionEvaluation was somewhat dangerous: when no groundtruth boxes were present, the lists of per-class precisions and recalls were simply truncated. Unless you were aware of this phenomenon (and consulted the `num_gt_instances_per_class` vector) it was difficult to associate each metric with each class.
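
A minimal sketch of the fixed-length behavior, in plain Python. The commit does not state which placeholder is used for classes without groundtruth; NaN is assumed here. `per_class_values` is a hypothetical helper, not part of ObjectDetectionEvaluation.

```python
import math


def per_class_values(values_by_class, num_classes):
    """Return one entry per class; classes with no data get NaN (assumed)."""
    return [values_by_class.get(c, float("nan")) for c in range(num_classes)]


# Class 1 has no groundtruth boxes, so it has no computed precision.
precisions = per_class_values({0: 0.5, 2: 0.75}, num_classes=3)
print(len(precisions))             # 3
print(math.isnan(precisions[1]))   # True
```

With a fixed length, index `i` always refers to class `i`, which is the association the truncated lists made hard to recover.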

PiperOrigin-RevId: 215633711

* Expose the box feature node in SSD.

PiperOrigin-RevId: 215653316

* Fix ssd mobilenet v2 _CONV_DEFS overwriting issue.

PiperOrigin-RevId: 215654160

* More documentation updates

PiperOrigin-RevId: 215656580

* Add pooling + residual option in multi_resolution_feature_maps. It adds an average pooling and a residual layer between feature maps with matching depth. Designed to be used with WeightSharedBoxPredictor.

PiperOrigin-RevId: 215665619

* Only call create_modificed_mobilenet_config on init if use_depthwise is true.

PiperOrigin-RevId: 215784290

* Only call create_modificed_mobilenet_config on init if use_depthwise is true.

PiperOrigin-RevId: 215837524

* Don't prune keypoints if clip_boxes is false.

PiperOrigin-RevId: 216187642

* Makes sure "key" field exists in the result dictionary.

PiperOrigin-RevId: 216456543

* Add add_background_class parameter to allow disabling the inclusion of a background class.

PiperOrigin-RevId: 216567612

* Update expected_classification_loss_under_sampling to better account for expected sampling.

PiperOrigin-RevId: 216712287

* Let the evaluation receive an evaluation class in its constructor.

PiperOrigin-RevId: 216769374

* This CL adds model building & training support for end-to-end Keras-based SSD models. If a Keras feature extractor's name is specified in the model config (e.g. 'ssd_mobilenet_v2_keras'), the model will use that feature extractor and a corresponding Keras-based box predictor.

This CL makes sure regularization losses & batch norm updates work correctly when training models that have Keras-based components. It also updates the default hyperparameter settings of the keras-based mobilenetV2 (when not overriding hyperparams) to more closely match the legacy Slim training scope.

PiperOrigin-RevId: 216938707

* Adding the ability in the coco evaluator to indicate whether an image has been annotated. For a non-annotated image, detections and groundtruth are not supplied.

PiperOrigin-RevId: 217316342

* Release the 8k minival dataset ids for MSCOCO, used in Huang et al. "Speed/accuracy trade-offs for modern convolutional object detectors" (https://arxiv.org/abs/1611.10012)

PiperOrigin-RevId: 217549353

* Exposes weighted_sigmoid_focal loss for faster rcnn classifier

PiperOrigin-RevId: 217601740

* Add detection_features to output nodes. The shape of the feature is [batch_size, max_detections, depth].

PiperOrigin-RevId: 217629905

* FPN uses a custom NN resize op for TPU-compatibility. Replace this op with the Tensorflow version at export time for TFLite-compatibility.

PiperOrigin-RevId: 217721184

* Compute `num_groundtruth_boxes` in inputs.transform_input_data_fn after data augmentation instead of decoders.

PiperOrigin-RevId: 217733432

* 1. Stop gradients from flowing into groundtruth masks with zero paddings.
2. Normalize pixelwise cross entropy loss across the whole batch.

PiperOrigin-RevId: 217735114

* Optimize Input pipeline for Mask R-CNN on TPU with bfloat16: improve the step time from:
1663.6 ms -> 1184.2 ms, about 28.8% improvement.

PiperOrigin-RevId: 217748833

* Fixes to export a TPU compatible model

Adds nodes to each of the output tensor. Also increments the value of class labels by 1.

PiperOrigin-RevId: 217856760

* API changes:
 - change the interface of target assigner to return per-class weights.
 - change the interface of classification loss to take per-class weights.

PiperOrigin-RevId: 217968393

* Add an option to override pipeline config in export_saved_model using command line arg

PiperOrigin-RevId: 218429292

* Include Quantized trained MobileNet V2 SSD and FaceSsd in model zoo.

PiperOrigin-RevId: 218530947

* Write final config to disk in `train` mode only.

PiperOrigin-RevId: 218735512

31ae57eb

21 Sep, 2018 1 commit

Release iNaturalist Species-trained models, refactor of evaluation, box... · 99256cf4

pkulzc authored Sep 21, 2018

Release iNaturalist Species-trained models, refactor of evaluation, box predictor for object detection. (#5289)

* Merged commit includes the following changes:
212389173  by Zhichao Lu:

    1. Replace tf.boolean_mask with tf.where

--
212282646  by Zhichao Lu:

    1. Fix a typo in model_builder.py and add a test to cover it.

--
212142989  by Zhichao Lu:

    Only resize masks in meta architecture if it has not already been resized in the input pipeline.

--
212136935  by Zhichao Lu:

    Choose matmul or native crop_and_resize in the model builder instead of faster r-cnn meta architecture.

--
211907984  by Zhichao Lu:

    Make eval input reader repeated field and update config util to handle this field.

--
211858098  by Zhichao Lu:

    Change the implementation of merge_boxes_with_multiple_labels.

--
211843915  by Zhichao Lu:

    Add Mobilenet v2 + FPN support.

--
211655076  by Zhichao Lu:

    Bug fix for generic keys in config overrides

    In generic configuration overrides, we had a duplicate entry for train_input_config and we were missing the eval_input_config and eval_config.

    This change also introduces testing for all config overrides.

--
211157501  by Zhichao Lu:

    Make the locally-modified conv defs a copy.

    So that it doesn't modify MobileNet conv defs globally for other code that
    transitively imports this package.

--
211112813  by Zhichao Lu:

    Refactoring visualization tools for Estimator's eval_metric_ops. This will make it easier for future models to take advantage of a single interface and mechanics.

--
211109571  by Zhichao Lu:

    A test decorator.

--
210747685  by Zhichao Lu:

    For FPN, when use_depthwise is set to true, use slightly modified mobilenet v1 config.

--
210723882  by Zhichao Lu:

    Integrating the losses mask into the meta architectures. When providing groundtruth, one can optionally specify annotation information (i.e. which images are labeled vs. unlabeled). For any image that is unlabeled, there is no loss accumulation.

--
210673675  by Zhichao Lu:

    Internal change.

--
210546590  by Zhichao Lu:

    Internal change.

--
210529752  by Zhichao Lu:

    Support batched inputs with ops.matmul_crop_and_resize.

    With this change the new inputs are images of shape [batch, height, width, depth] and boxes of shape [batch, num_boxes, 4]. The output tensor is of the shape [batch, num_boxes, crop_height, crop_width, depth].

--
210485912  by Zhichao Lu:

    Fix TensorFlow version check in object_detection_tutorial.ipynb

--
210484076  by Zhichao Lu:

    Reduce TPU memory required for single image matmul_crop_and_resize.

    Using tf.einsum eliminates intermediate tensors, tiling and expansion. For an image of size [40, 40, 1024] and boxes of shape [300, 4], HBM memory usage goes down from 3.52G to 1.67G.

--
210468361  by Zhichao Lu:

    Remove PositiveAnchorLossCDF/NegativeAnchorLossCDF to resolve "Main thread is not in main loop error" issue in local training.

--
210100253  by Zhichao Lu:

    Pooling pyramid feature maps: add option to replace max pool with convolution layers.

--
209995842  by Zhichao Lu:

    Fix a bug which prevents variable sharing in Faster RCNN.

--
209965526  by Zhichao Lu:

    Add support for enabling export_to_tpu through the estimator.

--
209946440  by Zhichao Lu:

    Replace deprecated tf.train.Supervisor with tf.train.MonitoredSession. MonitoredSession also takes away the hassle of starting queue runners.

--
209888003  by Zhichao Lu:

    Implement function to handle data where source_id is not set.

    If the field source_id is found to be the empty string for any image during runtime, it will be replaced with a random string. This avoids hash collisions on datasets where many examples do not have source_id set. Those hash collisions have unintended side effects and may lead to bugs in the detection pipeline.
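
A minimal sketch of the replacement rule in plain Python. The commit does not say how the random string is generated; a UUID is assumed here, and `fill_missing_source_id` is a hypothetical name.

```python
import uuid


def fill_missing_source_id(source_id):
    """Replace an empty source_id with a random string (UUID assumed)."""
    if source_id == "":
        return uuid.uuid4().hex
    return source_id


kept = fill_missing_source_id("image_0001")
first = fill_missing_source_id("")
second = fill_missing_source_id("")
print(kept)             # image_0001
print(first != second)  # True
```

Because every empty id receives its own value, examples without a source_id no longer hash to the same key.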

--
209842134  by Zhichao Lu:

    Converting loss mask into multiplier, rather than using it as a boolean mask (which changes tensor shape). This is necessary, since other utilities (e.g. hard example miner) require a loss matrix with the same dimensions as the original prediction tensor.
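
The shape argument can be shown with nested lists in plain Python. This is only a sketch of the idea; `mask_losses` is a hypothetical name, and the losses are assumed to be laid out as one row per image.

```python
def mask_losses(per_anchor_losses, image_is_labeled):
    """Zero out losses of unlabeled images while keeping the matrix shape.

    per_anchor_losses: list of rows, one row of anchor losses per image.
    image_is_labeled: one boolean per image.
    """
    return [
        [loss * (1.0 if labeled else 0.0) for loss in row]
        for row, labeled in zip(per_anchor_losses, image_is_labeled)
    ]


losses = [[0.5, 1.5], [2.0, 3.0]]
masked = mask_losses(losses, [True, False])
print(masked)  # [[0.5, 1.5], [0.0, 0.0]]
```

A boolean mask would instead drop the second row, which is the shape change that downstream utilities such as the hard example miner cannot accept.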

--
209768066  by Zhichao Lu:

    Adding ability to remove loss computation from specific images in a batch, via an optional boolean mask.

--
209722556  by Zhichao Lu:

    Remove dead code.

    (_USE_C_API was flipped to True by default in TensorFlow 1.8)

--
209701861  by Zhichao Lu:

    This CL cleans-up some tf.Example creation snippets, by reusing the convenient tf.train.Feature building functions in dataset_util.

--
209697893  by Zhichao Lu:

    Do not overwrite num_epoch for eval input. This leads to errors in some cases.

--
209694652  by Zhichao Lu:

    Sample boxes by jittering around the currently given boxes.

--
209550300  by Zhichao Lu:

    `create_category_index_from_labelmap()` function now accepts `use_display_name` parameter.
    Also added create_categories_from_labelmap function for convenience

--
209490273  by Zhichao Lu:

    Check result_dict type before accessing image_id via key.

--
209442529  by Zhichao Lu:

    Introducing the capability to sample examples for evaluation. This makes it easy to specify one full epoch of evaluation, or a subset (e.g. sample 1 of every N examples).
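
A minimal sketch of "sample 1 of every N examples" in plain Python. Which example of each group is kept is an assumption (the first one here), and `sample_one_of_n` is a hypothetical name rather than the input reader option itself.

```python
def sample_one_of_n(examples, n):
    """Yield the first example out of every consecutive group of n."""
    for index, example in enumerate(examples):
        if index % n == 0:
            yield example


subset = list(sample_one_of_n(range(10), 3))
full_epoch = list(sample_one_of_n(range(10), 1))
print(subset)  # [0, 3, 6, 9]
```

Setting `n` to 1 keeps every example, which corresponds to one full epoch of evaluation.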

--
208941150  by Zhichao Lu:

    Adding the capability of exporting the results in json format.

--
208888798  by Zhichao Lu:

    Fixes wrong dictionary key for num_det_boxes_per_image.

--
208873549  by Zhichao Lu:

    Reduce the number of HLO ops created by matmul_crop_and_resize.

    Do not unroll along the channels dimension. Instead, transpose the input image dimensions, apply tf.matmul and transpose back.

    The number of HLO instructions for 1024 channels reduce from 12368 to 110.

--
208844315  by Zhichao Lu:

    Add an option to use tf.image.non_max_suppression_padded in SSD post-process

--
208731380  by Zhichao Lu:

    Add field in box_predictor config to enable mask prediction and update builders accordingly.

--
208699405  by Zhichao Lu:

    This CL creates a keras-based multi-resolution feature map extractor.

--
208557208  by Zhichao Lu:

    Add TPU tests for Faster R-CNN Meta arch.

    * Tests that two_stage_predict and total_loss tests run successfully on TPU.
    * Small mods to multiclass_non_max_suppression to preserve static shapes.

--
208499278  by Zhichao Lu:

    This CL makes sure the Keras convolutional box predictor & head layers apply activation layers *after* normalization (as opposed to before).

--
208391694  by Zhichao Lu:

    Updating visualization tool to produce multiple evaluation images.

--
208275961  by Zhichao Lu:

    This CL adds a Keras version of the Convolutional Box Predictor, as well as more general infrastructure for making Keras Prediction heads & Keras box predictors.

--
208275585  by Zhichao Lu:

    This CL enables the Keras layer hyperparameter object to build a dedicated activation layer, and to disable activation by default in the op layer construction kwargs.

    This is necessary because in most cases the normalization layer must be applied before the activation layer. So, in Keras models we must set the convolution activation in a dedicated layer after normalization is applied, rather than setting it in the convolution layer construction args.

--
208263792  by Zhichao Lu:

    Add a new SSD mask meta arch that can predict masks for SSD models.
    Changes including:
     - overwrite loss function to add mask loss computation.
     - update ssd_meta_arch to handle masks if predicted in predict and postprocessing.

--
208000218  by Zhichao Lu:

    Make FasterRCNN choose static shape operations only in training mode.

--
207997797  by Zhichao Lu:

    Add static boolean_mask op to box_list_ops.py and use that in faster_rcnn_meta_arch.py to support use_static_shapes option.

--
207993460  by Zhichao Lu:

    Include FGVC detection models in model zoo.

--
207971213  by Zhichao Lu:

    remove the restriction to run tf.nn.top_k op on CPU

--
207961187  by Zhichao Lu:

    Build the first stage NMS function in the model builder and pass it to FasterRCNN meta arch.

--
207960608  by Zhichao Lu:

    Internal Change.

--
207927015  by Zhichao Lu:

    Have an option to use the TPU compatible NMS op cl/206673787, in the batch_multiclass_non_max_suppression function. On setting pad_to_max_output_size to true, the output nmsed boxes are padded to be of length max_size_per_class.

    This can be used in first stage Region Proposal Network in FasterRCNN model by setting the first_stage_nms_pad_to_max_proposals field to true in config proto.

--
207809668  by Zhichao Lu:

    Add option to use depthwise separable conv instead of conv2d in FPN and WeightSharedBoxPredictor. More specifically, there are two related configs:
    - SsdFeatureExtractor.use_depthwise
    - WeightSharedConvolutionalBoxPredictor.use_depthwise

--
207808651  by Zhichao Lu:

    Fix the static balanced positive negative sampler's TPU tests

--
207798658  by Zhichao Lu:

    Fixes a post-refactoring bug where the pre-prediction convolution layers in the convolutional box predictor are ignored.

--
207796470  by Zhichao Lu:

    Make slim endpoints visible in FasterRCNNMetaArch.

--
207787053  by Zhichao Lu:

    Refactor ssd_meta_arch so that the target assigner instance is passed into the SSDMetaArch constructor rather than constructed inside.

--

PiperOrigin-RevId: 212389173

* Fix detection model zoo typo.

* Modify tf example decoder to handle label maps with either `display_name` or `name` fields seamlessly.

Currently, tf example decoder uses only `name` field to look up ids for class text field present in the data. This change uses both `display_name` and `name` fields in the label map to fetch ids for class text.

PiperOrigin-RevId: 212672223

* Modify create_coco_tf_record tool to write out class text instead of class labels.

PiperOrigin-RevId: 212679112

* Fix detection model zoo typo.

PiperOrigin-RevId: 212715692

* Adding the following two optional flags to WeightSharedConvolutionalBoxHead:
1) In the box head, apply clipping to box encodings.
2) In the class head, apply sigmoid to class predictions at inference time.

PiperOrigin-RevId: 212723242

* Support class confidences in merge boxes with multiple labels.

PiperOrigin-RevId: 212884998

* Creates multiple eval specs for object detection.

PiperOrigin-RevId: 212894556

* Set batch_norm on last layer in Mask Head to None.

PiperOrigin-RevId: 213030087

* Enable bfloat16 training for object detection models.

PiperOrigin-RevId: 213053547

* Skip padding op when unnecessary.

PiperOrigin-RevId: 213065869

* Modify `Matchers` to use groundtruth weights before performing matching.

Groundtruth weights tensor is used to indicate padding in groundtruth box tensor. It is handled in `TargetAssigner` by creating appropriate classification and regression target weights based on the groundtruth box each anchor matches to. However, options such as `force_match_all_rows` in `ArgmaxMatcher` force certain anchors to match to groundtruth boxes that are just paddings thereby reducing the number of anchors that could otherwise match to real groundtruth boxes.

For single stage models like SSD the effect of this is negligible as there are two orders of magnitude more anchors than the number of padded groundtruth boxes. But for Faster R-CNN and Mask R-CNN where there are only 300 anchors in the second stage, a significant number of these match to groundtruth paddings reducing the number of anchors regressing to real groundtruth boxes degrading the performance severely.

Therefore, this change introduces an additional boolean argument `valid_rows` to `Matcher.match` methods and the implementations now ignore such padded groundtruth boxes during matching.
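
The effect of `valid_rows` can be sketched with a simplified argmax matcher in plain Python. This is not the `ArgmaxMatcher` implementation: thresholds and `force_match_all_rows` are left out, and `argmax_match` is a hypothetical function. Rows are groundtruth boxes (possibly padding) and columns are anchors.

```python
def argmax_match(similarity, valid_rows):
    """Match each anchor (column) to its best valid groundtruth row.

    Returns one row index per anchor, or -1 if no valid row exists.
    """
    num_anchors = len(similarity[0]) if similarity else 0
    matches = []
    for col in range(num_anchors):
        best_row, best_score = -1, float("-inf")
        for row, is_valid in enumerate(valid_rows):
            # Padded rows are skipped entirely, whatever their score.
            if is_valid and similarity[row][col] > best_score:
                best_row, best_score = row, similarity[row][col]
        matches.append(best_row)
    return matches


# The third row is padding that happens to score highest.
similarity = [[0.9, 0.1], [0.2, 0.8], [0.95, 0.95]]
with_mask = argmax_match(similarity, [True, True, False])
without_mask = argmax_match(similarity, [True, True, True])
print(with_mask)     # [0, 1]
print(without_mask)  # [2, 2]
```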

PiperOrigin-RevId: 213345395

* Add release note for iNaturalist Species trained models.

PiperOrigin-RevId: 213347179

* Fix the bug of uninitialized gt_is_crowd_list variable.

PiperOrigin-RevId: 213364858

* ...text exposed to open source public git repo...

PiperOrigin-RevId: 213554260

99256cf4

08 Aug, 2018 1 commit

Update object detection post processing and fixes boxes padding/clipping issue. (#5026) · 59f7e80a

pkulzc authored Aug 07, 2018

* Merged commit includes the following changes:
207771702 by Zhichao Lu:

Refactoring evaluation utilities so that it is easier to introduce new DetectionEvaluators with eval_metric_ops.

--
207758641 by Zhichao Lu:

Require tensorflow version 1.9+ for running object detection API.

--
207641470 by Zhichao Lu:

Clip `num_groundtruth_boxes` in pad_input_data_to_static_shapes() to `max_num_boxes`. This prevents a scenario where tensors are sliced to an invalid range in model_lib.unstack_batch().

--
207621728 by Zhichao Lu:

This CL adds a FreezableBatchNorm that inherits from the Keras BatchNormalization layer, but supports freezing the `training` parameter at construction time instead of having to do it in the `call` method.

It also adds a method to the `KerasLayerHyperparams` class that will build an appropriate FreezableBatchNorm layer according to the hyperparameter configuration. If batch_norm is disabled, this method returns an Identity layer.

These will be used to simplify the conversion to Keras APIs.

--
207610524 by Zhichao Lu:

Update anchor generators and box predictors for python3 compatibility.

--
207585122 by Zhichao Lu:

Refactoring convolutional box predictor into separate prediction heads.

--
207549305 by Zhichao Lu:

Pass all 1s for batch weights if nothing is specified in GT.

--
207336575 by Zhichao Lu:

Move the new argument 'target_assigner_instance' to the end of the list of arguments to the ssd_meta_arch constructor for backwards compatibility.

--
207327862 by Zhichao Lu:

Enable support for float output in quantized custom op for postprocessing in SSD Mobilenet model.

--
207323154 by Zhichao Lu:

Bug fix: change dict.iteritems() to dict.items()

--
207301109 by Zhichao Lu:

Integrating expected_classification_loss_under_sampling op as an option in the ssd_meta_arch

--
207286221 by Zhichao Lu:

Adding an option to weight regression loss with foreground scores from the ground truth labels.

--
207231739 by Zhichao Lu:

Explicitly mentioning the argument names when calling the batch target assigner.

--
207206356 by Zhichao Lu:

Add include_trainable_variables field to train config to better handle trainable variables.

--
207135930 by Zhichao Lu:

Internal change.

--
206862541 by Zhichao Lu:

Do not unpad the outputs from batch_non_max_suppression before sampling.

Since BalancedPositiveNegativeSampler takes an indicator for valid positions to sample from we can pass the output from NMS directly into Sampler.

PiperOrigin-RevId: 207771702

* Remove unused doc.

59f7e80a

13 Jul, 2018 1 commit

Object detection Internal Changes. (#4757) · 70255908

pkulzc authored Jul 12, 2018

* Merged commit includes the following changes:
204316992 by Zhichao Lu:

Update docs to prepare inputs

--
204309254 by Zhichao Lu:

Update running_pets.md to use new binaries and correct a few things in running_on_cloud.md

--
204306734 by Zhichao Lu:

Move old binaries into legacy folder and add deprecation notice.

--
204267757 by Zhichao Lu:

Fixing a problem in VRD evaluation with missing ground truth annotations for
images that do not contain objects from 62 groundtruth classes.

--
204167430 by Zhichao Lu:

This fixes a flaky losses test failure.

--
203670721 by Zhichao Lu:

Internal change.

--
203569388 by Zhichao Lu:

Internal change.

--

203546580 by Zhichao Lu:

* Expand TPU compatibility g3doc with config snippets
* Change mscoco dataset path in sample configs to the sharded versions

--
203325694 by Zhichao Lu:

Make merge_multiple_label_boxes work for model_main code path.

--
203305655 by Zhichao Lu:

Remove the 1x1 conv layer before pooling in MobileNet-v1-PPN feature extractor.

--
203139608 by Zhichao Lu:

- Support exponential_decay with burnin learning rate schedule.
- Add the minimum learning rate option.
- Make the exponential decay start only after the burnin steps.
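
The three points above can be combined into one small schedule. This is a sketch under stated assumptions, not the library function: the name `burnin_then_decay`, its parameter names, the constant burn-in rate, and the staircase (integer-step) decay are all choices made for the example.

```python
def burnin_then_decay(step, base_lr, decay_steps, decay_factor,
                      burnin_lr, burnin_steps, min_lr=0.0):
    """Constant burn-in rate, then exponential decay floored at min_lr."""
    if step < burnin_steps:
        return burnin_lr
    # Decay is counted from the end of burn-in, not from step 0.
    num_decays = (step - burnin_steps) // decay_steps
    return max(base_lr * decay_factor ** num_decays, min_lr)


args = dict(base_lr=0.1, decay_steps=10, decay_factor=0.5,
            burnin_lr=0.01, burnin_steps=5, min_lr=0.001)
for step in (0, 5, 15, 1000):
    print(step, burnin_then_decay(step, **args))
```

At step 5 the schedule returns the undecayed base rate, because no decay interval has elapsed since burn-in ended; at very large steps the result is clamped to `min_lr`.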

--
203068703 by Zhichao Lu:

Modify create_coco_tf_record.py to output sharded files.

--
203025308 by Zhichao Lu:

Add an option to share the prediction tower in WeightSharedBoxPredictor.

--
203024942 by Zhichao Lu:

Move ssd mobilenet v1 ppn configs to third party.

--
202901259 by Zhichao Lu:

Delete obsolete ssd mobilenet v1 focal loss configs and update pets dataset path

--
202894154 by Zhichao Lu:

Move all TPU compatible ssd mobilenet v1 coco14/pet configs to third party.

--
202861774 by Zhichao Lu:

Move Retinanet (SSD + FPN + Shared box predictor) configs to third_party.

PiperOrigin-RevId: 204316992

* Add original files back.

70255908