Commit 7a9934df authored by Zhichao Lu, committed by lzc5123016

Merged commit includes the following changes:

184048729  by Zhichao Lu:

    Modify target_assigner so that it creates regression targets taking keypoints into account.

--
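
    The keypoint-aware regression targets can be illustrated with a minimal
    sketch (the function and encoding below are assumptions for illustration,
    not the actual target_assigner code): keypoints are encoded in the same
    anchor-relative frame as the box offsets and appended to the target.

```python
import numpy as np

def encode_box_and_keypoints(anchor, box, keypoints):
    """Encode a ground-truth box and its keypoints relative to an anchor.

    Inputs use [ymin, xmin, ymax, xmax] boxes and (y, x) keypoints in
    normalized coordinates. Keypoints are expressed in the same
    anchor-relative frame as the box offsets and appended to the target.
    """
    ya, xa = (anchor[0] + anchor[2]) / 2.0, (anchor[1] + anchor[3]) / 2.0
    ha, wa = anchor[2] - anchor[0], anchor[3] - anchor[1]
    yg, xg = (box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0
    hg, wg = box[2] - box[0], box[3] - box[1]
    # Standard Faster R-CNN style box offsets.
    target = [(yg - ya) / ha, (xg - xa) / wa,
              np.log(hg / ha), np.log(wg / wa)]
    # Each keypoint becomes a pair of anchor-relative offsets.
    for ky, kx in keypoints:
        target += [(ky - ya) / ha, (kx - xa) / wa]
    return np.array(target)

# A perfectly matched anchor yields an all-zero regression target.
t = encode_box_and_keypoints(
    anchor=[0.0, 0.0, 0.5, 0.5],
    box=[0.0, 0.0, 0.5, 0.5],
    keypoints=[(0.25, 0.25)])
```
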
184027183  by Zhichao Lu:

    Resnet V1 FPN based feature extractors for SSD meta architecture in Object Detection V2 API.

--
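
    The top-down pathway behind such FPN feature extractors can be sketched
    as follows (a simplified assumption: inputs already have `depth` channels,
    so the lateral 1x1 projections and smoothing convolutions are omitted).

```python
import numpy as np

def build_fpn(features):
    """Top-down FPN sketch over a list of [H, W, C] maps, finest first.

    Each coarser level is upsampled 2x by nearest neighbor and added to
    the next finer level, propagating semantics down the pyramid.
    """
    outputs = [features[-1]]  # start from the coarsest map
    for fmap in reversed(features[:-1]):
        top_down = outputs[-1].repeat(2, axis=0).repeat(2, axis=1)
        outputs.append(fmap + top_down)
    return list(reversed(outputs))  # finest first, matching the inputs

pyramid = build_fpn(
    [np.zeros((16, 16, 8)), np.zeros((8, 8, 8)), np.zeros((4, 4, 8))])
```
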
184004730  by Zhichao Lu:

    Expose a lever to override the configured mask_type.

--
183933113  by Zhichao Lu:

    Weight shared convolutional box predictor as described in https://arxiv.org/abs/1708.02002

--
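
    The weight-sharing idea from the paper above can be sketched as a single
    weight set reused across all pyramid levels (a simplified stand-in, not
    the actual box predictor code: the conv tower is reduced to one
    1x1-conv weight matrix).

```python
import numpy as np

rng = np.random.default_rng(0)
num_anchors, box_code_size, channels = 6, 4, 8

# One weight set, shared across all pyramid levels.
w_box = rng.normal(size=(channels, num_anchors * box_code_size))

def predict_boxes(feature_map):
    """Apply the shared box head to one [H, W, C] feature map."""
    h, w, _ = feature_map.shape
    out = feature_map.reshape(-1, channels) @ w_box  # acts as a 1x1 conv
    return out.reshape(h, w, num_anchors, box_code_size)

# The same w_box is reused at every level, so predictions at different
# scales are produced by identical parameters.
levels = [rng.normal(size=(s, s, channels)) for s in (32, 16, 8)]
preds = [predict_boxes(f) for f in levels]
```
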
183929669  by Zhichao Lu:

    Expanding box list operations for future data augmentations.

--
183916792  by Zhichao Lu:

    Fix unrecognized assertion function in tests.

--
183906851  by Zhichao Lu:

    - Change ssd meta architecture to use regression weights to compute loss normalizer.

--
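
    Using the regression weights to normalize the loss can be illustrated as
    follows (a hypothetical helper, not the meta-architecture's actual code).

```python
import numpy as np

def normalized_localization_loss(per_anchor_loss, regression_weights):
    """Sum the weighted per-anchor loss and divide by the weight sum.

    The normalizer is clipped at 1 so an image with no matched anchors
    does not divide by zero.
    """
    normalizer = max(float(np.sum(regression_weights)), 1.0)
    return float(np.sum(per_anchor_loss * regression_weights)) / normalizer
```
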
183871003  by Zhichao Lu:

    Fix config_util_test wrong dependency.

--
183782120  by Zhichao Lu:

    Add __init__ file to third_party directories.

--
183779109  by Zhichao Lu:

    Setup regular version sync.

--
183768772  by Zhichao Lu:

    Make test compatible with numpy 1.12 and higher

--
183767893  by Zhichao Lu:

    Make test compatible with numpy 1.12 and higher

--
183719318  by Zhichao Lu:

    Use the new test interface in ssd feature extractor.

--
183714671  by Zhichao Lu:

    Use the new test_case interface for all anchor generators.

--
183708155  by Zhichao Lu:

    Change variable scopes in ConvolutionalBoxPredictor such that previously trained checkpoints are still compatible after the change in BoxPredictor interface

--
183705798  by Zhichao Lu:

    Internal change.

--
183636023  by Zhichao Lu:

    Fixing argument name for np_box_list_ops.concatenate() function.

--
183490404  by Zhichao Lu:

    Make sure code that relies on the older SSD code still works.

--
183426762  by Zhichao Lu:

    Internal change

183412315  by Zhichao Lu:

    Internal change

183337814  by Zhichao Lu:

    Internal change

183303933  by Zhichao Lu:

    Internal change

183257349  by Zhichao Lu:

    Internal change

183254447  by Zhichao Lu:

    Internal change

183251200  by Zhichao Lu:

    Internal change

183135002  by Zhichao Lu:

    Internal change

182851500  by Zhichao Lu:

    Internal change

182839607  by Zhichao Lu:

    Internal change

182830719  by Zhichao Lu:

    Internal change

182533923  by Zhichao Lu:

    Internal change

182391090  by Zhichao Lu:

    Internal change

182262339  by Zhichao Lu:

    Internal change

182244645  by Zhichao Lu:

    Internal change

182241613  by Zhichao Lu:

    Internal change

182133027  by Zhichao Lu:

    Internal change

182058807  by Zhichao Lu:

    Internal change

181812028  by Zhichao Lu:

    Internal change

181788857  by Zhichao Lu:

    Internal change

181656761  by Zhichao Lu:

    Internal change

181541125  by Zhichao Lu:

    Internal change

181538702  by Zhichao Lu:

    Internal change

181125385  by Zhichao Lu:

    Internal change

180957758  by Zhichao Lu:

    Internal change

180941434  by Zhichao Lu:

    Internal change

180852569  by Zhichao Lu:

    Internal change

180846001  by Zhichao Lu:

    Internal change

180832145  by Zhichao Lu:

    Internal change

180740495  by Zhichao Lu:

    Internal change

180729150  by Zhichao Lu:

    Internal change

180589008  by Zhichao Lu:

    Internal change

180585408  by Zhichao Lu:

    Internal change

180581039  by Zhichao Lu:

    Internal change

180286388  by Zhichao Lu:

    Internal change

179934081  by Zhichao Lu:

    Internal change

179841242  by Zhichao Lu:

    Internal change

179831694  by Zhichao Lu:

    Internal change

179761005  by Zhichao Lu:

    Internal change

179610632  by Zhichao Lu:

    Internal change

179605363  by Zhichao Lu:

    Internal change

179603774  by Zhichao Lu:

    Internal change

179598614  by Zhichao Lu:

    Internal change

179597809  by Zhichao Lu:

    Internal change

179494630  by Zhichao Lu:

    Internal change

179367492  by Zhichao Lu:

    Internal change

179250050  by Zhichao Lu:

    Internal change

179247385  by Zhichao Lu:

    Internal change

179207897  by Zhichao Lu:

    Internal change

179076230  by Zhichao Lu:

    Internal change

178862066  by Zhichao Lu:

    Internal change

178854216  by Zhichao Lu:

    Internal change

178853109  by Zhichao Lu:

    Internal change

178709753  by Zhichao Lu:

    Internal change

178640707  by Zhichao Lu:

    Internal change

178421534  by Zhichao Lu:

    Internal change

178287174  by Zhichao Lu:

    Internal change

178257399  by Zhichao Lu:

    Internal change

177681867  by Zhichao Lu:

    Internal change

177654820  by Zhichao Lu:

    Internal change

177654052  by Zhichao Lu:

    Internal change

177638787  by Zhichao Lu:

    Internal change

177598305  by Zhichao Lu:

    Internal change

177538488  by Zhichao Lu:

    Internal change

177474197  by Zhichao Lu:

    Internal change

177271928  by Zhichao Lu:

    Internal change

177250285  by Zhichao Lu:

    Internal change

177210762  by Zhichao Lu:

    Internal change

177197135  by Zhichao Lu:

    Internal change

177037781  by Zhichao Lu:

    Internal change

176917394  by Zhichao Lu:

    Internal change

176683171  by Zhichao Lu:

    Internal change

176450793  by Zhichao Lu:

    Internal change

176388133  by Zhichao Lu:

    Internal change

176197721  by Zhichao Lu:

    Internal change

176195315  by Zhichao Lu:

    Internal change

176128748  by Zhichao Lu:

    Internal change

175743440  by Zhichao Lu:

    Use Toggle instead of bool to make the layout optimizer name and usage consistent with other optimizers.

--
175578178  by Zhichao Lu:

    Internal change

175463518  by Zhichao Lu:

    Internal change

175316616  by Zhichao Lu:

    Internal change

175302470  by Zhichao Lu:

    Internal change

175300323  by Zhichao Lu:

    Internal change

175269680  by Zhichao Lu:

    Internal change

175260574  by Zhichao Lu:

    Internal change

175122281  by Zhichao Lu:

    Internal change

175111708  by Zhichao Lu:

    Internal change

175110183  by Zhichao Lu:

    Internal change

174877166  by Zhichao Lu:

    Internal change

174868399  by Zhichao Lu:

    Internal change

174754200  by Zhichao Lu:

    Internal change

174544534  by Zhichao Lu:

    Internal change

174536143  by Zhichao Lu:

    Internal change

174513795  by Zhichao Lu:

    Internal change

174463713  by Zhichao Lu:

    Internal change

174403525  by Zhichao Lu:

    Internal change

174385170  by Zhichao Lu:

    Internal change

174358498  by Zhichao Lu:

    Internal change

174249903  by Zhichao Lu:

    Fix nasnet image classification and object detection by moving the option to turn batch norm training on or off into its own arg_scope, used only by detection

--
174216508  by Zhichao Lu:

    Internal change

174065370  by Zhichao Lu:

    Internal change

174048035  by Zhichao Lu:

    Fix the pointer for downloading the NAS Faster-RCNN model.

--
174042677  by Zhichao Lu:

    Internal change

173964116  by Zhichao Lu:

    Internal change

173790182  by Zhichao Lu:

    Internal change

173779919  by Zhichao Lu:

    Internal change

173753775  by Zhichao Lu:

    Internal change

173753160  by Zhichao Lu:

    Internal change

173737519  by Zhichao Lu:

    Internal change

173696066  by Zhichao Lu:

    Internal change

173611554  by Zhichao Lu:

    Internal change

173475124  by Zhichao Lu:

    Internal change

173412497  by Zhichao Lu:

    Internal change

173404010  by Zhichao Lu:

    Internal change

173375014  by Zhichao Lu:

    Internal change

173345107  by Zhichao Lu:

    Internal change

173298413  by Zhichao Lu:

    Internal change

173289754  by Zhichao Lu:

    Internal change

173275544  by Zhichao Lu:

    Internal change

173273275  by Zhichao Lu:

    Internal change

173271885  by Zhichao Lu:

    Internal change

173264856  by Zhichao Lu:

    Internal change

173263791  by Zhichao Lu:

    Internal change

173261215  by Zhichao Lu:

    Internal change

173175740  by Zhichao Lu:

    Internal change

173010193  by Zhichao Lu:

    Internal change

172815204  by Zhichao Lu:

    Allow for label maps in tf.Example decoding.

--
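
    Decoding class text into integer ids via a label map can be sketched like
    this (the dict-based lookup is a stand-in for the decoder's lookup-table
    mechanism; the names and label map are hypothetical).

```python
# Hypothetical label map: the Object Detection API associates integer
# class ids with class text names.
label_map = {'cat': 1, 'dog': 2}

def decode_class_texts(class_texts, label_map):
    """Map class text features from a tf.Example to integer ids.

    Unknown labels map to -1, mirroring a lookup table's default value.
    """
    return [label_map.get(text, -1) for text in class_texts]
```
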
172696028  by Zhichao Lu:

    Internal change

172509113  by Zhichao Lu:

    Allow for label maps in tf.Example decoding.

--
172475999  by Zhichao Lu:

    Internal change

172166621  by Zhichao Lu:

    Internal change

172151758  by Zhichao Lu:

    Minor updates to some README files.

    As a result of these friendly issues:
    https://github.com/tensorflow/models/issues/2530
    https://github.com/tensorflow/models/issues/2534

--
172147420  by Zhichao Lu:

    Fix an illegal summary name and migrate the deprecated slim get_or_create_global_step usage from tf.contrib.framework.* to tf.train.*.

--
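
    The "illegal summary name" half of this fix can be illustrated with a
    minimal sanitizer (an assumed implementation, not the upstream code;
    TensorFlow summary tags accept roughly the class [A-Za-z0-9_.-/]).

```python
import re

def clean_summary_name(name):
    """Replace characters TensorFlow rejects in summary tags with '_'.

    Anything outside [A-Za-z0-9_.-/] would otherwise trigger the
    "illegal summary name" warning at graph build time.
    """
    return re.sub(r'[^A-Za-z0-9_.\-/]', '_', name)
```
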
172111377  by Zhichao Lu:

    Internal change

172004247  by Zhichao Lu:

    Internal change

171996881  by Zhichao Lu:

    Internal change

171835204  by Zhichao Lu:

    Internal change

171826090  by Zhichao Lu:

    Internal change

171784016  by Zhichao Lu:

    Internal change

171699876  by Zhichao Lu:

    Internal change

171053425  by Zhichao Lu:

    Internal change

170905734  by Zhichao Lu:

    Internal change

170889179  by Zhichao Lu:

    Internal change

170734389  by Zhichao Lu:

    Internal change

170705852  by Zhichao Lu:

    Internal change

170401574  by Zhichao Lu:

    Internal change

170352571  by Zhichao Lu:

    Internal change

170215443  by Zhichao Lu:

    Internal change

170184288  by Zhichao Lu:

    Internal change

169936898  by Zhichao Lu:

    Internal change

169763373  by Zhichao Lu:

    Fix broken GitHub links in tensorflow and tensorflow_models resulting from The Great Models Move (a.k.a. the research subfolder)

--
169744825  by Zhichao Lu:

    Internal change

169638135  by Zhichao Lu:

    Internal change

169561814  by Zhichao Lu:

    Internal change

169444091  by Zhichao Lu:

    Internal change

169292330  by Zhichao Lu:

    Internal change

169145185  by Zhichao Lu:

    Internal change

168906035  by Zhichao Lu:

    Internal change

168790411  by Zhichao Lu:

    Internal change

168708911  by Zhichao Lu:

    Internal change

168611969  by Zhichao Lu:

    Internal change

168535975  by Zhichao Lu:

    Internal change

168381815  by Zhichao Lu:

    Internal change

168244740  by Zhichao Lu:

    Internal change

168240024  by Zhichao Lu:

    Internal change

168168016  by Zhichao Lu:

    Internal change

168071571  by Zhichao Lu:

    Move display strings to below the bounding box if they would otherwise be outside the image.

--
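
    The label-placement rule can be sketched as follows (a hypothetical
    helper mirroring the described behavior, not the visualization
    utility's actual code).

```python
def label_y_position(box_top, text_height, margin=2):
    """Pick the y pixel at which to draw a box's display string.

    Draw above the box by default; if that would land outside the image
    (negative y), fall back to just below the box's top edge.
    """
    above = box_top - text_height - margin
    return above if above >= 0 else box_top + margin
```
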
168067771  by Zhichao Lu:

    Internal change

167970950  by Zhichao Lu:

    Internal change

167884533  by Zhichao Lu:

    Internal change

167626173  by Zhichao Lu:

    Internal change

167277422  by Zhichao Lu:

    Internal change

167249393  by Zhichao Lu:

    Internal change

167248954  by Zhichao Lu:

    Internal change

167189395  by Zhichao Lu:

    Internal change

167107797  by Zhichao Lu:

    Internal change

167061250  by Zhichao Lu:

    Internal change

166871147  by Zhichao Lu:

    Internal change

166867617  by Zhichao Lu:

    Internal change

166862112  by Zhichao Lu:

    Internal change

166715648  by Zhichao Lu:

    Internal change

166635615  by Zhichao Lu:

    Internal change

166383182  by Zhichao Lu:

    Internal change

166371326  by Zhichao Lu:

    Internal change

166254711  by Zhichao Lu:

    Internal change

166106294  by Zhichao Lu:

    Internal change

166081204  by Zhichao Lu:

    Internal change

165972262  by Zhichao Lu:

    Internal change

165816702  by Zhichao Lu:

    Internal change

165764471  by Zhichao Lu:

    Internal change

165724134  by Zhichao Lu:

    Internal change

165655829  by Zhichao Lu:

    Internal change

165587904  by Zhichao Lu:

    Internal change

165534540  by Zhichao Lu:

    Internal change

165177692  by Zhichao Lu:

    Internal change

165091822  by Zhichao Lu:

    Internal change

165019730  by Zhichao Lu:

    Internal change

165002942  by Zhichao Lu:

    Internal change

164897728  by Zhichao Lu:

    Internal change

164782618  by Zhichao Lu:

    Internal change

164710379  by Zhichao Lu:

    Internal change

164639237  by Zhichao Lu:

    Internal change

164069251  by Zhichao Lu:

    Internal change

164058169  by Zhichao Lu:

    Internal change

163913796  by Zhichao Lu:

    Internal change

163756696  by Zhichao Lu:

    Internal change

163524665  by Zhichao Lu:

    Internal change

163393399  by Zhichao Lu:

    Internal change

163385733  by Zhichao Lu:

    Internal change

162525065  by Zhichao Lu:

    Internal change

162376984  by Zhichao Lu:

    Internal change

162026661  by Zhichao Lu:

    Internal change

161956004  by Zhichao Lu:

    Internal change

161817520  by Zhichao Lu:

    Internal change

161718688  by Zhichao Lu:

    Internal change

161624398  by Zhichao Lu:

    Internal change

161575120  by Zhichao Lu:

    Internal change

161483997  by Zhichao Lu:

    Internal change

161462189  by Zhichao Lu:

    Internal change

161452968  by Zhichao Lu:

    Internal change

161443992  by Zhichao Lu:

    Internal change

161408607  by Zhichao Lu:

    Internal change

161262084  by Zhichao Lu:

    Internal change

161214023  by Zhichao Lu:

    Internal change

161025667  by Zhichao Lu:

    Internal change

160982216  by Zhichao Lu:

    Internal change

160666760  by Zhichao Lu:

    Internal change

160570489  by Zhichao Lu:

    Internal change

160553112  by Zhichao Lu:

    Internal change

160458261  by Zhichao Lu:

    Internal change

160349302  by Zhichao Lu:

    Internal change

160296092  by Zhichao Lu:

    Internal change

160287348  by Zhichao Lu:

    Internal change

160199279  by Zhichao Lu:

    Internal change

160160156  by Zhichao Lu:

    Internal change

160151954  by Zhichao Lu:

    Internal change

160005404  by Zhichao Lu:

    Internal change

159983265  by Zhichao Lu:

    Internal change

159819896  by Zhichao Lu:

    Internal change

159749419  by Zhichao Lu:

    Internal change

159596448  by Zhichao Lu:

    Internal change

159587801  by Zhichao Lu:

    Internal change

159587342  by Zhichao Lu:

    Internal change

159476256  by Zhichao Lu:

    Internal change

159463992  by Zhichao Lu:

    Internal change

159455585  by Zhichao Lu:

    Internal change

159270798  by Zhichao Lu:

    Internal change

159256633  by Zhichao Lu:

    Internal change

159141989  by Zhichao Lu:

    Internal change

159079098  by Zhichao Lu:

    Internal change

159078559  by Zhichao Lu:

    Internal change

159077055  by Zhichao Lu:

    Internal change

159072046  by Zhichao Lu:

    Internal change

159071092  by Zhichao Lu:

    Internal change

159069262  by Zhichao Lu:

    Internal change

159037430  by Zhichao Lu:

    Internal change

159035747  by Zhichao Lu:

    Internal change

159023868  by Zhichao Lu:

    Internal change

158939092  by Zhichao Lu:

    Internal change

158912561  by Zhichao Lu:

    Internal change

158903825  by Zhichao Lu:

    Internal change

158894348  by Zhichao Lu:

    Internal change

158884934  by Zhichao Lu:

    Internal change

158878010  by Zhichao Lu:

    Internal change

158874620  by Zhichao Lu:

    Internal change

158869501  by Zhichao Lu:

    Internal change

158842623  by Zhichao Lu:

    Internal change

158801298  by Zhichao Lu:

    Internal change

158775487  by Zhichao Lu:

    Internal change

158773668  by Zhichao Lu:

    Internal change

158771394  by Zhichao Lu:

    Internal change

158668928  by Zhichao Lu:

    Internal change

158596865  by Zhichao Lu:

    Internal change

158587317  by Zhichao Lu:

    Internal change

158586348  by Zhichao Lu:

    Internal change

158585707  by Zhichao Lu:

    Internal change

158577134  by Zhichao Lu:

    Internal change

158459749  by Zhichao Lu:

    Internal change

158459678  by Zhichao Lu:

    Internal change

158328972  by Zhichao Lu:

    Internal change

158324255  by Zhichao Lu:

    Internal change

158319576  by Zhichao Lu:

    Internal change

158290802  by Zhichao Lu:

    Internal change

158273041  by Zhichao Lu:

    Internal change

158240477  by Zhichao Lu:

    Internal change

158204316  by Zhichao Lu:

    Internal change

158154161  by Zhichao Lu:

    Internal change

158077203  by Zhichao Lu:

    Internal change

158041397  by Zhichao Lu:

    Internal change

158029233  by Zhichao Lu:

    Internal change

157976306  by Zhichao Lu:

    Internal change

157966896  by Zhichao Lu:

    Internal change

157945642  by Zhichao Lu:

    Internal change

157943135  by Zhichao Lu:

    Internal change

157942158  by Zhichao Lu:

    Internal change

157897866  by Zhichao Lu:

    Internal change

157866667  by Zhichao Lu:

    Internal change

157845915  by Zhichao Lu:

    Internal change

157842592  by Zhichao Lu:

    Internal change

157832761  by Zhichao Lu:

    Internal change

157824451  by Zhichao Lu:

    Internal change

157816531  by Zhichao Lu:

    Internal change

157782130  by Zhichao Lu:

    Internal change

157733752  by Zhichao Lu:

    Internal change

157654577  by Zhichao Lu:

    Internal change

157639285  by Zhichao Lu:

    Internal change

157530694  by Zhichao Lu:

    Internal change

157518469  by Zhichao Lu:

    Internal change

157514626  by Zhichao Lu:

    Internal change

157481413  by Zhichao Lu:

    Internal change

157267863  by Zhichao Lu:

    Internal change

157263616  by Zhichao Lu:

    Internal change

157234554  by Zhichao Lu:

    Internal change

157174595  by Zhichao Lu:

    Internal change

157169681  by Zhichao Lu:

    Internal change

157156425  by Zhichao Lu:

    Internal change

157024436  by Zhichao Lu:

    Internal change

157016195  by Zhichao Lu:

    Internal change

156941658  by Zhichao Lu:

    Internal change

156880859  by Zhichao Lu:

    Internal change

156790636  by Zhichao Lu:

    Internal change

156565969  by Zhichao Lu:

    Internal change

156522345  by Zhichao Lu:

    Internal change

156518570  by Zhichao Lu:

    Internal change

156509878  by Zhichao Lu:

    Internal change

156509134  by Zhichao Lu:

    Internal change

156472497  by Zhichao Lu:

    Internal change

156471429  by Zhichao Lu:

    Internal change

156470865  by Zhichao Lu:

    Internal change

156461563  by Zhichao Lu:

    Internal change

156437521  by Zhichao Lu:

    Internal change

156334994  by Zhichao Lu:

    Internal change

156319604  by Zhichao Lu:

    Internal change

156234305  by Zhichao Lu:

    Internal change

156226207  by Zhichao Lu:

    Internal change

156215347  by Zhichao Lu:

    Internal change

156127227  by Zhichao Lu:

    Internal change

156120405  by Zhichao Lu:

    Internal change

156113752  by Zhichao Lu:

    Internal change

156098936  by Zhichao Lu:

    Internal change

155924066  by Zhichao Lu:

    Internal change

155883241  by Zhichao Lu:

    Internal change

155806887  by Zhichao Lu:

    Internal change

155641849  by Zhichao Lu:

    Internal change

155593034  by Zhichao Lu:

    Internal change

155570702  by Zhichao Lu:

    Internal change

155515306  by Zhichao Lu:

    Internal change

155514787  by Zhichao Lu:

    Internal change

155445237  by Zhichao Lu:

    Internal change

155438672  by Zhichao Lu:

    Internal change

155264448  by Zhichao Lu:

    Internal change

155222148  by Zhichao Lu:

    Internal change

155106590  by Zhichao Lu:

    Internal change

155090562  by Zhichao Lu:

    Internal change

154973775  by Zhichao Lu:

    Internal change

154972880  by Zhichao Lu:

    Internal change

154871596  by Zhichao Lu:

    Internal change

154835007  by Zhichao Lu:

    Internal change

154788175  by Zhichao Lu:

    Internal change

154731169  by Zhichao Lu:

    Internal change

154721261  by Zhichao Lu:

    Internal change

154594626  by Zhichao Lu:

    Internal change

154588305  by Zhichao Lu:

    Internal change

154578994  by Zhichao Lu:

    Internal change

154571515  by Zhichao Lu:

    Internal change

154552873  by Zhichao Lu:

    Internal change

154549672  by Zhichao Lu:

    Internal change

154463631  by Zhichao Lu:

    Internal change

154437690  by Zhichao Lu:

    Internal change

154412359  by Zhichao Lu:

    Internal change

154374026  by Zhichao Lu:

    Internal change

154361648  by Zhichao Lu:

    Internal change

154310164  by Zhichao Lu:

    Internal change

154220862  by Zhichao Lu:

    Internal change

154187281  by Zhichao Lu:

    Internal change

154186651  by Zhichao Lu:

    Internal change

154119783  by Zhichao Lu:

    Internal change

154114285  by Zhichao Lu:

    Internal change

154095717  by Zhichao Lu:

    Internal change

154057972  by Zhichao Lu:

    Internal change

154055285  by Zhichao Lu:

    Internal change

153659288  by Zhichao Lu:

    Internal change

153637797  by Zhichao Lu:

    Internal change

153561771  by Zhichao Lu:

    Internal change

153540765  by Zhichao Lu:

    Internal change

153496128  by Zhichao Lu:

    Internal change

153473323  by Zhichao Lu:

    Internal change

153368812  by Zhichao Lu:

    Internal change

153367292  by Zhichao Lu:

    Internal change

153201890  by Zhichao Lu:

    Internal change

153074177  by Zhichao Lu:

    Internal change

152980017  by Zhichao Lu:

    Internal change

152978434  by Zhichao Lu:

    Internal change

152951821  by Zhichao Lu:

    Internal change

152904076  by Zhichao Lu:

    Internal change

152883703  by Zhichao Lu:

    Internal change

152869747  by Zhichao Lu:

    Internal change

152827463  by Zhichao Lu:

    Internal change

152756886  by Zhichao Lu:

    Internal change

152752840  by Zhichao Lu:

    Internal change

152736347  by Zhichao Lu:

    Internal change

152728184  by Zhichao Lu:

    Internal change

152720120  by Zhichao Lu:

    Internal change

152710964  by Zhichao Lu:

    Internal change

152706735  by Zhichao Lu:

    Internal change

152681133  by Zhichao Lu:

    Internal change

152517758  by Zhichao Lu:

    Internal change

152516381  by Zhichao Lu:

    Internal change

152511258  by Zhichao Lu:

    Internal change

152319164  by Zhichao Lu:

    Internal change

152316404  by Zhichao Lu:

    Internal change

152309261  by Zhichao Lu:

    Internal change

152308007  by Zhichao Lu:

    Internal change

152296551  by Zhichao Lu:

    Internal change

152188069  by Zhichao Lu:

    Internal change

152158644  by Zhichao Lu:

    Internal change

152153578  by Zhichao Lu:

    Internal change

152152285  by Zhichao Lu:

    Internal change

152055035  by Zhichao Lu:

    Internal change

152036778  by Zhichao Lu:

    Internal change

152020728  by Zhichao Lu:

    Internal change

152014842  by Zhichao Lu:

    Internal change

151848225  by Zhichao Lu:

    Internal change

151741308  by Zhichao Lu:

    Internal change

151740499  by Zhichao Lu:

    Internal change

151736189  by Zhichao Lu:

    Internal change

151612892  by Zhichao Lu:

    Internal change

151599502  by Zhichao Lu:

    Internal change

151538547  by Zhichao Lu:

    Internal change

151496530  by Zhichao Lu:

    Internal change

151476070  by Zhichao Lu:

    Internal change

151448662  by Zhichao Lu:

    Internal change

151411627  by Zhichao Lu:

    Internal change

151397737  by Zhichao Lu:

    Internal change

151169523  by Zhichao Lu:

    Internal change

151148956  by Zhichao Lu:

    Internal change

150944227  by Zhichao Lu:

    Internal change

150276683  by Zhichao Lu:

    Internal change

149986687  by Zhichao Lu:

    Internal change

149218749  by Zhichao Lu:

    Internal change

PiperOrigin-RevId: 184048729
parent 7ef602be
@@ -80,7 +80,6 @@ class LocalizationLossBuilderTest(tf.test.TestCase):
     losses_text_proto = """
       localization_loss {
         weighted_smooth_l1 {
-          anchorwise_output: true
         }
       }
       classification_loss {
@@ -245,7 +244,7 @@ class ClassificationLossBuilderTest(tf.test.TestCase):
     targets = tf.constant([[[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]])
     weights = tf.constant([[1.0, 1.0]])
     loss = classification_loss(predictions, targets, weights=weights)
-    self.assertEqual(loss.shape, [1, 2])
+    self.assertEqual(loss.shape, [1, 2, 3])

   def test_raise_error_on_empty_config(self):
     losses_text_proto = """
...
@@ -106,6 +106,7 @@ def _build_ssd_feature_extractor(feature_extractor_config, is_training,
   min_depth = feature_extractor_config.min_depth
   pad_to_multiple = feature_extractor_config.pad_to_multiple
   batch_norm_trainable = feature_extractor_config.batch_norm_trainable
+  use_explicit_padding = feature_extractor_config.use_explicit_padding
   conv_hyperparams = hyperparams_builder.build(
       feature_extractor_config.conv_hyperparams, is_training)
@@ -115,7 +116,8 @@ def _build_ssd_feature_extractor(feature_extractor_config, is_training,
   feature_extractor_class = SSD_FEATURE_EXTRACTOR_CLASS_MAP[feature_type]
   return feature_extractor_class(is_training, depth_multiplier, min_depth,
                                  pad_to_multiple, conv_hyperparams,
-                                 batch_norm_trainable, reuse_weights)
+                                 batch_norm_trainable, reuse_weights,
+                                 use_explicit_padding)

 def _build_ssd_model(ssd_config, is_training):
@@ -228,7 +230,7 @@ def _build_faster_rcnn_model(frcnn_config, is_training):
   feature_extractor = _build_faster_rcnn_feature_extractor(
       frcnn_config.feature_extractor, is_training)
-  first_stage_only = frcnn_config.first_stage_only
+  number_of_stages = frcnn_config.number_of_stages
   first_stage_anchor_generator = anchor_generator_builder.build(
       frcnn_config.first_stage_anchor_generator)
@@ -283,7 +285,7 @@ def _build_faster_rcnn_model(frcnn_config, is_training):
       'num_classes': num_classes,
       'image_resizer_fn': image_resizer_fn,
       'feature_extractor': feature_extractor,
-      'first_stage_only': first_stage_only,
+      'number_of_stages': number_of_stages,
       'first_stage_anchor_generator': first_stage_anchor_generator,
       'first_stage_atrous_rate': first_stage_atrous_rate,
       'first_stage_box_predictor_arg_scope':
...
@@ -196,7 +196,7 @@ class OptimizerBuilderTest(tf.test.TestCase):
     optimizer = optimizer_builder.build(optimizer_proto, global_summaries)
     self.assertTrue(
         isinstance(optimizer, tf.contrib.opt.MovingAverageOptimizer))
-    # TODO(rathodv): Find a way to not depend on the private members.
+    # TODO: Find a way to not depend on the private members.
     self.assertAlmostEqual(optimizer._ema._decay, 0.2)

   def testBuildEmptyOptimizer(self):
...
...@@ -53,7 +53,7 @@ py_library( ...@@ -53,7 +53,7 @@ py_library(
deps = [ deps = [
":box_list", ":box_list",
"//tensorflow", "//tensorflow",
"//tensorflow_models/object_detection/utils:shape_utils", "//tensorflow/models/research/object_detection/utils:shape_utils",
], ],
) )
...@@ -113,7 +113,7 @@ py_library( ...@@ -113,7 +113,7 @@ py_library(
":box_list", ":box_list",
":box_list_ops", ":box_list_ops",
"//tensorflow", "//tensorflow",
"//tensorflow_models/object_detection/utils:ops", "//tensorflow/models/research/object_detection/utils:ops",
], ],
) )
...@@ -162,6 +162,7 @@ py_library( ...@@ -162,6 +162,7 @@ py_library(
":keypoint_ops", ":keypoint_ops",
":standard_fields", ":standard_fields",
"//tensorflow", "//tensorflow",
"//tensorflow/models/research/object_detection/utils:shape_utils",
], ],
) )
...@@ -211,6 +212,7 @@ py_library( ...@@ -211,6 +212,7 @@ py_library(
":box_list_ops", ":box_list_ops",
":standard_fields", ":standard_fields",
"//tensorflow", "//tensorflow",
"//tensorflow/models/research/object_detection/utils:shape_utils",
], ],
) )
...@@ -232,15 +234,16 @@ py_library( ...@@ -232,15 +234,16 @@ py_library(
], ],
deps = [ deps = [
":box_list", ":box_list",
":box_list_ops",
":matcher", ":matcher",
":region_similarity_calculator", ":region_similarity_calculator",
":standard_fields",
"//tensorflow", "//tensorflow",
"//tensorflow_models/object_detection/box_coders:faster_rcnn_box_coder", "//tensorflow/models/research/object_detection/box_coders:faster_rcnn_box_coder",
"//tensorflow_models/object_detection/box_coders:mean_stddev_box_coder", "//tensorflow/models/research/object_detection/box_coders:mean_stddev_box_coder",
"//tensorflow_models/object_detection/core:box_coder", "//tensorflow/models/research/object_detection/core:box_coder",
"//tensorflow_models/object_detection/matchers:argmax_matcher", "//tensorflow/models/research/object_detection/matchers:argmax_matcher",
"//tensorflow_models/object_detection/matchers:bipartite_matcher", "//tensorflow/models/research/object_detection/matchers:bipartite_matcher",
"//tensorflow/models/research/object_detection/utils:shape_utils",
], ],
) )
...@@ -254,8 +257,10 @@ py_test( ...@@ -254,8 +257,10 @@ py_test(
":region_similarity_calculator", ":region_similarity_calculator",
":target_assigner", ":target_assigner",
"//tensorflow", "//tensorflow",
"//tensorflow_models/object_detection/box_coders:mean_stddev_box_coder", "//tensorflow/models/research/object_detection/box_coders:keypoint_box_coder",
"//tensorflow_models/object_detection/matchers:bipartite_matcher", "//tensorflow/models/research/object_detection/box_coders:mean_stddev_box_coder",
"//tensorflow/models/research/object_detection/matchers:bipartite_matcher",
"//tensorflow/models/research/object_detection/utils:test_case",
], ],
) )
@@ -274,9 +279,9 @@ py_library(
    srcs = ["box_predictor.py"],
    deps = [
        "//tensorflow",
-        "//tensorflow_models/object_detection/utils:ops",
-        "//tensorflow_models/object_detection/utils:shape_utils",
-        "//tensorflow_models/object_detection/utils:static_shape",
+        "//tensorflow/models/research/object_detection/utils:ops",
+        "//tensorflow/models/research/object_detection/utils:shape_utils",
+        "//tensorflow/models/research/object_detection/utils:static_shape",
    ],
)
...@@ -286,8 +291,9 @@ py_test( ...@@ -286,8 +291,9 @@ py_test(
deps = [ deps = [
":box_predictor", ":box_predictor",
"//tensorflow", "//tensorflow",
"//tensorflow_models/object_detection/builders:hyperparams_builder", "//tensorflow/models/research/object_detection/builders:hyperparams_builder",
"//tensorflow_models/object_detection/protos:hyperparams_py_pb2", "//tensorflow/models/research/object_detection/protos:hyperparams_py_pb2",
"//tensorflow/models/research/object_detection/utils:test_case",
], ],
) )
@@ -298,7 +304,7 @@ py_library(
    ],
    deps = [
        "//tensorflow",
-        "//tensorflow_models/object_detection/core:box_list_ops",
+        "//tensorflow/models/research/object_detection/core:box_list_ops",
    ],
)
@@ -309,7 +315,7 @@ py_test(
    ],
    deps = [
        ":region_similarity_calculator",
-        "//tensorflow_models/object_detection/core:box_list",
+        "//tensorflow/models/research/object_detection/core:box_list",
    ],
)
@@ -330,7 +336,7 @@ py_library(
    ],
    deps = [
        "//tensorflow",
-        "//tensorflow_models/object_detection/utils:ops",
+        "//tensorflow/models/research/object_detection/utils:ops",
    ],
)
...
@@ -77,8 +77,8 @@ class AnchorGenerator(object):
  def generate(self, feature_map_shape_list, **params):
    """Generates a collection of bounding boxes to be used as anchors.

-    TODO: remove **params from argument list and make stride and offsets (for
-    multiple_grid_anchor_generator) constructor arguments.
+    TODO: remove **params from argument list and make stride and
+    offsets (for multiple_grid_anchor_generator) constructor arguments.

    Args:
      feature_map_shape_list: list of (height, width) pairs in the format
@@ -140,3 +140,4 @@ class AnchorGenerator(object):
                            * feature_map_shape[0]
                            * feature_map_shape[1])
    return tf.assert_equal(expected_num_anchors, anchors.num_boxes())
@@ -183,7 +183,8 @@ def prune_completely_outside_window(boxlist, window, scope=None):
    scope: name scope.

  Returns:
-    pruned_corners: a tensor with shape [M_out, 4] where M_out <= M_in
+    pruned_boxlist: a new BoxList with all bounding boxes partially or fully in
+      the window.
    valid_indices: a tensor with shape [M_out] indexing the valid bounding boxes
      in the input tensor.
  """
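For intuition, the pruning rule this docstring describes (a box is dropped only when it falls completely outside the window) can be sketched in plain NumPy. This is an illustrative sketch; the helper name `prune_completely_outside_window_np` is ours, not part of the library:

```python
import numpy as np

def prune_completely_outside_window_np(boxes, window):
    """Keep boxes that overlap the window at least partially.

    boxes: [N, 4] array of [ymin, xmin, ymax, xmax]; window: length-4 array.
    Returns (pruned_boxes, valid_indices), mirroring the docstring above.
    """
    win_y_min, win_x_min, win_y_max, win_x_max = window
    # A box is completely outside when it lies entirely on one side of the
    # window along either axis.
    completely_outside = (
        (boxes[:, 0] >= win_y_max) | (boxes[:, 1] >= win_x_max) |
        (boxes[:, 2] <= win_y_min) | (boxes[:, 3] <= win_x_min))
    valid_indices = np.where(~completely_outside)[0]
    return boxes[valid_indices], valid_indices
```

Note that an empty [0, 4] input simply yields an empty result, which is the case the new `test_prune_completely_outside_window_with_empty_boxlist` below exercises.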
@@ -656,7 +657,7 @@ def filter_greater_than(boxlist, thresh, scope=None):
  This op keeps the collection of boxes whose corresponding scores are
  greater than the input threshold.

-  TODO: Change function name to filter_scores_greater_than
+  TODO: Change function name to FilterScoresGreaterThan

  Args:
    boxlist: BoxList holding N boxes. Must contain a 'scores' field
@@ -982,3 +983,79 @@ def pad_or_clip_box_list(boxlist, num_boxes, scope=None):
          boxlist.get_field(field), num_boxes)
    subboxlist.add_field(field, subfield)
  return subboxlist
def select_random_box(boxlist,
default_box=None,
seed=None,
scope=None):
"""Selects a random bounding box from a `BoxList`.
Args:
boxlist: A BoxList.
default_box: A [1, 4] float32 tensor. If no boxes are present in `boxlist`,
this default box will be returned. If None, will use a default box of
[[-1., -1., -1., -1.]].
seed: Random seed.
scope: Name scope.
Returns:
bbox: A [1, 4] tensor with a random bounding box.
valid: A bool tensor indicating whether a valid bounding box is returned
(True) or whether the default box is returned (False).
"""
with tf.name_scope(scope, 'SelectRandomBox'):
bboxes = boxlist.get()
combined_shape = shape_utils.combined_static_and_dynamic_shape(bboxes)
number_of_boxes = combined_shape[0]
default_box = default_box or tf.constant([[-1., -1., -1., -1.]])
def select_box():
random_index = tf.random_uniform([],
maxval=number_of_boxes,
dtype=tf.int32,
seed=seed)
return tf.expand_dims(bboxes[random_index], axis=0), tf.constant(True)
return tf.cond(
tf.greater_equal(number_of_boxes, 1),
true_fn=select_box,
false_fn=lambda: (default_box, tf.constant(False)))
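The `tf.cond`-based select-or-fallback logic above can be mirrored in plain NumPy for illustration (the helper name `select_random_box_np` is ours):

```python
import numpy as np

def select_random_box_np(boxes, default_box=None, rng=None):
    """NumPy sketch of select_random_box: pick one box uniformly at random,
    or fall back to default_box when the box list is empty."""
    if rng is None:
        rng = np.random.default_rng()
    if default_box is None:
        default_box = np.array([[-1., -1., -1., -1.]], dtype=np.float32)
    if boxes.shape[0] == 0:
        return default_box, False  # no boxes: return the default, mark invalid
    index = rng.integers(boxes.shape[0])
    return boxes[index:index + 1], True  # slice keeps the [1, 4] shape
```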
def get_minimal_coverage_box(boxlist,
default_box=None,
scope=None):
"""Creates a single bounding box which covers all boxes in the boxlist.
Args:
boxlist: A Boxlist.
default_box: A [1, 4] float32 tensor. If no boxes are present in `boxlist`,
this default box will be returned. If None, will use a default box of
[[0., 0., 1., 1.]].
scope: Name scope.
Returns:
A [1, 4] float32 tensor with a bounding box that tightly covers all the
boxes in the box list. If the boxlist does not contain any boxes, the
default box is returned.
"""
with tf.name_scope(scope, 'CreateCoverageBox'):
num_boxes = boxlist.num_boxes()
def coverage_box(bboxes):
y_min, x_min, y_max, x_max = tf.split(
value=bboxes, num_or_size_splits=4, axis=1)
y_min_coverage = tf.reduce_min(y_min, axis=0)
x_min_coverage = tf.reduce_min(x_min, axis=0)
y_max_coverage = tf.reduce_max(y_max, axis=0)
x_max_coverage = tf.reduce_max(x_max, axis=0)
return tf.stack(
[y_min_coverage, x_min_coverage, y_max_coverage, x_max_coverage],
axis=1)
default_box = default_box or tf.constant([[0., 0., 1., 1.]])
return tf.cond(
tf.greater_equal(num_boxes, 1),
true_fn=lambda: coverage_box(boxlist.get()),
false_fn=lambda: default_box)
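The coverage computation above reduces to a per-coordinate min over the "min" corners and max over the "max" corners. A minimal NumPy sketch (helper name ours):

```python
import numpy as np

def get_minimal_coverage_box_np(boxes, default_box=None):
    """Tightest [ymin, xmin, ymax, xmax] box covering every box in `boxes`."""
    if default_box is None:
        default_box = np.array([[0., 0., 1., 1.]], dtype=np.float32)
    if boxes.shape[0] == 0:
        return default_box
    # min over ymin/xmin columns, max over ymax/xmax columns
    return np.array([[boxes[:, 0].min(), boxes[:, 1].min(),
                      boxes[:, 2].max(), boxes[:, 3].max()]], dtype=np.float32)
```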
@@ -153,6 +153,25 @@ class BoxListOpsTest(tf.test.TestCase):
      extra_data_out = sess.run(pruned.get_field('extra_data'))
    self.assertAllEqual(extra_data_out, [[1], [2], [3], [4], [6]])
def test_prune_completely_outside_window_with_empty_boxlist(self):
window = tf.constant([0, 0, 9, 14], tf.float32)
corners = tf.zeros(shape=[0, 4], dtype=tf.float32)
boxes = box_list.BoxList(corners)
boxes.add_field('extra_data', tf.zeros(shape=[0], dtype=tf.int32))
pruned, keep_indices = box_list_ops.prune_completely_outside_window(boxes,
window)
pruned_boxes = pruned.get()
extra = pruned.get_field('extra_data')
exp_pruned_boxes = np.zeros(shape=[0, 4], dtype=np.float32)
exp_extra = np.zeros(shape=[0], dtype=np.int32)
with self.test_session() as sess:
pruned_boxes_out, keep_indices_out, extra_out = sess.run(
[pruned_boxes, keep_indices, extra])
self.assertAllClose(exp_pruned_boxes, pruned_boxes_out)
self.assertAllEqual([], keep_indices_out)
self.assertAllEqual(exp_extra, extra_out)
  def test_intersection(self):
    corners1 = tf.constant([[4.0, 3.0, 7.0, 5.0], [5.0, 6.0, 10.0, 7.0]])
    corners2 = tf.constant([[3.0, 4.0, 6.0, 8.0], [14.0, 14.0, 15.0, 15.0],
@@ -593,6 +612,58 @@ class BoxListOpsTest(tf.test.TestCase):
      self.assertAllEqual(expected_classes, classes_out)
      self.assertAllClose(expected_scores, scores_out)
def test_select_random_box(self):
boxes = [[0., 0., 1., 1.],
[0., 1., 2., 3.],
[0., 2., 3., 4.]]
corners = tf.constant(boxes, dtype=tf.float32)
boxlist = box_list.BoxList(corners)
random_bbox, valid = box_list_ops.select_random_box(boxlist)
with self.test_session() as sess:
random_bbox_out, valid_out = sess.run([random_bbox, valid])
norm_small = any(
[np.linalg.norm(random_bbox_out - box) < 1e-6 for box in boxes])
self.assertTrue(norm_small)
self.assertTrue(valid_out)
def test_select_random_box_with_empty_boxlist(self):
corners = tf.constant([], shape=[0, 4], dtype=tf.float32)
boxlist = box_list.BoxList(corners)
random_bbox, valid = box_list_ops.select_random_box(boxlist)
with self.test_session() as sess:
random_bbox_out, valid_out = sess.run([random_bbox, valid])
expected_bbox_out = np.array([[-1., -1., -1., -1.]], dtype=np.float32)
self.assertAllEqual(expected_bbox_out, random_bbox_out)
self.assertFalse(valid_out)
def test_get_minimal_coverage_box(self):
boxes = [[0., 0., 1., 1.],
[-1., 1., 2., 3.],
[0., 2., 3., 4.]]
expected_coverage_box = [[-1., 0., 3., 4.]]
corners = tf.constant(boxes, dtype=tf.float32)
boxlist = box_list.BoxList(corners)
coverage_box = box_list_ops.get_minimal_coverage_box(boxlist)
with self.test_session() as sess:
coverage_box_out = sess.run(coverage_box)
self.assertAllClose(expected_coverage_box, coverage_box_out)
def test_get_minimal_coverage_box_with_empty_boxlist(self):
corners = tf.constant([], shape=[0, 4], dtype=tf.float32)
boxlist = box_list.BoxList(corners)
coverage_box = box_list_ops.get_minimal_coverage_box(boxlist)
with self.test_session() as sess:
coverage_box_out = sess.run(coverage_box)
self.assertAllClose([[0.0, 0.0, 1.0, 1.0]], coverage_box_out)
class ConcatenateTest(tf.test.TestCase):
@@ -958,5 +1029,6 @@ class BoxRefinementTest(tf.test.TestCase):
    self.assertAllClose(expected_scores, scores_out)
    self.assertAllEqual(extra_field_out, [0, 1, 1])

if __name__ == '__main__':
  tf.test.main()
...
@@ -14,7 +14,6 @@
# ==============================================================================
"""Tests for object_detection.core.box_predictor."""

import numpy as np
import tensorflow as tf
@@ -22,6 +21,7 @@ from google.protobuf import text_format
from object_detection.builders import hyperparams_builder
from object_detection.core import box_predictor
from object_detection.protos import hyperparams_pb2
+from object_detection.utils import test_case


class MaskRCNNBoxPredictorTest(tf.test.TestCase):
@@ -55,7 +55,8 @@ class MaskRCNNBoxPredictorTest(tf.test.TestCase):
        box_code_size=4,
    )
    box_predictions = mask_box_predictor.predict(
-        image_features, num_predictions_per_location=1, scope='BoxPredictor')
+        [image_features], num_predictions_per_location=[1],
+        scope='BoxPredictor')
    box_encodings = box_predictions[box_predictor.BOX_ENCODINGS]
    class_predictions_with_background = box_predictions[
        box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND]
@@ -93,12 +94,16 @@ class MaskRCNNBoxPredictorTest(tf.test.TestCase):
            op_type=hyperparams_pb2.Hyperparams.CONV),
        predict_instance_masks=True)
    box_predictions = mask_box_predictor.predict(
-        image_features, num_predictions_per_location=1, scope='BoxPredictor')
+        [image_features],
+        num_predictions_per_location=[1],
+        scope='BoxPredictor',
+        predict_boxes_and_classes=True,
+        predict_auxiliary_outputs=True)
    mask_predictions = box_predictions[box_predictor.MASK_PREDICTIONS]
    self.assertListEqual([2, 1, 5, 14, 14],
                         mask_predictions.get_shape().as_list())

-  def test_do_not_return_instance_masks_and_keypoints_without_request(self):
+  def test_do_not_return_instance_masks_without_request(self):
    image_features = tf.random_uniform([2, 7, 7, 3], dtype=tf.float32)
    mask_box_predictor = box_predictor.MaskRCNNBoxPredictor(
        is_training=False,
@@ -108,7 +113,8 @@ class MaskRCNNBoxPredictorTest(tf.test.TestCase):
        dropout_keep_prob=0.5,
        box_code_size=4)
    box_predictions = mask_box_predictor.predict(
-        image_features, num_predictions_per_location=1, scope='BoxPredictor')
+        [image_features], num_predictions_per_location=[1],
+        scope='BoxPredictor')
    self.assertEqual(len(box_predictions), 2)
    self.assertTrue(box_predictor.BOX_ENCODINGS in box_predictions)
    self.assertTrue(box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND
@@ -156,7 +162,8 @@ class RfcnBoxPredictorTest(tf.test.TestCase):
        box_code_size=4
    )
    box_predictions = rfcn_box_predictor.predict(
-        image_features, num_predictions_per_location=1, scope='BoxPredictor',
+        [image_features], num_predictions_per_location=[1],
+        scope='BoxPredictor',
        proposal_boxes=proposal_boxes)
    box_encodings = box_predictions[box_predictor.BOX_ENCODINGS]
    class_predictions_with_background = box_predictions[
@@ -173,7 +180,7 @@ class RfcnBoxPredictorTest(tf.test.TestCase):
    self.assertAllEqual(class_predictions_shape, [8, 1, 3])

-class ConvolutionalBoxPredictorTest(tf.test.TestCase):
+class ConvolutionalBoxPredictorTest(test_case.TestCase):

  def _build_arg_scope_with_conv_hyperparams(self):
    conv_hyperparams = hyperparams_pb2.Hyperparams()
@@ -192,36 +199,94 @@ class ConvolutionalBoxPredictorTest(tf.test.TestCase):
    return hyperparams_builder.build(conv_hyperparams, is_training=True)
  def test_get_boxes_for_five_aspect_ratios_per_location(self):
-    image_features = tf.random_uniform([4, 8, 8, 64], dtype=tf.float32)
-    conv_box_predictor = box_predictor.ConvolutionalBoxPredictor(
-        is_training=False,
-        num_classes=0,
-        conv_hyperparams=self._build_arg_scope_with_conv_hyperparams(),
-        min_depth=0,
-        max_depth=32,
-        num_layers_before_predictor=1,
-        use_dropout=True,
-        dropout_keep_prob=0.8,
-        kernel_size=1,
-        box_code_size=4
-    )
-    box_predictions = conv_box_predictor.predict(
-        image_features, num_predictions_per_location=5, scope='BoxPredictor')
-    box_encodings = box_predictions[box_predictor.BOX_ENCODINGS]
-    objectness_predictions = box_predictions[
-        box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND]
-    init_op = tf.global_variables_initializer()
-    with self.test_session() as sess:
-      sess.run(init_op)
-      (box_encodings_shape,
-       objectness_predictions_shape) = sess.run(
-           [tf.shape(box_encodings), tf.shape(objectness_predictions)])
-      self.assertAllEqual(box_encodings_shape, [4, 320, 1, 4])
-      self.assertAllEqual(objectness_predictions_shape, [4, 320, 1])
+    def graph_fn(image_features):
+      conv_box_predictor = box_predictor.ConvolutionalBoxPredictor(
+          is_training=False,
+          num_classes=0,
+          conv_hyperparams=self._build_arg_scope_with_conv_hyperparams(),
+          min_depth=0,
+          max_depth=32,
+          num_layers_before_predictor=1,
+          use_dropout=True,
+          dropout_keep_prob=0.8,
+          kernel_size=1,
+          box_code_size=4
+      )
+      box_predictions = conv_box_predictor.predict(
+          [image_features], num_predictions_per_location=[5],
+          scope='BoxPredictor')
+      box_encodings = box_predictions[box_predictor.BOX_ENCODINGS]
+      objectness_predictions = box_predictions[
+          box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND]
+      return (box_encodings, objectness_predictions)
+    image_features = np.random.rand(4, 8, 8, 64).astype(np.float32)
+    (box_encodings, objectness_predictions) = self.execute(graph_fn,
+                                                           [image_features])
+    self.assertAllEqual(box_encodings.shape, [4, 320, 1, 4])
+    self.assertAllEqual(objectness_predictions.shape, [4, 320, 1])
  def test_get_boxes_for_one_aspect_ratio_per_location(self):
-    image_features = tf.random_uniform([4, 8, 8, 64], dtype=tf.float32)
+    def graph_fn(image_features):
+      conv_box_predictor = box_predictor.ConvolutionalBoxPredictor(
+          is_training=False,
+          num_classes=0,
+          conv_hyperparams=self._build_arg_scope_with_conv_hyperparams(),
+          min_depth=0,
+          max_depth=32,
+          num_layers_before_predictor=1,
+          use_dropout=True,
+          dropout_keep_prob=0.8,
+          kernel_size=1,
+          box_code_size=4
+      )
+      box_predictions = conv_box_predictor.predict(
+          [image_features], num_predictions_per_location=[1],
+          scope='BoxPredictor')
+      box_encodings = box_predictions[box_predictor.BOX_ENCODINGS]
+      objectness_predictions = box_predictions[
+          box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND]
+      return (box_encodings, objectness_predictions)
+    image_features = np.random.rand(4, 8, 8, 64).astype(np.float32)
+    (box_encodings, objectness_predictions) = self.execute(graph_fn,
+                                                           [image_features])
+    self.assertAllEqual(box_encodings.shape, [4, 64, 1, 4])
+    self.assertAllEqual(objectness_predictions.shape, [4, 64, 1])
+
+  def test_get_multi_class_predictions_for_five_aspect_ratios_per_location(
+      self):
+    num_classes_without_background = 6
+    image_features = np.random.rand(4, 8, 8, 64).astype(np.float32)
+    def graph_fn(image_features):
+      conv_box_predictor = box_predictor.ConvolutionalBoxPredictor(
+          is_training=False,
+          num_classes=num_classes_without_background,
+          conv_hyperparams=self._build_arg_scope_with_conv_hyperparams(),
+          min_depth=0,
+          max_depth=32,
+          num_layers_before_predictor=1,
+          use_dropout=True,
+          dropout_keep_prob=0.8,
+          kernel_size=1,
+          box_code_size=4
+      )
+      box_predictions = conv_box_predictor.predict(
+          [image_features],
+          num_predictions_per_location=[5],
+          scope='BoxPredictor')
+      box_encodings = box_predictions[box_predictor.BOX_ENCODINGS]
+      class_predictions_with_background = box_predictions[
+          box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND]
+      return (box_encodings, class_predictions_with_background)
+    (box_encodings,
+     class_predictions_with_background) = self.execute(graph_fn,
+                                                       [image_features])
+    self.assertAllEqual(box_encodings.shape, [4, 320, 1, 4])
+    self.assertAllEqual(class_predictions_with_background.shape,
+                        [4, 320, num_classes_without_background+1])
+
+  def test_get_predictions_with_feature_maps_of_dynamic_shape(
+      self):
+    image_features = tf.placeholder(dtype=tf.float32, shape=[4, None, None, 64])
    conv_box_predictor = box_predictor.ConvolutionalBoxPredictor(
        is_training=False,
        num_classes=0,
@@ -235,71 +300,177 @@ class ConvolutionalBoxPredictorTest(test_case.TestCase):
        box_code_size=4
    )
    box_predictions = conv_box_predictor.predict(
-        image_features, num_predictions_per_location=1, scope='BoxPredictor')
+        [image_features], num_predictions_per_location=[5],
+        scope='BoxPredictor')
    box_encodings = box_predictions[box_predictor.BOX_ENCODINGS]
    objectness_predictions = box_predictions[
        box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND]
    init_op = tf.global_variables_initializer()
+    resolution = 32
+    expected_num_anchors = resolution*resolution*5
    with self.test_session() as sess:
      sess.run(init_op)
      (box_encodings_shape,
       objectness_predictions_shape) = sess.run(
-           [tf.shape(box_encodings), tf.shape(objectness_predictions)])
-      self.assertAllEqual(box_encodings_shape, [4, 64, 1, 4])
-      self.assertAllEqual(objectness_predictions_shape, [4, 64, 1])
+           [tf.shape(box_encodings), tf.shape(objectness_predictions)],
+           feed_dict={image_features:
+                      np.random.rand(4, resolution, resolution, 64)})
+      self.assertAllEqual(box_encodings_shape, [4, expected_num_anchors, 1, 4])
+      self.assertAllEqual(objectness_predictions_shape,
+                          [4, expected_num_anchors, 1])
class WeightSharedConvolutionalBoxPredictorTest(test_case.TestCase):
def _build_arg_scope_with_conv_hyperparams(self):
conv_hyperparams = hyperparams_pb2.Hyperparams()
conv_hyperparams_text_proto = """
activation: RELU_6
regularizer {
l2_regularizer {
}
}
initializer {
truncated_normal_initializer {
}
}
"""
text_format.Merge(conv_hyperparams_text_proto, conv_hyperparams)
return hyperparams_builder.build(conv_hyperparams, is_training=True)
def test_get_boxes_for_five_aspect_ratios_per_location(self):
def graph_fn(image_features):
conv_box_predictor = box_predictor.WeightSharedConvolutionalBoxPredictor(
is_training=False,
num_classes=0,
conv_hyperparams=self._build_arg_scope_with_conv_hyperparams(),
depth=32,
num_layers_before_predictor=1,
box_code_size=4)
box_predictions = conv_box_predictor.predict(
[image_features], num_predictions_per_location=[5],
scope='BoxPredictor')
box_encodings = box_predictions[box_predictor.BOX_ENCODINGS]
objectness_predictions = box_predictions[
box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND]
return (box_encodings, objectness_predictions)
image_features = np.random.rand(4, 8, 8, 64).astype(np.float32)
(box_encodings, objectness_predictions) = self.execute(
graph_fn, [image_features])
self.assertAllEqual(box_encodings.shape, [4, 320, 1, 4])
self.assertAllEqual(objectness_predictions.shape, [4, 320, 1])
  def test_get_multi_class_predictions_for_five_aspect_ratios_per_location(
      self):
    num_classes_without_background = 6
-    image_features = tf.random_uniform([4, 8, 8, 64], dtype=tf.float32)
-    conv_box_predictor = box_predictor.ConvolutionalBoxPredictor(
-        is_training=False,
-        num_classes=num_classes_without_background,
-        conv_hyperparams=self._build_arg_scope_with_conv_hyperparams(),
-        min_depth=0,
-        max_depth=32,
-        num_layers_before_predictor=1,
-        use_dropout=True,
-        dropout_keep_prob=0.8,
-        kernel_size=1,
-        box_code_size=4
-    )
-    box_predictions = conv_box_predictor.predict(
-        image_features,
-        num_predictions_per_location=5,
-        scope='BoxPredictor')
-    box_encodings = box_predictions[box_predictor.BOX_ENCODINGS]
-    class_predictions_with_background = box_predictions[
-        box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND]
-    init_op = tf.global_variables_initializer()
-    with self.test_session() as sess:
-      sess.run(init_op)
-      (box_encodings_shape, class_predictions_with_background_shape
-      ) = sess.run([
-          tf.shape(box_encodings), tf.shape(class_predictions_with_background)])
-      self.assertAllEqual(box_encodings_shape, [4, 320, 1, 4])
-      self.assertAllEqual(class_predictions_with_background_shape,
-                          [4, 320, num_classes_without_background+1])
-
-  def test_get_boxes_for_five_aspect_ratios_per_location_fully_convolutional(
+    def graph_fn(image_features):
+      conv_box_predictor = box_predictor.WeightSharedConvolutionalBoxPredictor(
+          is_training=False,
+          num_classes=num_classes_without_background,
+          conv_hyperparams=self._build_arg_scope_with_conv_hyperparams(),
+          depth=32,
+          num_layers_before_predictor=1,
+          box_code_size=4)
+      box_predictions = conv_box_predictor.predict(
+          [image_features],
+          num_predictions_per_location=[5],
+          scope='BoxPredictor')
+      box_encodings = box_predictions[box_predictor.BOX_ENCODINGS]
+      class_predictions_with_background = box_predictions[
+          box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND]
+      return (box_encodings, class_predictions_with_background)
+
+    image_features = np.random.rand(4, 8, 8, 64).astype(np.float32)
+    (box_encodings, class_predictions_with_background) = self.execute(
+        graph_fn, [image_features])
+    self.assertAllEqual(box_encodings.shape, [4, 320, 1, 4])
+    self.assertAllEqual(class_predictions_with_background.shape,
+                        [4, 320, num_classes_without_background+1])
+
+  def test_get_multi_class_predictions_from_two_feature_maps(
+      self):
+    num_classes_without_background = 6
def graph_fn(image_features1, image_features2):
conv_box_predictor = box_predictor.WeightSharedConvolutionalBoxPredictor(
is_training=False,
num_classes=num_classes_without_background,
conv_hyperparams=self._build_arg_scope_with_conv_hyperparams(),
depth=32,
num_layers_before_predictor=1,
box_code_size=4)
box_predictions = conv_box_predictor.predict(
[image_features1, image_features2],
num_predictions_per_location=[5, 5],
scope='BoxPredictor')
box_encodings = box_predictions[box_predictor.BOX_ENCODINGS]
class_predictions_with_background = box_predictions[
box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND]
return (box_encodings, class_predictions_with_background)
image_features1 = np.random.rand(4, 8, 8, 64).astype(np.float32)
image_features2 = np.random.rand(4, 8, 8, 64).astype(np.float32)
(box_encodings, class_predictions_with_background) = self.execute(
graph_fn, [image_features1, image_features2])
self.assertAllEqual(box_encodings.shape, [4, 640, 1, 4])
self.assertAllEqual(class_predictions_with_background.shape,
[4, 640, num_classes_without_background+1])
def test_predictions_from_multiple_feature_maps_share_weights(self):
num_classes_without_background = 6
def graph_fn(image_features1, image_features2):
conv_box_predictor = box_predictor.WeightSharedConvolutionalBoxPredictor(
is_training=False,
num_classes=num_classes_without_background,
conv_hyperparams=self._build_arg_scope_with_conv_hyperparams(),
depth=32,
num_layers_before_predictor=2,
box_code_size=4)
box_predictions = conv_box_predictor.predict(
[image_features1, image_features2],
num_predictions_per_location=[5, 5],
scope='BoxPredictor')
box_encodings = box_predictions[box_predictor.BOX_ENCODINGS]
class_predictions_with_background = box_predictions[
box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND]
return (box_encodings, class_predictions_with_background)
with self.test_session(graph=tf.Graph()):
graph_fn(tf.random_uniform([4, 32, 32, 3], dtype=tf.float32),
tf.random_uniform([4, 32, 32, 3], dtype=tf.float32))
actual_variable_set = set(
[var.op.name for var in tf.trainable_variables()])
expected_variable_set = set([
'BoxPredictor/WeightSharedConvolutionalBoxPredictor/conv2d_0/weights',
'BoxPredictor/WeightSharedConvolutionalBoxPredictor/conv2d_0/biases',
'BoxPredictor/WeightSharedConvolutionalBoxPredictor/conv2d_1/weights',
'BoxPredictor/WeightSharedConvolutionalBoxPredictor/conv2d_1/biases',
('BoxPredictor/WeightSharedConvolutionalBoxPredictor/'
'BoxEncodingPredictor/weights'),
('BoxPredictor/WeightSharedConvolutionalBoxPredictor/'
'BoxEncodingPredictor/biases'),
('BoxPredictor/WeightSharedConvolutionalBoxPredictor/'
'ClassPredictor/weights'),
('BoxPredictor/WeightSharedConvolutionalBoxPredictor/'
'ClassPredictor/biases')])
self.assertEqual(expected_variable_set, actual_variable_set)
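The weight-sharing property exercised by the test above (one prediction head reused across every feature map, as in the focal-loss paper referenced in the commit message) can be illustrated with a toy shared 1x1 head in plain NumPy; the function and variable names here are illustrative, not part of the library:

```python
import numpy as np

def shared_head_predict(feature_maps, kernel, bias):
    """Apply one shared 1x1-conv head to several feature maps.

    feature_maps: list of [h, w, c_in] arrays (h, w may differ per map).
    kernel: [c_in, c_out]; bias: [c_out]. A 1x1 convolution is just a matmul
    over the channel axis, so reusing (kernel, bias) for every map is exactly
    the parameter sharing the variable-set assertion checks for.
    """
    return [fmap @ kernel + bias for fmap in feature_maps]
```

Because only one `(kernel, bias)` pair exists regardless of how many maps are fed in, the set of trainable variables stays fixed, which is what the test asserts.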
+  def test_get_predictions_with_feature_maps_of_dynamic_shape(
      self):
    image_features = tf.placeholder(dtype=tf.float32, shape=[4, None, None, 64])
-    conv_box_predictor = box_predictor.ConvolutionalBoxPredictor(
+    conv_box_predictor = box_predictor.WeightSharedConvolutionalBoxPredictor(
        is_training=False,
        num_classes=0,
        conv_hyperparams=self._build_arg_scope_with_conv_hyperparams(),
-        min_depth=0,
-        max_depth=32,
+        depth=32,
        num_layers_before_predictor=1,
-        use_dropout=True,
-        dropout_keep_prob=0.8,
-        kernel_size=1,
-        box_code_size=4
-    )
+        box_code_size=4)
    box_predictions = conv_box_predictor.predict(
-        image_features, num_predictions_per_location=5, scope='BoxPredictor')
+        [image_features], num_predictions_per_location=[5],
+        scope='BoxPredictor')
    box_encodings = box_predictions[box_predictor.BOX_ENCODINGS]
    objectness_predictions = box_predictions[
        box_predictor.CLASS_PREDICTIONS_WITH_BACKGROUND]
...
@@ -50,8 +50,10 @@ class Loss(object):
    """Call the loss function.

    Args:
-      prediction_tensor: a tensor representing predicted quantities.
-      target_tensor: a tensor representing regression or classification targets.
+      prediction_tensor: an N-d tensor of shape [batch, anchors, ...]
+        representing predicted quantities.
+      target_tensor: an N-d tensor of shape [batch, anchors, ...] representing
+        regression or classification targets.
      ignore_nan_targets: whether to ignore nan targets in the loss computation.
        E.g. can be used if the target tensor is missing groundtruth data that
        shouldn't be factored into the loss.
@@ -81,7 +83,8 @@ class Loss(object):
        the Loss.

    Returns:
-      loss: a tensor representing the value of the loss function
+      loss: an N-d tensor of shape [batch, anchors, ...] containing the loss
+        per anchor
    """
    pass
@@ -92,15 +95,6 @@ class WeightedL2LocalizationLoss(Loss):
  Loss[b,a] = .5 * ||weights[b,a] * (prediction[b,a,:] - target[b,a,:])||^2
  """

-  def __init__(self, anchorwise_output=False):
-    """Constructor.
-
-    Args:
-      anchorwise_output: Outputs loss per anchor. (default False)
-    """
-    self._anchorwise_output = anchorwise_output
-
  def _compute_loss(self, prediction_tensor, target_tensor, weights):
    """Compute loss function.
@@ -112,15 +106,13 @@ class WeightedL2LocalizationLoss(Loss):
      weights: a float tensor of shape [batch_size, num_anchors]

    Returns:
-      loss: a (scalar) tensor representing the value of the loss function
-        or a float tensor of shape [batch_size, num_anchors]
+      loss: a float tensor of shape [batch_size, num_anchors] representing
+        the value of the loss function.
    """
    weighted_diff = (prediction_tensor - target_tensor) * tf.expand_dims(
        weights, 2)
    square_diff = 0.5 * tf.square(weighted_diff)
-    if self._anchorwise_output:
-      return tf.reduce_sum(square_diff, 2)
-    return tf.reduce_sum(square_diff)
+    return tf.reduce_sum(square_diff, 2)
class WeightedSmoothL1LocalizationLoss(Loss):
@@ -132,15 +124,6 @@ class WeightedSmoothL1LocalizationLoss(Loss):
   See also Equation (3) in the Fast R-CNN paper by Ross Girshick (ICCV 2015)
   """

-  def __init__(self, anchorwise_output=False):
-    """Constructor.
-
-    Args:
-      anchorwise_output: Outputs loss per anchor. (default False)
-    """
-    self._anchorwise_output = anchorwise_output
-
   def _compute_loss(self, prediction_tensor, target_tensor, weights):
     """Compute loss function.
@@ -152,7 +135,8 @@ class WeightedSmoothL1LocalizationLoss(Loss):
       weights: a float tensor of shape [batch_size, num_anchors]

     Returns:
-      loss: a (scalar) tensor representing the value of the loss function
+      loss: a float tensor of shape [batch_size, num_anchors] representing
+        the value of the loss function.
     """
     diff = prediction_tensor - target_tensor
     abs_diff = tf.abs(diff)
@@ -160,9 +144,7 @@ class WeightedSmoothL1LocalizationLoss(Loss):
     anchorwise_smooth_l1norm = tf.reduce_sum(
         tf.where(abs_diff_lt_1, 0.5 * tf.square(abs_diff), abs_diff - 0.5),
         2) * weights
-    if self._anchorwise_output:
-      return anchorwise_smooth_l1norm
-    return tf.reduce_sum(anchorwise_smooth_l1norm)
+    return anchorwise_smooth_l1norm
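The smooth L1 branch above (quadratic below a diff of 1, linear above) can be sanity-checked with a small numpy sketch; names here are illustrative, not part of the library:

```python
import numpy as np

def smooth_l1_per_anchor(prediction, target, weights):
    """Per-anchor smooth L1: 0.5*d^2 for |d| < 1, |d| - 0.5 otherwise."""
    abs_diff = np.abs(prediction - target)
    elementwise = np.where(abs_diff < 1, 0.5 * abs_diff ** 2, abs_diff - 0.5)
    # Sum over the code dimension, then apply per-anchor weights.
    return elementwise.sum(axis=2) * weights

# One batch, two anchors with code size 1: residuals of 0.5 and 2.0.
pred = np.array([[[0.5], [2.0]]])
targ = np.zeros((1, 2, 1))
w = np.ones((1, 2))
print(smooth_l1_per_anchor(pred, targ, w))  # 0.5*0.5^2 = 0.125 and 2.0-0.5 = 1.5
```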
class WeightedIOULocalizationLoss(Loss):
@@ -184,27 +166,19 @@ class WeightedIOULocalizationLoss(Loss):
       weights: a float tensor of shape [batch_size, num_anchors]

     Returns:
-      loss: a (scalar) tensor representing the value of the loss function
+      loss: a float tensor of shape [batch_size, num_anchors] representing
+        the value of the loss function.
     """
     predicted_boxes = box_list.BoxList(tf.reshape(prediction_tensor, [-1, 4]))
     target_boxes = box_list.BoxList(tf.reshape(target_tensor, [-1, 4]))
     per_anchor_iou_loss = 1.0 - box_list_ops.matched_iou(predicted_boxes,
                                                          target_boxes)
-    return tf.reduce_sum(tf.reshape(weights, [-1]) * per_anchor_iou_loss)
+    return tf.reshape(weights, [-1]) * per_anchor_iou_loss
class WeightedSigmoidClassificationLoss(Loss):
  """Sigmoid cross entropy classification loss function."""

-  def __init__(self, anchorwise_output=False):
-    """Constructor.
-
-    Args:
-      anchorwise_output: Outputs loss per anchor. (default False)
-    """
-    self._anchorwise_output = anchorwise_output
-
   def _compute_loss(self,
                     prediction_tensor,
                     target_tensor,
@@ -222,8 +196,8 @@ class WeightedSigmoidClassificationLoss(Loss):
         If provided, computes loss only for the specified class indices.

     Returns:
-      loss: a (scalar) tensor representing the value of the loss function
-        or a float tensor of shape [batch_size, num_anchors]
+      loss: a float tensor of shape [batch_size, num_anchors, num_classes]
+        representing the value of the loss function.
     """
     weights = tf.expand_dims(weights, 2)
     if class_indices is not None:
@@ -233,9 +207,7 @@ class WeightedSigmoidClassificationLoss(Loss):
           [1, 1, -1])
     per_entry_cross_ent = (tf.nn.sigmoid_cross_entropy_with_logits(
         labels=target_tensor, logits=prediction_tensor))
-    if self._anchorwise_output:
-      return tf.reduce_sum(per_entry_cross_ent * weights, 2)
-    return tf.reduce_sum(per_entry_cross_ent * weights)
+    return per_entry_cross_ent * weights
class SigmoidFocalClassificationLoss(Loss):
@@ -245,15 +217,13 @@ class SigmoidFocalClassificationLoss(Loss):
   examples. See https://arxiv.org/pdf/1708.02002.pdf for the loss definition.
   """

-  def __init__(self, anchorwise_output=False, gamma=2.0, alpha=0.25):
+  def __init__(self, gamma=2.0, alpha=0.25):
     """Constructor.

     Args:
-      anchorwise_output: Outputs loss per anchor. (default False)
       gamma: exponent of the modulating factor (1 - p_t) ^ gamma.
       alpha: optional alpha weighting factor to balance positives vs negatives.
     """
-    self._anchorwise_output = anchorwise_output
     self._alpha = alpha
     self._gamma = gamma
@@ -274,8 +244,8 @@ class SigmoidFocalClassificationLoss(Loss):
         If provided, computes loss only for the specified class indices.

     Returns:
-      loss: a (scalar) tensor representing the value of the loss function
-        or a float tensor of shape [batch_size, num_anchors]
+      loss: a float tensor of shape [batch_size, num_anchors, num_classes]
+        representing the value of the loss function.
     """
     weights = tf.expand_dims(weights, 2)
     if class_indices is not None:
@@ -297,25 +267,21 @@ class SigmoidFocalClassificationLoss(Loss):
         (1 - target_tensor) * (1 - self._alpha))
     focal_cross_entropy_loss = (modulating_factor * alpha_weight_factor *
                                 per_entry_cross_ent)
-    if self._anchorwise_output:
-      return tf.reduce_sum(focal_cross_entropy_loss * weights, 2)
-    return tf.reduce_sum(focal_cross_entropy_loss * weights)
+    return focal_cross_entropy_loss * weights
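The modulating and alpha factors above implement the focal loss from https://arxiv.org/abs/1708.02002. A numpy sketch (illustrative names, not the TF implementation) showing the property the tests below rely on: with `gamma=0` and no alpha, it collapses to plain sigmoid cross entropy, and a positive gamma only ever shrinks each entry:

```python
import numpy as np

def sigmoid_focal(logits, targets, gamma=2.0, alpha=0.25):
    """Per-entry focal loss sketch: CE scaled by (1 - p_t)^gamma and alpha_t."""
    p = 1.0 / (1.0 + np.exp(-logits))               # sigmoid probabilities
    ce = -(targets * np.log(p) + (1 - targets) * np.log(1 - p))
    p_t = targets * p + (1 - targets) * (1 - p)     # probability of the true class
    modulating = (1.0 - p_t) ** gamma               # down-weights easy examples
    alpha_t = 1.0
    if alpha is not None:
        alpha_t = targets * alpha + (1 - targets) * (1 - alpha)
    return modulating * alpha_t * ce

logits = np.array([0.0, 2.0, -2.0])
targets = np.array([1.0, 1.0, 0.0])
ce = sigmoid_focal(logits, targets, gamma=0.0, alpha=None)   # plain sigmoid CE
fl = sigmoid_focal(logits, targets, gamma=2.0, alpha=None)
assert np.all(fl <= ce)  # focal loss never exceeds cross entropy entrywise
```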
class WeightedSoftmaxClassificationLoss(Loss):
  """Softmax loss function."""

-  def __init__(self, anchorwise_output=False, logit_scale=1.0):
+  def __init__(self, logit_scale=1.0):
     """Constructor.

     Args:
-      anchorwise_output: Whether to output loss per anchor (default False)
       logit_scale: When this value is high, the prediction is "diffused" and
         when this value is low, the prediction is made peakier.
         (default 1.0)
     """
-    self._anchorwise_output = anchorwise_output
     self._logit_scale = logit_scale

   def _compute_loss(self, prediction_tensor, target_tensor, weights):
@@ -329,7 +295,8 @@ class WeightedSoftmaxClassificationLoss(Loss):
       weights: a float tensor of shape [batch_size, num_anchors]

     Returns:
-      loss: a (scalar) tensor representing the value of the loss function
+      loss: a float tensor of shape [batch_size, num_anchors] representing
+        the value of the loss function.
     """
     num_classes = prediction_tensor.get_shape().as_list()[-1]
     prediction_tensor = tf.divide(
@@ -337,9 +304,7 @@ class WeightedSoftmaxClassificationLoss(Loss):
     per_row_cross_ent = (tf.nn.softmax_cross_entropy_with_logits(
         labels=tf.reshape(target_tensor, [-1, num_classes]),
         logits=tf.reshape(prediction_tensor, [-1, num_classes])))
-    if self._anchorwise_output:
-      return tf.reshape(per_row_cross_ent, tf.shape(weights)) * weights
-    return tf.reduce_sum(per_row_cross_ent * tf.reshape(weights, [-1]))
+    return tf.reshape(per_row_cross_ent, tf.shape(weights)) * weights
class BootstrappedSigmoidClassificationLoss(Loss):
@@ -359,14 +324,13 @@ class BootstrappedSigmoidClassificationLoss(Loss):
   Reed et al. (ICLR 2015).
   """

-  def __init__(self, alpha, bootstrap_type='soft', anchorwise_output=False):
+  def __init__(self, alpha, bootstrap_type='soft'):
     """Constructor.

     Args:
       alpha: a float32 scalar tensor between 0 and 1 representing interpolation
         weight
       bootstrap_type: set to either 'hard' or 'soft' (default)
-      anchorwise_output: Outputs loss per anchor. (default False)

     Raises:
       ValueError: if bootstrap_type is not either 'hard' or 'soft'
@@ -376,7 +340,6 @@ class BootstrappedSigmoidClassificationLoss(Loss):
                        '\'hard\' or \'soft.\'')
     self._alpha = alpha
     self._bootstrap_type = bootstrap_type
-    self._anchorwise_output = anchorwise_output

   def _compute_loss(self, prediction_tensor, target_tensor, weights):
     """Compute loss function.
@@ -389,8 +352,8 @@ class BootstrappedSigmoidClassificationLoss(Loss):
       weights: a float tensor of shape [batch_size, num_anchors]

     Returns:
-      loss: a (scalar) tensor representing the value of the loss function
-        or a float tensor of shape [batch_size, num_anchors]
+      loss: a float tensor of shape [batch_size, num_anchors, num_classes]
+        representing the value of the loss function.
     """
     if self._bootstrap_type == 'soft':
       bootstrap_target_tensor = self._alpha * target_tensor + (
@@ -401,9 +364,7 @@ class BootstrappedSigmoidClassificationLoss(Loss):
         tf.sigmoid(prediction_tensor) > 0.5, tf.float32)
     per_entry_cross_ent = (tf.nn.sigmoid_cross_entropy_with_logits(
         labels=bootstrap_target_tensor, logits=prediction_tensor))
-    if self._anchorwise_output:
-      return tf.reduce_sum(per_entry_cross_ent * tf.expand_dims(weights, 2), 2)
-    return tf.reduce_sum(per_entry_cross_ent * tf.expand_dims(weights, 2))
+    return per_entry_cross_ent * tf.expand_dims(weights, 2)
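The soft/hard branches above blend the labels with the model's own predictions (Reed et al., ICLR 2015). A numpy sketch of just the bootstrapped-target construction, with illustrative names:

```python
import numpy as np

def bootstrap_targets(targets, logits, alpha, bootstrap_type='soft'):
    """Blend labels with model predictions, as the class above does in TF."""
    p = 1.0 / (1.0 + np.exp(-logits))  # sigmoid scores
    if bootstrap_type == 'soft':
        # Soft: interpolate toward the predicted probabilities.
        return alpha * targets + (1.0 - alpha) * p
    # Hard: interpolate toward thresholded 0/1 predictions.
    return alpha * targets + (1.0 - alpha) * (p > 0.5).astype(np.float32)

targets = np.array([1.0, 0.0])
logits = np.array([4.0, 4.0])  # both confidently predicted positive
print(bootstrap_targets(targets, logits, alpha=0.5, bootstrap_type='hard'))
# hard targets with alpha=0.5: 1.0 and 0.5
```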
class HardExampleMiner(object):
...
@@ -26,7 +26,7 @@ from object_detection.core import matcher
 class WeightedL2LocalizationLossTest(tf.test.TestCase):

-  def testReturnsCorrectLoss(self):
+  def testReturnsCorrectWeightedLoss(self):
     batch_size = 3
     num_anchors = 10
     code_size = 4
@@ -36,7 +36,8 @@ class WeightedL2LocalizationLossTest(tf.test.TestCase):
                            [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
                            [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]], tf.float32)
     loss_op = losses.WeightedL2LocalizationLoss()
-    loss = loss_op(prediction_tensor, target_tensor, weights=weights)
+    loss = tf.reduce_sum(loss_op(prediction_tensor, target_tensor,
+                                 weights=weights))
     expected_loss = (3 * 5 * 4) / 2.0
     with self.test_session() as sess:
@@ -50,7 +51,7 @@ class WeightedL2LocalizationLossTest(tf.test.TestCase):
     prediction_tensor = tf.ones([batch_size, num_anchors, code_size])
     target_tensor = tf.zeros([batch_size, num_anchors, code_size])
     weights = tf.ones([batch_size, num_anchors])
-    loss_op = losses.WeightedL2LocalizationLoss(anchorwise_output=True)
+    loss_op = losses.WeightedL2LocalizationLoss()
     loss = loss_op(prediction_tensor, target_tensor, weights=weights)
     expected_loss = np.ones((batch_size, num_anchors)) * 2
@@ -58,22 +59,6 @@ class WeightedL2LocalizationLossTest(tf.test.TestCase):
       loss_output = sess.run(loss)
       self.assertAllClose(loss_output, expected_loss)
-  def testReturnsCorrectLossSum(self):
-    batch_size = 3
-    num_anchors = 16
-    code_size = 4
-    prediction_tensor = tf.ones([batch_size, num_anchors, code_size])
-    target_tensor = tf.zeros([batch_size, num_anchors, code_size])
-    weights = tf.ones([batch_size, num_anchors])
-    loss_op = losses.WeightedL2LocalizationLoss(anchorwise_output=False)
-    loss = loss_op(prediction_tensor, target_tensor, weights=weights)
-    expected_loss = tf.nn.l2_loss(prediction_tensor - target_tensor)
-    with self.test_session() as sess:
-      loss_output = sess.run(loss)
-      expected_loss_output = sess.run(expected_loss)
-      self.assertAllClose(loss_output, expected_loss_output)
-
   def testReturnsCorrectNanLoss(self):
     batch_size = 3
     num_anchors = 10
@@ -87,6 +72,7 @@ class WeightedL2LocalizationLossTest(tf.test.TestCase):
     loss_op = losses.WeightedL2LocalizationLoss()
     loss = loss_op(prediction_tensor, target_tensor, weights=weights,
                    ignore_nan_targets=True)
+    loss = tf.reduce_sum(loss)
     expected_loss = (3 * 5 * 4) / 2.0
     with self.test_session() as sess:
@@ -111,6 +97,7 @@ class WeightedSmoothL1LocalizationLossTest(tf.test.TestCase):
                            [0, 3, 0]], tf.float32)
     loss_op = losses.WeightedSmoothL1LocalizationLoss()
     loss = loss_op(prediction_tensor, target_tensor, weights=weights)
+    loss = tf.reduce_sum(loss)
     exp_loss = 7.695
     with self.test_session() as sess:
@@ -130,6 +117,7 @@ class WeightedIOULocalizationLossTest(tf.test.TestCase):
     weights = [[1.0, .5, 2.0]]
     loss_op = losses.WeightedIOULocalizationLoss()
     loss = loss_op(prediction_tensor, target_tensor, weights=weights)
+    loss = tf.reduce_sum(loss)
     exp_loss = 2.0
     with self.test_session() as sess:
       loss_output = sess.run(loss)
@@ -159,6 +147,7 @@ class WeightedSigmoidClassificationLossTest(tf.test.TestCase):
                           [1, 1, 1, 0]], tf.float32)
     loss_op = losses.WeightedSigmoidClassificationLoss()
     loss = loss_op(prediction_tensor, target_tensor, weights=weights)
+    loss = tf.reduce_sum(loss)
     exp_loss = -2 * math.log(.5)
     with self.test_session() as sess:
@@ -184,8 +173,9 @@ class WeightedSigmoidClassificationLossTest(tf.test.TestCase):
                                      [1, 0, 0]]], tf.float32)
     weights = tf.constant([[1, 1, 1, 1],
                            [1, 1, 1, 0]], tf.float32)
-    loss_op = losses.WeightedSigmoidClassificationLoss(True)
+    loss_op = losses.WeightedSigmoidClassificationLoss()
     loss = loss_op(prediction_tensor, target_tensor, weights=weights)
+    loss = tf.reduce_sum(loss, axis=2)
     exp_loss = np.matrix([[0, 0, -math.log(.5), 0],
                           [-math.log(.5), 0, 0, 0]])
@@ -214,9 +204,10 @@ class WeightedSigmoidClassificationLossTest(tf.test.TestCase):
                            [1, 1, 1, 0]], tf.float32)
     # Ignores the last class.
     class_indices = tf.constant([0, 1, 2], tf.int32)
-    loss_op = losses.WeightedSigmoidClassificationLoss(True)
+    loss_op = losses.WeightedSigmoidClassificationLoss()
     loss = loss_op(prediction_tensor, target_tensor, weights=weights,
                    class_indices=class_indices)
+    loss = tf.reduce_sum(loss, axis=2)
     exp_loss = np.matrix([[0, 0, -math.log(.5), 0],
                           [-math.log(.5), 0, 0, 0]])
@@ -245,14 +236,13 @@ class SigmoidFocalClassificationLossTest(tf.test.TestCase):
                                     [0],
                                     [0]]], tf.float32)
     weights = tf.constant([[1, 1, 1, 1, 1, 1]], tf.float32)
-    focal_loss_op = losses.SigmoidFocalClassificationLoss(
-        anchorwise_output=True, gamma=2.0, alpha=None)
-    sigmoid_loss_op = losses.WeightedSigmoidClassificationLoss(
-        anchorwise_output=True)
-    focal_loss = focal_loss_op(prediction_tensor, target_tensor,
-                               weights=weights)
-    sigmoid_loss = sigmoid_loss_op(prediction_tensor, target_tensor,
-                                   weights=weights)
+    focal_loss_op = losses.SigmoidFocalClassificationLoss(gamma=2.0, alpha=None)
+    sigmoid_loss_op = losses.WeightedSigmoidClassificationLoss()
+    focal_loss = tf.reduce_sum(focal_loss_op(prediction_tensor, target_tensor,
+                                             weights=weights), axis=2)
+    sigmoid_loss = tf.reduce_sum(sigmoid_loss_op(prediction_tensor,
+                                                 target_tensor,
+                                                 weights=weights), axis=2)
     with self.test_session() as sess:
       sigmoid_loss, focal_loss = sess.run([sigmoid_loss, focal_loss])
@@ -272,14 +262,13 @@ class SigmoidFocalClassificationLossTest(tf.test.TestCase):
                                     [0],
                                     [0]]], tf.float32)
     weights = tf.constant([[1, 1, 1, 1, 1]], tf.float32)
-    focal_loss_op = losses.SigmoidFocalClassificationLoss(
-        anchorwise_output=True, gamma=2.0, alpha=None)
-    sigmoid_loss_op = losses.WeightedSigmoidClassificationLoss(
-        anchorwise_output=True)
-    focal_loss = focal_loss_op(prediction_tensor, target_tensor,
-                               weights=weights)
-    sigmoid_loss = sigmoid_loss_op(prediction_tensor, target_tensor,
-                                   weights=weights)
+    focal_loss_op = losses.SigmoidFocalClassificationLoss(gamma=2.0, alpha=None)
+    sigmoid_loss_op = losses.WeightedSigmoidClassificationLoss()
+    focal_loss = tf.reduce_sum(focal_loss_op(prediction_tensor, target_tensor,
+                                             weights=weights), axis=2)
+    sigmoid_loss = tf.reduce_sum(sigmoid_loss_op(prediction_tensor,
+                                                 target_tensor,
+                                                 weights=weights), axis=2)
     with self.test_session() as sess:
       sigmoid_loss, focal_loss = sess.run([sigmoid_loss, focal_loss])
@@ -299,14 +288,13 @@ class SigmoidFocalClassificationLossTest(tf.test.TestCase):
                                     [0],
                                     [0]]], tf.float32)
     weights = tf.constant([[1, 1, 1, 1, 1]], tf.float32)
-    focal_loss_op = losses.SigmoidFocalClassificationLoss(
-        anchorwise_output=False, gamma=2.0, alpha=None)
-    sigmoid_loss_op = losses.WeightedSigmoidClassificationLoss(
-        anchorwise_output=False)
-    focal_loss = focal_loss_op(prediction_tensor, target_tensor,
-                               weights=weights)
-    sigmoid_loss = sigmoid_loss_op(prediction_tensor, target_tensor,
-                                   weights=weights)
+    focal_loss_op = losses.SigmoidFocalClassificationLoss(gamma=2.0, alpha=None)
+    sigmoid_loss_op = losses.WeightedSigmoidClassificationLoss()
+    focal_loss = tf.reduce_sum(focal_loss_op(prediction_tensor, target_tensor,
+                                             weights=weights))
+    sigmoid_loss = tf.reduce_sum(sigmoid_loss_op(prediction_tensor,
+                                                 target_tensor,
+                                                 weights=weights))
     with self.test_session() as sess:
       sigmoid_loss, focal_loss = sess.run([sigmoid_loss, focal_loss])
@@ -326,14 +314,13 @@ class SigmoidFocalClassificationLossTest(tf.test.TestCase):
                                     [0],
                                     [0]]], tf.float32)
     weights = tf.constant([[1, 1, 1, 1, 1]], tf.float32)
-    focal_loss_op = losses.SigmoidFocalClassificationLoss(
-        anchorwise_output=True, gamma=2.0, alpha=1.0)
-    sigmoid_loss_op = losses.WeightedSigmoidClassificationLoss(
-        anchorwise_output=True)
-    focal_loss = focal_loss_op(prediction_tensor, target_tensor,
-                               weights=weights)
-    sigmoid_loss = sigmoid_loss_op(prediction_tensor, target_tensor,
-                                   weights=weights)
+    focal_loss_op = losses.SigmoidFocalClassificationLoss(gamma=2.0, alpha=1.0)
+    sigmoid_loss_op = losses.WeightedSigmoidClassificationLoss()
+    focal_loss = tf.reduce_sum(focal_loss_op(prediction_tensor, target_tensor,
+                                             weights=weights), axis=2)
+    sigmoid_loss = tf.reduce_sum(sigmoid_loss_op(prediction_tensor,
+                                                 target_tensor,
+                                                 weights=weights), axis=2)
     with self.test_session() as sess:
       sigmoid_loss, focal_loss = sess.run([sigmoid_loss, focal_loss])
@@ -355,14 +342,13 @@ class SigmoidFocalClassificationLossTest(tf.test.TestCase):
                                     [0],
                                     [0]]], tf.float32)
     weights = tf.constant([[1, 1, 1, 1, 1]], tf.float32)
-    focal_loss_op = losses.SigmoidFocalClassificationLoss(
-        anchorwise_output=True, gamma=2.0, alpha=0.0)
-    sigmoid_loss_op = losses.WeightedSigmoidClassificationLoss(
-        anchorwise_output=True)
-    focal_loss = focal_loss_op(prediction_tensor, target_tensor,
-                               weights=weights)
-    sigmoid_loss = sigmoid_loss_op(prediction_tensor, target_tensor,
-                                   weights=weights)
+    focal_loss_op = losses.SigmoidFocalClassificationLoss(gamma=2.0, alpha=0.0)
+    sigmoid_loss_op = losses.WeightedSigmoidClassificationLoss()
+    focal_loss = tf.reduce_sum(focal_loss_op(prediction_tensor, target_tensor,
+                                             weights=weights), axis=2)
+    sigmoid_loss = tf.reduce_sum(sigmoid_loss_op(prediction_tensor,
+                                                 target_tensor,
+                                                 weights=weights), axis=2)
     with self.test_session() as sess:
       sigmoid_loss, focal_loss = sess.run([sigmoid_loss, focal_loss])
@@ -391,10 +377,8 @@ class SigmoidFocalClassificationLossTest(tf.test.TestCase):
                                      [1, 0, 0]]], tf.float32)
     weights = tf.constant([[1, 1, 1, 1],
                            [1, 1, 1, 0]], tf.float32)
-    focal_loss_op = losses.SigmoidFocalClassificationLoss(
-        anchorwise_output=True, alpha=0.5, gamma=0.0)
-    sigmoid_loss_op = losses.WeightedSigmoidClassificationLoss(
-        anchorwise_output=True)
+    focal_loss_op = losses.SigmoidFocalClassificationLoss(alpha=0.5, gamma=0.0)
+    sigmoid_loss_op = losses.WeightedSigmoidClassificationLoss()
     focal_loss = focal_loss_op(prediction_tensor, target_tensor,
                                weights=weights)
     sigmoid_loss = sigmoid_loss_op(prediction_tensor, target_tensor,
@@ -423,10 +407,8 @@ class SigmoidFocalClassificationLossTest(tf.test.TestCase):
                                      [1, 0, 0]]], tf.float32)
     weights = tf.constant([[1, 1, 1, 1],
                            [1, 1, 1, 0]], tf.float32)
-    focal_loss_op = losses.SigmoidFocalClassificationLoss(
-        anchorwise_output=True, alpha=None, gamma=0.0)
-    sigmoid_loss_op = losses.WeightedSigmoidClassificationLoss(
-        anchorwise_output=True)
+    focal_loss_op = losses.SigmoidFocalClassificationLoss(alpha=None, gamma=0.0)
+    sigmoid_loss_op = losses.WeightedSigmoidClassificationLoss()
     focal_loss = focal_loss_op(prediction_tensor, target_tensor,
                                weights=weights)
     sigmoid_loss = sigmoid_loss_op(prediction_tensor, target_tensor,
@@ -456,11 +438,10 @@ class SigmoidFocalClassificationLossTest(tf.test.TestCase):
                                      [1, 0, 0]]], tf.float32)
     weights = tf.constant([[1, 1, 1, 1],
                            [1, 1, 1, 1]], tf.float32)
-    focal_loss_op = losses.SigmoidFocalClassificationLoss(
-        anchorwise_output=False, alpha=1.0, gamma=0.0)
-    focal_loss = focal_loss_op(prediction_tensor, target_tensor,
-                               weights=weights)
+    focal_loss_op = losses.SigmoidFocalClassificationLoss(alpha=1.0, gamma=0.0)
+    focal_loss = tf.reduce_sum(focal_loss_op(prediction_tensor, target_tensor,
+                                             weights=weights))
     with self.test_session() as sess:
       focal_loss = sess.run(focal_loss)
     self.assertAllClose(
@@ -489,11 +470,10 @@ class SigmoidFocalClassificationLossTest(tf.test.TestCase):
                                      [1, 0, 0]]], tf.float32)
     weights = tf.constant([[1, 1, 1, 1],
                            [1, 1, 1, 1]], tf.float32)
-    focal_loss_op = losses.SigmoidFocalClassificationLoss(
-        anchorwise_output=False, alpha=0.75, gamma=0.0)
-    focal_loss = focal_loss_op(prediction_tensor, target_tensor,
-                               weights=weights)
+    focal_loss_op = losses.SigmoidFocalClassificationLoss(alpha=0.75, gamma=0.0)
+    focal_loss = tf.reduce_sum(focal_loss_op(prediction_tensor, target_tensor,
+                                             weights=weights))
     with self.test_session() as sess:
       focal_loss = sess.run(focal_loss)
     self.assertAllClose(
@@ -528,6 +508,7 @@ class WeightedSoftmaxClassificationLossTest(tf.test.TestCase):
                            [1, 1, 1, 0]], tf.float32)
     loss_op = losses.WeightedSoftmaxClassificationLoss()
     loss = loss_op(prediction_tensor, target_tensor, weights=weights)
+    loss = tf.reduce_sum(loss)
     exp_loss = - 1.5 * math.log(.5)
     with self.test_session() as sess:
@@ -553,7 +534,7 @@ class WeightedSoftmaxClassificationLossTest(tf.test.TestCase):
                                      [1, 0, 0]]], tf.float32)
     weights = tf.constant([[1, 1, .5, 1],
                            [1, 1, 1, 0]], tf.float32)
-    loss_op = losses.WeightedSoftmaxClassificationLoss(True)
+    loss_op = losses.WeightedSoftmaxClassificationLoss()
     loss = loss_op(prediction_tensor, target_tensor, weights=weights)
     exp_loss = np.matrix([[0, 0, - 0.5 * math.log(.5), 0],
@@ -564,7 +545,7 @@ class WeightedSoftmaxClassificationLossTest(tf.test.TestCase):
   def testReturnsCorrectAnchorWiseLossWithHighLogitScaleSetting(self):
     """At very high logit_scale, all predictions will be ~0.33."""
-    # TODO(yonib): Also test logit_scale with anchorwise=False.
+    # TODO: Also test logit_scale with anchorwise=False.
     logit_scale = 10e16
     prediction_tensor = tf.constant([[[-100, 100, -100],
                                       [100, -100, -100],
@@ -584,8 +565,7 @@ class WeightedSoftmaxClassificationLossTest(tf.test.TestCase):
                                      [1, 0, 0]]], tf.float32)
     weights = tf.constant([[1, 1, 1, 1],
                            [1, 1, 1, 1]], tf.float32)
-    loss_op = losses.WeightedSoftmaxClassificationLoss(
-        anchorwise_output=True, logit_scale=logit_scale)
+    loss_op = losses.WeightedSoftmaxClassificationLoss(logit_scale=logit_scale)
     loss = loss_op(prediction_tensor, target_tensor, weights=weights)
     uniform_distribution_loss = - math.log(.33333333333)
...@@ -621,6 +601,7 @@ class BootstrappedSigmoidClassificationLossTest(tf.test.TestCase): ...@@ -621,6 +601,7 @@ class BootstrappedSigmoidClassificationLossTest(tf.test.TestCase):
loss_op = losses.BootstrappedSigmoidClassificationLoss( loss_op = losses.BootstrappedSigmoidClassificationLoss(
alpha, bootstrap_type='soft') alpha, bootstrap_type='soft')
loss = loss_op(prediction_tensor, target_tensor, weights=weights) loss = loss_op(prediction_tensor, target_tensor, weights=weights)
loss = tf.reduce_sum(loss)
exp_loss = -math.log(.5)
with self.test_session() as sess:
loss_output = sess.run(loss)
@@ -649,6 +630,7 @@ class BootstrappedSigmoidClassificationLossTest(tf.test.TestCase):
loss_op = losses.BootstrappedSigmoidClassificationLoss(
alpha, bootstrap_type='hard')
loss = loss_op(prediction_tensor, target_tensor, weights=weights)
loss = tf.reduce_sum(loss)
exp_loss = -math.log(.5)
with self.test_session() as sess:
loss_output = sess.run(loss)
@@ -675,9 +657,9 @@ class BootstrappedSigmoidClassificationLossTest(tf.test.TestCase):
[1, 1, 1, 0]], tf.float32)
alpha = tf.constant(.5, tf.float32)
loss_op = losses.BootstrappedSigmoidClassificationLoss(
-    alpha, bootstrap_type='hard', anchorwise_output=True)
+    alpha, bootstrap_type='hard')
loss = loss_op(prediction_tensor, target_tensor, weights=weights)
loss = tf.reduce_sum(loss, axis=2)
exp_loss = np.matrix([[0, 0, -math.log(.5), 0],
[-math.log(.5), 0, 0, 0]])
with self.test_session() as sess:
...
@@ -168,6 +168,34 @@ class Match(object):
def _reshape_and_cast(self, t):
return tf.cast(tf.reshape(t, [-1]), tf.int32)
def gather_based_on_match(self, input_tensor, unmatched_value,
ignored_value):
"""Gathers elements from `input_tensor` based on match results.
For columns that are matched to a row, gathered_tensor[col] is set to
input_tensor[match_results[col]]. For columns that are unmatched,
gathered_tensor[col] is set to unmatched_value. Finally, for columns that
are ignored gathered_tensor[col] is set to ignored_value.
Note that the input_tensor.shape[1:] must match with unmatched_value.shape
and ignored_value.shape
Args:
input_tensor: Tensor to gather values from.
unmatched_value: Constant tensor value for unmatched columns.
ignored_value: Constant tensor value for ignored columns.
Returns:
gathered_tensor: A tensor containing values gathered from input_tensor.
The shape of the gathered tensor is [match_results.shape[0]] +
input_tensor.shape[1:].
"""
input_tensor = tf.concat([tf.stack([ignored_value, unmatched_value]),
input_tensor], axis=0)
gather_indices = tf.maximum(self.match_results + 2, 0)
gathered_tensor = tf.gather(input_tensor, gather_indices)
return gathered_tensor
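The offset-gather trick above (prepend the ignored and unmatched values to the lookup table, then shift all match indices by 2) can be sketched in plain NumPy; the helper name here is illustrative, not part of the API:

```python
import numpy as np

def gather_based_on_match(match_results, input_tensor, unmatched_value,
                          ignored_value):
    """NumPy sketch of the offset-gather trick used above.

    match_results[col] is >= 0 for a matched column, -1 for unmatched and
    -2 for ignored. Prepending [ignored_value, unmatched_value] and shifting
    indices by +2 turns all three cases into a single gather.
    """
    table = np.concatenate([np.stack([ignored_value, unmatched_value]),
                            input_tensor], axis=0)
    gather_indices = np.maximum(match_results + 2, 0)
    return table[gather_indices]

# Mirrors test_scalar_gather_based_on_match below.
match_results = np.array([3, 1, -1, 0, -1, 5, -2])
input_tensor = np.array([0., 1., 2., 3., 4., 5., 6., 7.])
out = gather_based_on_match(match_results, input_tensor,
                            unmatched_value=np.float64(100.),
                            ignored_value=np.float64(200.))
# out == [3, 1, 100, 0, 100, 5, 200]
```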
class Matcher(object):
"""Abstract base class for matcher.
@@ -195,7 +223,7 @@ class Matcher(object):
@abstractmethod
def _match(self, similarity_matrix, **params):
-"""Method to be overriden by implementations.
+"""Method to be overridden by implementations.
Args:
similarity_matrix: Float tensor of shape [N, M] with pairwise similarity
...
@@ -20,7 +20,7 @@ import tensorflow as tf
from object_detection.core import matcher
-class AnchorMatcherTest(tf.test.TestCase):
+class MatchTest(tf.test.TestCase):
def test_get_correct_matched_columnIndices(self):
match_results = tf.constant([3, 1, -1, 0, -1, 5, -2])
@@ -145,6 +145,32 @@ class AnchorMatcherTest(tf.test.TestCase):
self.assertAllEqual(all_indices_sorted,
np.arange(num_matches, dtype=np.int32))
def test_scalar_gather_based_on_match(self):
match_results = tf.constant([3, 1, -1, 0, -1, 5, -2])
input_tensor = tf.constant([0, 1, 2, 3, 4, 5, 6, 7], dtype=tf.float32)
expected_gathered_tensor = [3, 1, 100, 0, 100, 5, 200]
match = matcher.Match(match_results)
gathered_tensor = match.gather_based_on_match(input_tensor,
unmatched_value=100.,
ignored_value=200.)
self.assertEqual(gathered_tensor.dtype, tf.float32)
with self.test_session():
gathered_tensor_out = gathered_tensor.eval()
self.assertAllEqual(expected_gathered_tensor, gathered_tensor_out)
def test_multidimensional_gather_based_on_match(self):
match_results = tf.constant([1, -1, -2])
input_tensor = tf.constant([[0, 0.5, 0, 0.5], [0, 0, 0.5, 0.5]],
dtype=tf.float32)
expected_gathered_tensor = [[0, 0, 0.5, 0.5], [0, 0, 0, 0], [0, 0, 0, 0]]
match = matcher.Match(match_results)
gathered_tensor = match.gather_based_on_match(input_tensor,
unmatched_value=tf.zeros(4),
ignored_value=tf.zeros(4))
self.assertEqual(gathered_tensor.dtype, tf.float32)
with self.test_session():
gathered_tensor_out = gathered_tensor.eval()
self.assertAllEqual(expected_gathered_tensor, gathered_tensor_out)
if __name__ == '__main__':
tf.test.main()
@@ -39,6 +39,17 @@ resize/reshaping necessary (see docstring for the preprocess function).
Output classes are always integers in the range [0, num_classes). Any mapping
of these integers to semantic labels is to be handled outside of this class.
Images are resized in the `preprocess` method. All of `preprocess`, `predict`,
and `postprocess` should be stateless.
The `preprocess` method runs `image_resizer_fn` that returns resized_images and
`true_image_shapes`. Since `image_resizer_fn` can pad the images with zeros,
true_image_shapes indicate the slices that contain the image without padding.
This is useful for padding images to be a fixed size for batching.
The `postprocess` method uses the true image shapes to clip predictions that lie
outside of images.
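A minimal NumPy sketch of the padding contract described above (the helper and its name are hypothetical; in the real pipeline this happens inside `image_resizer_fn`): the resized image may be zero-padded to a fixed size for batching, and `true_image_shapes` records the unpadded slice.

```python
import numpy as np

def pad_to_fixed_size(image, target_height, target_width):
    """Zero-pads a [h, w, c] image at the bottom/right and returns the padded
    image together with its true (unpadded) shape, as an image_resizer_fn
    might."""
    h, w, c = image.shape
    padded = np.zeros((target_height, target_width, c), image.dtype)
    padded[:h, :w, :] = image
    true_image_shape = np.array([h, w, c], np.int32)
    return padded, true_image_shape

image = np.ones((2, 3, 3), np.float32)
padded, true_shape = pad_to_fixed_size(image, 4, 4)
# padded is 4x4x3; true_shape == [2, 3, 3] marks the slice holding real pixels,
# which is what postprocess uses to clip detections to the actual image.
```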
By default, DetectionModels produce bounding box detections; however, we support
a handful of auxiliary annotations associated with each bounding box, namely,
instance masks and keypoints.
@@ -106,12 +117,12 @@ class DetectionModel(object):
This function is responsible for any scaling/shifting of input values that
is necessary prior to running the detector on an input image.
-It is also responsible for any resizing that might be necessary as images
-are assumed to arrive in arbitrary sizes. While this function could
-conceivably be part of the predict method (below), it is often convenient
-to keep these separate --- for example, we may want to preprocess on one
-device, place onto a queue, and let another device (e.g., the GPU) handle
-prediction.
+It is also responsible for any resizing, padding that might be necessary
+as images are assumed to arrive in arbitrary sizes. While this function
+could conceivably be part of the predict method (below), it is often
+convenient to keep these separate --- for example, we may want to preprocess
+on one device, place onto a queue, and let another device (e.g., the GPU)
+handle prediction.
A few important notes about the preprocess function:
+ We assume that this operation does not have any trainable variables nor
@@ -134,11 +145,15 @@
Returns:
preprocessed_inputs: a [batch, height_out, width_out, channels] float32
tensor representing a batch of images.
true_image_shapes: int32 tensor of shape [batch, 3] where each row is
of the form [height, width, channels] indicating the shapes
of true images in the resized images, as resized images can be padded
with zeros.
"""
pass
@abstractmethod
-def predict(self, preprocessed_inputs):
+def predict(self, preprocessed_inputs, true_image_shapes):
"""Predict prediction tensors from inputs tensor.
Outputs of this function can be passed to loss or postprocess functions.
@@ -146,6 +161,10 @@
Args:
preprocessed_inputs: a [batch, height, width, channels] float32 tensor
representing a batch of images.
true_image_shapes: int32 tensor of shape [batch, 3] where each row is
of the form [height, width, channels] indicating the shapes
of true images in the resized images, as resized images can be padded
with zeros.
Returns:
prediction_dict: a dictionary holding prediction tensors to be
@@ -154,7 +173,7 @@
pass
@abstractmethod
-def postprocess(self, prediction_dict, **params):
+def postprocess(self, prediction_dict, true_image_shapes, **params):
"""Convert predicted output tensors to final detections.
Outputs adhere to the following conventions:
@@ -172,6 +191,10 @@
Args:
prediction_dict: a dictionary holding prediction tensors.
true_image_shapes: int32 tensor of shape [batch, 3] where each row is
of the form [height, width, channels] indicating the shapes
of true images in the resized images, as resized images can be padded
with zeros.
**params: Additional keyword arguments for specific implementations of
DetectionModel.
@@ -190,7 +213,7 @@
pass
@abstractmethod
-def loss(self, prediction_dict):
+def loss(self, prediction_dict, true_image_shapes):
"""Compute scalar loss tensors with respect to provided groundtruth.
Calling this function requires that groundtruth tensors have been
@@ -198,6 +221,10 @@
Args:
prediction_dict: a dictionary holding predicted tensors
true_image_shapes: int32 tensor of shape [batch, 3] where each row is
of the form [height, width, channels] indicating the shapes
of true images in the resized images, as resized images can be padded
with zeros.
Returns:
a dictionary mapping strings (loss names) to scalar tensors representing
...
@@ -20,6 +20,7 @@ import tensorflow as tf
from object_detection.core import box_list
from object_detection.core import box_list_ops
from object_detection.core import standard_fields as fields
from object_detection.utils import shape_utils
def multiclass_non_max_suppression(boxes,
@@ -31,6 +32,7 @@ def multiclass_non_max_suppression(boxes,
clip_window=None,
change_coordinate_frame=False,
masks=None,
boundaries=None,
additional_fields=None,
scope=None):
"""Multi-class version of non maximum suppression.
@@ -66,6 +68,9 @@ def multiclass_non_max_suppression(boxes,
masks: (optional) a [k, q, mask_height, mask_width] float32 tensor
containing box masks. `q` can be either number of classes or 1 depending
on whether a separate mask is predicted per class.
boundaries: (optional) a [k, q, boundary_height, boundary_width] float32
tensor containing box boundaries. `q` can be either number of classes or 1
depending on whether a separate boundary is predicted per class.
additional_fields: (optional) If not None, a dictionary that maps keys to
tensors whose first dimensions are all of size `k`. After non-maximum
suppression, all tensors corresponding to the selected boxes will be
@@ -114,6 +119,8 @@ def multiclass_non_max_suppression(boxes,
per_class_boxes_list = tf.unstack(boxes, axis=1)
if masks is not None:
per_class_masks_list = tf.unstack(masks, axis=1)
if boundaries is not None:
per_class_boundaries_list = tf.unstack(boundaries, axis=1)
boxes_ids = (range(num_classes) if len(per_class_boxes_list) > 1
else [0] * num_classes)
for class_idx, boxes_idx in zip(range(num_classes), boxes_ids):
@@ -128,6 +135,10 @@
per_class_masks = per_class_masks_list[boxes_idx]
boxlist_and_class_scores.add_field(fields.BoxListFields.masks,
per_class_masks)
if boundaries is not None:
per_class_boundaries = per_class_boundaries_list[boxes_idx]
boxlist_and_class_scores.add_field(fields.BoxListFields.boundaries,
per_class_boundaries)
if additional_fields is not None:
for key, tensor in additional_fields.items():
boxlist_and_class_scores.add_field(key, tensor)
@@ -194,9 +205,12 @@ def batch_multiclass_non_max_suppression(boxes,
max_size_per_class: maximum number of retained boxes per class.
max_total_size: maximum number of boxes retained over all classes. By
default returns all boxes retained after capping boxes per class.
-clip_window: A float32 tensor of the form [y_min, x_min, y_max, x_max]
-representing the window to clip boxes to before performing non-max
-suppression.
+clip_window: A float32 tensor of shape [batch_size, 4] where each entry is
+of the form [y_min, x_min, y_max, x_max] representing the window to clip
+boxes to before performing non-max suppression. This argument can also be
+a tensor of shape [4], in which case the same clip window is applied to
+all images in the batch. If clip_window is None, all boxes are used to
+perform non-max suppression.
change_coordinate_frame: Whether to normalize coordinates after clipping
relative to clip_window (this can only be set to True if a clip_window
is provided)
@@ -242,7 +256,9 @@ def batch_multiclass_non_max_suppression(boxes,
if q != 1 and q != num_classes:
raise ValueError('third dimension of boxes must be either 1 or equal '
'to the third dimension of scores')
if change_coordinate_frame and clip_window is None:
raise ValueError('if change_coordinate_frame is True, then a clip_window '
'must be specified.')
original_masks = masks
original_additional_fields = additional_fields
with tf.name_scope(scope, 'BatchMultiClassNonMaxSuppression'):
@@ -266,6 +282,16 @@
masks_shape = tf.stack([batch_size, num_anchors, 1, 0, 0])
masks = tf.zeros(masks_shape)
if clip_window is None:
clip_window = tf.stack([
tf.reduce_min(boxes[:, :, :, 0]),
tf.reduce_min(boxes[:, :, :, 1]),
tf.reduce_max(boxes[:, :, :, 2]),
tf.reduce_max(boxes[:, :, :, 3])
])
if clip_window.shape.ndims == 1:
clip_window = tf.tile(tf.expand_dims(clip_window, 0), [batch_size, 1])
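The default-and-tile logic above can be sketched in NumPy (function name illustrative): when `clip_window` is None it falls back to the extreme box coordinates, and a rank-1 `[4]` window is tiled to one window per image in the batch.

```python
import numpy as np

def resolve_clip_window(clip_window, boxes, batch_size):
    """Sketch of the clip_window handling above: default to the extreme box
    coordinates when None, and tile a rank-1 [4] window to [batch_size, 4]."""
    if clip_window is None:
        clip_window = np.array([boxes[..., 0].min(), boxes[..., 1].min(),
                                boxes[..., 2].max(), boxes[..., 3].max()])
    if clip_window.ndim == 1:
        clip_window = np.tile(clip_window[np.newaxis, :], (batch_size, 1))
    return clip_window

# boxes has shape [batch, num_anchors, q, 4].
boxes = np.array([[[[0., 0., 1., 1.]], [[0., 2., 3., 5.]]]])
default = resolve_clip_window(None, boxes, batch_size=1)
# default == [[0, 0, 3, 5]] -- the extreme corners, so nothing is clipped.
shared = resolve_clip_window(np.array([0., 0., 200., 200.]), boxes,
                             batch_size=2)
# shared has shape (2, 4): the same window repeated for every image.
```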
if additional_fields is None:
additional_fields = {}
@@ -283,6 +309,9 @@
per_image_masks - A [num_anchors, q, mask_height, mask_width] float32
tensor containing box masks. `q` can be either number of classes
or 1 depending on whether a separate mask is predicted per class.
per_image_clip_window - A 1D float32 tensor of the form
[ymin, xmin, ymax, xmax] representing the window to clip the boxes
to.
per_image_additional_fields - (optional) A variable number of float32
tensors each with size [num_anchors, ...].
per_image_num_valid_boxes - A tensor of type `int32`. A 1-D tensor of
@@ -311,9 +340,10 @@
per_image_boxes = args[0]
per_image_scores = args[1]
per_image_masks = args[2]
per_image_clip_window = args[3]
per_image_additional_fields = {
key: value
-for key, value in zip(additional_fields, args[3:-1])
+for key, value in zip(additional_fields, args[4:-1])
}
per_image_num_valid_boxes = args[-1]
per_image_boxes = tf.reshape(
@@ -345,7 +375,7 @@
iou_thresh,
max_size_per_class,
max_total_size,
-clip_window=clip_window,
+clip_window=per_image_clip_window,
change_coordinate_frame=change_coordinate_frame,
masks=per_image_masks,
additional_fields=per_image_additional_fields)
@@ -367,10 +397,10 @@
num_additional_fields = len(additional_fields)
num_nmsed_outputs = 4 + num_additional_fields
-batch_outputs = tf.map_fn(
+batch_outputs = shape_utils.static_or_dynamic_map_fn(
_single_image_nms_fn,
-elems=([boxes, scores, masks] + list(additional_fields.values()) +
-[num_valid_boxes]),
+elems=([boxes, scores, masks, clip_window] +
+list(additional_fields.values()) + [num_valid_boxes]),
dtype=(num_nmsed_outputs * [tf.float32] + [tf.int32]),
parallel_iterations=parallel_iterations)
...
@@ -571,6 +571,125 @@ class MulticlassNonMaxSuppressionTest(tf.test.TestCase):
self.assertAllClose(nmsed_classes, exp_nms_classes)
self.assertAllClose(num_detections, [2, 3])
def test_batch_multiclass_nms_with_per_batch_clip_window(self):
boxes = tf.constant([[[[0, 0, 1, 1], [0, 0, 4, 5]],
[[0, 0.1, 1, 1.1], [0, 0.1, 2, 1.1]],
[[0, -0.1, 1, 0.9], [0, -0.1, 1, 0.9]],
[[0, 10, 1, 11], [0, 10, 1, 11]]],
[[[0, 10.1, 1, 11.1], [0, 10.1, 1, 11.1]],
[[0, 100, 1, 101], [0, 100, 1, 101]],
[[0, 1000, 1, 1002], [0, 999, 2, 1004]],
[[0, 1000, 1, 1002.1], [0, 999, 2, 1002.7]]]],
tf.float32)
scores = tf.constant([[[.9, 0.01], [.75, 0.05],
[.6, 0.01], [.95, 0]],
[[.5, 0.01], [.3, 0.01],
[.01, .85], [.01, .5]]])
clip_window = tf.constant([0., 0., 200., 200.])
score_thresh = 0.1
iou_thresh = .5
max_output_size = 4
exp_nms_corners = np.array([[[0, 10, 1, 11],
[0, 0, 1, 1],
[0, 0, 0, 0],
[0, 0, 0, 0]],
[[0, 10.1, 1, 11.1],
[0, 100, 1, 101],
[0, 0, 0, 0],
[0, 0, 0, 0]]])
exp_nms_scores = np.array([[.95, .9, 0, 0],
[.5, .3, 0, 0]])
exp_nms_classes = np.array([[0, 0, 0, 0],
[0, 0, 0, 0]])
(nmsed_boxes, nmsed_scores, nmsed_classes, nmsed_masks,
nmsed_additional_fields, num_detections
) = post_processing.batch_multiclass_non_max_suppression(
boxes, scores, score_thresh, iou_thresh,
max_size_per_class=max_output_size, max_total_size=max_output_size,
clip_window=clip_window)
self.assertIsNone(nmsed_masks)
self.assertIsNone(nmsed_additional_fields)
# Check static shapes
self.assertAllEqual(nmsed_boxes.shape.as_list(),
exp_nms_corners.shape)
self.assertAllEqual(nmsed_scores.shape.as_list(),
exp_nms_scores.shape)
self.assertAllEqual(nmsed_classes.shape.as_list(),
exp_nms_classes.shape)
self.assertEqual(num_detections.shape.as_list(), [2])
with self.test_session() as sess:
(nmsed_boxes, nmsed_scores, nmsed_classes,
num_detections) = sess.run([nmsed_boxes, nmsed_scores, nmsed_classes,
num_detections])
self.assertAllClose(nmsed_boxes, exp_nms_corners)
self.assertAllClose(nmsed_scores, exp_nms_scores)
self.assertAllClose(nmsed_classes, exp_nms_classes)
self.assertAllClose(num_detections, [2, 2])
def test_batch_multiclass_nms_with_per_image_clip_window(self):
boxes = tf.constant([[[[0, 0, 1, 1], [0, 0, 4, 5]],
[[0, 0.1, 1, 1.1], [0, 0.1, 2, 1.1]],
[[0, -0.1, 1, 0.9], [0, -0.1, 1, 0.9]],
[[0, 10, 1, 11], [0, 10, 1, 11]]],
[[[0, 10.1, 1, 11.1], [0, 10.1, 1, 11.1]],
[[0, 100, 1, 101], [0, 100, 1, 101]],
[[0, 1000, 1, 1002], [0, 999, 2, 1004]],
[[0, 1000, 1, 1002.1], [0, 999, 2, 1002.7]]]],
tf.float32)
scores = tf.constant([[[.9, 0.01], [.75, 0.05],
[.6, 0.01], [.95, 0]],
[[.5, 0.01], [.3, 0.01],
[.01, .85], [.01, .5]]])
clip_window = tf.constant([[0., 0., 5., 5.],
[0., 0., 200., 200.]])
score_thresh = 0.1
iou_thresh = .5
max_output_size = 4
exp_nms_corners = np.array([[[0, 0, 1, 1],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]],
[[0, 10.1, 1, 11.1],
[0, 100, 1, 101],
[0, 0, 0, 0],
[0, 0, 0, 0]]])
exp_nms_scores = np.array([[.9, 0., 0., 0.],
[.5, .3, 0, 0]])
exp_nms_classes = np.array([[0, 0, 0, 0],
[0, 0, 0, 0]])
(nmsed_boxes, nmsed_scores, nmsed_classes, nmsed_masks,
nmsed_additional_fields, num_detections
) = post_processing.batch_multiclass_non_max_suppression(
boxes, scores, score_thresh, iou_thresh,
max_size_per_class=max_output_size, max_total_size=max_output_size,
clip_window=clip_window)
self.assertIsNone(nmsed_masks)
self.assertIsNone(nmsed_additional_fields)
# Check static shapes
self.assertAllEqual(nmsed_boxes.shape.as_list(),
exp_nms_corners.shape)
self.assertAllEqual(nmsed_scores.shape.as_list(),
exp_nms_scores.shape)
self.assertAllEqual(nmsed_classes.shape.as_list(),
exp_nms_classes.shape)
self.assertEqual(num_detections.shape.as_list(), [2])
with self.test_session() as sess:
(nmsed_boxes, nmsed_scores, nmsed_classes,
num_detections) = sess.run([nmsed_boxes, nmsed_scores, nmsed_classes,
num_detections])
self.assertAllClose(nmsed_boxes, exp_nms_corners)
self.assertAllClose(nmsed_scores, exp_nms_scores)
self.assertAllClose(nmsed_classes, exp_nms_classes)
self.assertAllClose(num_detections, [1, 2])
def test_batch_multiclass_nms_with_masks(self):
boxes = tf.constant([[[[0, 0, 1, 1], [0, 0, 4, 5]],
[[0, 0.1, 1, 1.1], [0, 0.1, 2, 1.1]],
...
@@ -51,6 +51,7 @@ from object_detection.core import box_list
from object_detection.core import box_list_ops
from object_detection.core import keypoint_ops
from object_detection.core import standard_fields as fields
from object_detection.utils import shape_utils
def _apply_with_random_selector(x, func, num_cases):
@@ -1647,6 +1648,7 @@ def _compute_new_static_size(image, min_dimension, max_dimension):
image_shape = image.get_shape().as_list()
orig_height = image_shape[0]
orig_width = image_shape[1]
num_channels = image_shape[2]
orig_min_dim = min(orig_height, orig_width)
# Calculates the larger of the possible sizes
large_scale_factor = min_dimension / float(orig_min_dim)
@@ -1674,7 +1676,7 @@ def _compute_new_static_size(image, min_dimension, max_dimension):
new_size = small_size
else:
new_size = large_size
-return tf.constant(new_size)
+return tf.constant(new_size + [num_channels])
def _compute_new_dynamic_size(image, min_dimension, max_dimension):
@@ -1682,6 +1684,7 @@ def _compute_new_dynamic_size(image, min_dimension, max_dimension):
image_shape = tf.shape(image)
orig_height = tf.to_float(image_shape[0])
orig_width = tf.to_float(image_shape[1])
num_channels = image_shape[2]
orig_min_dim = tf.minimum(orig_height, orig_width)
# Calculates the larger of the possible sizes
min_dimension = tf.constant(min_dimension, dtype=tf.float32)
@@ -1711,7 +1714,7 @@ def _compute_new_dynamic_size(image, min_dimension, max_dimension):
lambda: small_size, lambda: large_size)
else:
new_size = large_size
-return new_size
+return tf.stack(tf.unstack(new_size) + [num_channels])
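The keep-aspect-ratio size computation performed by `_compute_new_static_size` can be sketched in plain Python (the helper name and rounding details here are illustrative): scale so the short side reaches `min_dimension`, unless that pushes the long side past `max_dimension`, in which case cap the long side instead.

```python
def compute_new_size(orig_height, orig_width, min_dimension, max_dimension):
    """Plain-Python sketch of the resize-to-range logic above."""
    # Candidate 1: scale so the smaller dimension hits min_dimension.
    orig_min_dim = min(orig_height, orig_width)
    large_scale = min_dimension / float(orig_min_dim)
    large_size = [int(round(orig_height * large_scale)),
                  int(round(orig_width * large_scale))]
    # Candidate 2: scale so the larger dimension hits max_dimension.
    orig_max_dim = max(orig_height, orig_width)
    small_scale = max_dimension / float(orig_max_dim)
    small_size = [int(round(orig_height * small_scale)),
                  int(round(orig_width * small_scale))]
    # Prefer candidate 1 unless it overshoots the max dimension.
    if max(large_size) > max_dimension:
        return small_size
    return large_size

# A 400x1000 image with min=600, max=1024: scaling the short side to 600
# would make the long side 1500 > 1024, so the long side is capped instead.
size = compute_new_size(400, 1000, 600, 1024)
```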
def resize_to_range(image,
@@ -1719,7 +1722,8 @@ def resize_to_range(image,
min_dimension=None,
max_dimension=None,
method=tf.image.ResizeMethod.BILINEAR,
-align_corners=False):
+align_corners=False,
+pad_to_max_dimension=False):
"""Resizes an image so its dimensions are within the provided value.
The output size can be described by two cases:
@@ -1740,15 +1744,22 @@ def resize_to_range(image,
BILINEAR.
align_corners: bool. If true, exactly align all 4 corners of the input
and output. Defaults to False.
pad_to_max_dimension: Whether to resize the image and pad it with zeros
so the resulting image is of the spatial size
[max_dimension, max_dimension]. If masks are included they are padded
similarly.
Returns:
-A 3D tensor of shape [new_height, new_width, channels],
-where the image has been resized (with bilinear interpolation) so that
-min(new_height, new_width) == min_dimension or
-max(new_height, new_width) == max_dimension.
-If masks is not None, also outputs masks:
-A 3D tensor of shape [num_instances, new_height, new_width]
+Note that the position of the resized_image_shape changes based on whether
+masks are present.
+resized_image: A 3D tensor of shape [new_height, new_width, channels],
+where the image has been resized (with bilinear interpolation) so that
+min(new_height, new_width) == min_dimension or
+max(new_height, new_width) == max_dimension.
+resized_masks: If masks is not None, also outputs masks. A 3D tensor of
+shape [num_instances, new_height, new_width].
+resized_image_shape: A 1D tensor of shape [3] containing shape of the
+resized image.
Raises:
ValueError: if the image is not a 3D tensor.
...@@ -1762,16 +1773,27 @@ def resize_to_range(image,
     else:
       new_size = _compute_new_dynamic_size(image, min_dimension, max_dimension)
     new_image = tf.image.resize_images(
-        image, new_size, method=method, align_corners=align_corners)
+        image, new_size[:-1], method=method, align_corners=align_corners)
+
+    if pad_to_max_dimension:
+      new_image = tf.image.pad_to_bounding_box(
+          new_image, 0, 0, max_dimension, max_dimension)

-    result = new_image
+    result = [new_image]
     if masks is not None:
       new_masks = tf.expand_dims(masks, 3)
-      new_masks = tf.image.resize_nearest_neighbor(
-          new_masks, new_size, align_corners=align_corners)
+      new_masks = tf.image.resize_images(
+          new_masks,
+          new_size[:-1],
+          method=tf.image.ResizeMethod.NEAREST_NEIGHBOR,
+          align_corners=align_corners)
       new_masks = tf.squeeze(new_masks, 3)
-      result = [new_image, new_masks]
+      if pad_to_max_dimension:
+        new_masks = tf.image.pad_to_bounding_box(
+            new_masks, 0, 0, max_dimension, max_dimension)
+      result.append(new_masks)
+
+    result.append(new_size)

     return result
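The two-case size logic described in the docstring above (scale so the short side reaches `min_dimension`, unless that would push the long side past `max_dimension`) can be sketched in plain Python. This is a hypothetical standalone helper for illustration, not the actual `_compute_new_dynamic_size` implementation:

```python
def compute_new_size(height, width, min_dimension, max_dimension):
    """Sketch of the resize_to_range target-size computation.

    Case 1: scale so min(new_height, new_width) == min_dimension.
    Case 2: if that would make the long side exceed max_dimension,
    scale so max(new_height, new_width) == max_dimension instead.
    """
    scale = float(min_dimension) / min(height, width)
    if max(height, width) * scale > max_dimension:
        scale = float(max_dimension) / max(height, width)
    return int(round(height * scale)), int(round(width * scale))

# Case 1: short side reaches min_dimension without violating the max.
print(compute_new_size(100, 200, 50, 1000))  # (50, 100)
# Case 2: long side would exceed max_dimension, so it is clamped instead.
print(compute_new_size(100, 400, 50, 100))   # (25, 100)
```

Both cases preserve the aspect ratio; with `pad_to_max_dimension` the result is then zero-padded to a square `[max_dimension, max_dimension]` canvas.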
...@@ -1789,10 +1811,13 @@ def resize_to_min_dimension(image, masks=None, min_dimension=600):
     min_dimension: minimum image dimension.

   Returns:
-    a tuple containing the following:
-      Resized image. A tensor of size [new_height, new_width, channels].
-      (optional) Resized masks. A tensor of
-        size [num_instances, new_height, new_width].
+    Note that the position of the resized_image_shape changes based on whether
+    masks are present.
+    resized_image: A tensor of size [new_height, new_width, channels].
+    resized_masks: If masks is not None, also outputs masks. A 3D tensor of
+      shape [num_instances, new_height, new_width].
+    resized_image_shape: A 1D tensor of shape [3] containing the shape of the
+      resized image.

   Raises:
     ValueError: if the image is not a 3D tensor.
...@@ -1803,6 +1828,7 @@ def resize_to_min_dimension(image, masks=None, min_dimension=600):
   with tf.name_scope('ResizeGivenMinDimension', values=[image, min_dimension]):
     image_height = tf.shape(image)[0]
     image_width = tf.shape(image)[1]
+    num_channels = tf.shape(image)[2]
     min_image_dimension = tf.minimum(image_height, image_width)
     min_target_dimension = tf.maximum(min_image_dimension, min_dimension)
     target_ratio = tf.to_float(min_target_dimension) / tf.to_float(
...@@ -1813,13 +1839,16 @@ def resize_to_min_dimension(image, masks=None, min_dimension=600):
         tf.expand_dims(image, axis=0),
         size=[target_height, target_width],
         align_corners=True)
-    result = tf.squeeze(image, axis=0)
+    result = [tf.squeeze(image, axis=0)]
     if masks is not None:
       masks = tf.image.resize_nearest_neighbor(
           tf.expand_dims(masks, axis=3),
           size=[target_height, target_width],
           align_corners=True)
-      result = (result, tf.squeeze(masks, axis=3))
+      result.append(tf.squeeze(masks, axis=3))
+
+    result.append(tf.stack([target_height, target_width, num_channels]))

     return result
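The target-ratio math in resize_to_min_dimension can likewise be sketched without TensorFlow. This is an illustrative helper under the assumption that rounding matches the resize op; it is not the actual code:

```python
def min_dimension_target_size(height, width, min_dimension=600):
    """Sketch of the resize_to_min_dimension target-size math.

    The image is only ever upscaled: if the smaller side is already at
    least min_dimension, the ratio is 1 and the size is unchanged.
    """
    min_image_dimension = min(height, width)
    min_target_dimension = max(min_image_dimension, min_dimension)
    target_ratio = float(min_target_dimension) / min_image_dimension
    return int(height * target_ratio), int(width * target_ratio)

print(min_dimension_target_size(300, 400))  # (600, 800): upscaled 2x
print(min_dimension_target_size(700, 900))  # (700, 900): already large enough
```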
...@@ -1854,6 +1883,8 @@ def scale_boxes_to_pixel_coordinates(image, boxes, keypoints=None):
     return tuple(result)


+# TODO: Investigate if instead the function should return None if
+# masks is None.
 # pylint: disable=g-doc-return-or-yield
 def resize_image(image,
                  masks=None,
...@@ -1861,7 +1892,28 @@ def resize_image(image,
                  new_width=1024,
                  method=tf.image.ResizeMethod.BILINEAR,
                  align_corners=False):
"""See `tf.image.resize_images` for detailed doc.""" """Resizes images to the given height and width.
Args:
image: A 3D tensor of shape [height, width, channels]
masks: (optional) rank 3 float32 tensor with shape
[num_instances, height, width] containing instance masks.
new_height: (optional) (scalar) desired height of the image.
new_width: (optional) (scalar) desired width of the image.
method: (optional) interpolation method used in resizing. Defaults to
BILINEAR.
align_corners: bool. If true, exactly align all 4 corners of the input
and output. Defaults to False.
Returns:
Note that the position of the resized_image_shape changes based on whether
masks are present.
resized_image: A tensor of size [new_height, new_width, channels].
resized_masks: If masks is not None, also outputs masks. A 3D tensor of
shape [num_instances, new_height, new_width]
resized_image_shape: A 1D tensor of shape [3] containing the shape of the
resized image.
"""
with tf.name_scope( with tf.name_scope(
'ResizeImage', 'ResizeImage',
values=[image, new_height, new_width, method, align_corners]): values=[image, new_height, new_width, method, align_corners]):
...@@ -1869,7 +1921,8 @@ def resize_image(image,
         image, [new_height, new_width],
         method=method,
         align_corners=align_corners)
-    result = new_image
+    image_shape = shape_utils.combined_static_and_dynamic_shape(image)
+    result = [new_image]
     if masks is not None:
       num_instances = tf.shape(masks)[0]
       new_size = tf.constant([new_height, new_width], dtype=tf.int32)
...@@ -1886,8 +1939,9 @@ def resize_image(image,
       masks = tf.cond(num_instances > 0, resize_masks_branch,
                       reshape_masks_branch)
-      result = [new_image, masks]
+      result.append(masks)
+
+    result.append(tf.stack([new_height, new_width, image_shape[2]]))

     return result
...
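A consequence of the list-based return value introduced in these resize functions is that callers unpack a different arity depending on whether masks were passed, with the image-shape entry always last. A minimal mock of that convention (hypothetical, no TensorFlow), assuming 3-channel images:

```python
def resize_like(image, masks=None, new_height=600, new_width=1024):
    """Mock of the [image, (masks,) shape] return convention."""
    result = [image]
    if masks is not None:
        result.append(masks)
    # The shape entry is always appended last, so its index shifts
    # from result[1] to result[2] when masks are present.
    result.append([new_height, new_width, 3])
    return result

out_image, out_shape = resize_like("img")
out_image, out_masks, out_shape = resize_like("img", masks="msk")
print(out_shape)  # [600, 1024, 3]
```

This is why the test updates change `out_image, out_masks = ...` to `out_image, out_masks, _ = ...`.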
...@@ -1853,7 +1853,7 @@ class PreprocessorTest(tf.test.TestCase):
                                        expected_masks_shape_list):
       in_image = tf.random_uniform(in_image_shape)
       in_masks = tf.random_uniform(in_masks_shape)
-      out_image, out_masks = preprocessor.resize_image(
+      out_image, out_masks, _ = preprocessor.resize_image(
           in_image, in_masks, new_height=height, new_width=width)
       out_image_shape = tf.shape(out_image)
       out_masks_shape = tf.shape(out_masks)
...@@ -1880,7 +1880,7 @@ class PreprocessorTest(tf.test.TestCase):
                                        expected_masks_shape_list):
       in_image = tf.random_uniform(in_image_shape)
       in_masks = tf.random_uniform(in_masks_shape)
-      out_image, out_masks = preprocessor.resize_image(
+      out_image, out_masks, _ = preprocessor.resize_image(
           in_image, in_masks, new_height=height, new_width=width)
       out_image_shape = tf.shape(out_image)
       out_masks_shape = tf.shape(out_masks)
...@@ -1900,7 +1900,7 @@ class PreprocessorTest(tf.test.TestCase):
     for in_shape, expected_shape in zip(in_shape_list, expected_shape_list):
       in_image = tf.random_uniform(in_shape)
-      out_image = preprocessor.resize_to_range(
+      out_image, _ = preprocessor.resize_to_range(
           in_image, min_dimension=min_dim, max_dimension=max_dim)
       self.assertAllEqual(out_image.get_shape().as_list(), expected_shape)
...@@ -1913,7 +1913,7 @@ class PreprocessorTest(tf.test.TestCase):
     for in_shape, expected_shape in zip(in_shape_list, expected_shape_list):
       in_image = tf.placeholder(tf.float32, shape=(None, None, 3))
-      out_image = preprocessor.resize_to_range(
+      out_image, _ = preprocessor.resize_to_range(
           in_image, min_dimension=min_dim, max_dimension=max_dim)
       out_image_shape = tf.shape(out_image)
       with self.test_session() as sess:
...@@ -1938,7 +1938,7 @@ class PreprocessorTest(tf.test.TestCase):
                                        expected_masks_shape_list):
       in_image = tf.random_uniform(in_image_shape)
       in_masks = tf.random_uniform(in_masks_shape)
-      out_image, out_masks = preprocessor.resize_to_range(
+      out_image, out_masks, _ = preprocessor.resize_to_range(
           in_image, in_masks, min_dimension=min_dim, max_dimension=max_dim)
       self.assertAllEqual(out_masks.get_shape().as_list(), expected_mask_shape)
       self.assertAllEqual(out_image.get_shape().as_list(), expected_image_shape)
...@@ -1960,7 +1960,7 @@ class PreprocessorTest(tf.test.TestCase):
       in_image = tf.placeholder(tf.float32, shape=(None, None, 3))
       in_masks = tf.placeholder(tf.float32, shape=(None, None, None))
       in_masks = tf.random_uniform(in_masks_shape)
-      out_image, out_masks = preprocessor.resize_to_range(
+      out_image, out_masks, _ = preprocessor.resize_to_range(
           in_image, in_masks, min_dimension=min_dim, max_dimension=max_dim)
       out_image_shape = tf.shape(out_image)
       out_masks_shape = tf.shape(out_masks)
...@@ -1991,7 +1991,7 @@ class PreprocessorTest(tf.test.TestCase):
                                        expected_masks_shape_list):
       in_image = tf.random_uniform(in_image_shape)
       in_masks = tf.random_uniform(in_masks_shape)
-      out_image, out_masks = preprocessor.resize_to_range(
+      out_image, out_masks, _ = preprocessor.resize_to_range(
           in_image, in_masks, min_dimension=min_dim, max_dimension=max_dim)
       out_image_shape = tf.shape(out_image)
       out_masks_shape = tf.shape(out_masks)
...@@ -2016,7 +2016,7 @@ class PreprocessorTest(tf.test.TestCase):
     for in_shape, expected_shape in zip(in_shape_list, expected_shape_list):
       in_image = tf.random_uniform(in_shape)
-      out_image = preprocessor.resize_to_range(
+      out_image, _ = preprocessor.resize_to_range(
           in_image, min_dimension=min_dim, max_dimension=max_dim)
       out_image_shape = tf.shape(out_image)
...@@ -2039,7 +2039,7 @@ class PreprocessorTest(tf.test.TestCase):
       in_image = tf.placeholder(tf.float32, shape=(None, None, 3))
       in_masks = tf.placeholder(tf.float32, shape=(None, None, None))
       in_masks = tf.random_uniform(in_masks_shape)
-      out_image, out_masks = preprocessor.resize_to_min_dimension(
+      out_image, out_masks, _ = preprocessor.resize_to_min_dimension(
           in_image, in_masks, min_dimension=min_dim)
       out_image_shape = tf.shape(out_image)
       out_masks_shape = tf.shape(out_masks)
...@@ -2069,7 +2069,7 @@ class PreprocessorTest(tf.test.TestCase):
                                        expected_masks_shape_list):
       in_image = tf.random_uniform(in_image_shape)
       in_masks = tf.random_uniform(in_masks_shape)
-      out_image, out_masks = preprocessor.resize_to_min_dimension(
+      out_image, out_masks, _ = preprocessor.resize_to_min_dimension(
           in_image, in_masks, min_dimension=min_dim)
       out_image_shape = tf.shape(out_image)
       out_masks_shape = tf.shape(out_masks)
...
...@@ -57,6 +57,7 @@ class InputDataFields(object):
     groundtruth_keypoints: ground truth keypoints.
     groundtruth_keypoint_visibilities: ground truth keypoint visibilities.
     groundtruth_label_scores: groundtruth label scores.
+    groundtruth_weights: groundtruth weight factor for bounding boxes.
   """
   image = 'image'
   original_image = 'original_image'
...@@ -79,10 +80,11 @@ class InputDataFields(object):
   groundtruth_keypoints = 'groundtruth_keypoints'
   groundtruth_keypoint_visibilities = 'groundtruth_keypoint_visibilities'
   groundtruth_label_scores = 'groundtruth_label_scores'
+  groundtruth_weights = 'groundtruth_weights'


 class DetectionResultFields(object):
-  """Naming converntions for storing the output of the detector.
+  """Naming conventions for storing the output of the detector.

   Attributes:
     source_id: source of the original image.
...@@ -162,6 +164,7 @@ class TfExampleFields(object):
     object_is_crowd: [DEPRECATED, use object_group_of instead]
       is the object a single object or a crowd
     object_segment_area: the area of the segment.
+    object_weight: a weight factor for the object's bounding box.
     instance_masks: instance segmentation masks.
     instance_boundaries: instance boundaries.
     instance_classes: Classes for each instance segmentation mask.
...@@ -194,6 +197,7 @@ class TfExampleFields(object):
   object_depiction = 'image/object/depiction'
   object_is_crowd = 'image/object/is_crowd'
   object_segment_area = 'image/object/segment/area'
+  object_weight = 'image/object/weight'
   instance_masks = 'image/segmentation/object'
   instance_boundaries = 'image/boundaries/object'
   instance_classes = 'image/segmentation/object/class'
...
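The new `groundtruth_weights` field keys a per-box weight tensor that runs parallel to `groundtruth_boxes`, which is what lets the SSD meta architecture use regression weights in its loss normalizer. A minimal sketch of how such field classes key a tensor dictionary, using simplified stand-in classes with only the fields relevant to this change (not the full module):

```python
class InputDataFields(object):
    """Simplified stand-in for the field-name class above."""
    groundtruth_boxes = 'groundtruth_boxes'
    groundtruth_weights = 'groundtruth_weights'

# One weight per groundtruth box, e.g. to down-weight uncertain annotations.
tensor_dict = {
    InputDataFields.groundtruth_boxes: [[0.0, 0.0, 0.5, 0.5],
                                        [0.1, 0.1, 1.0, 1.0]],
    InputDataFields.groundtruth_weights: [1.0, 0.3],
}
assert (len(tensor_dict[InputDataFields.groundtruth_boxes]) ==
        len(tensor_dict[InputDataFields.groundtruth_weights]))
```

Keeping the string constants on a class rather than scattering literals is what allows a field rename to happen in one place.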