1. 13 Aug, 2021 1 commit
    • Reduce number of parameter groups to make optimizer more efficient · 737d099b
      Valentin Andrei authored
      Summary:
      `torch.optim._multi_tensor` provides faster optimizer implementations because it uses foreach APIs. We can enable it by changing `OPTIMIZER: "ADAMW"` to `OPTIMIZER: "ADAMW_MT"` in the config file.
      
      To benefit from the speedup, we need to reduce the number of parameter groups, as suggested in this post: https://fb.workplace.com/groups/1405155842844877/permalink/4971600462867046/
      
      The current implementation uses one parameter group per parameter, which is not optimal. The proposed change groups parameters by their (learning rate, weight decay) combination.
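
      For illustration, a minimal sketch of the grouping idea (the function and variable names here are hypothetical, not the actual d2go implementation):

      ```
      from collections import defaultdict

      def merge_param_groups(per_param_groups):
          # Merge one-group-per-parameter lists into one group per (lr, weight_decay),
          # so the multi-tensor optimizer can apply foreach kernels to larger groups.
          merged = defaultdict(list)
          for group in per_param_groups:
              merged[(group["lr"], group["weight_decay"])].extend(group["params"])
          return [
              {"params": params, "lr": lr, "weight_decay": wd}
              for (lr, wd), params in merged.items()
          ]
      ```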
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D30272112
      
      fbshipit-source-id: d8d24298a59b52c2fc2930f7d614a0c6380a432f
  2. 11 Aug, 2021 3 commits
  3. 06 Aug, 2021 2 commits
  4. 05 Aug, 2021 2 commits
    • Clarifying the use of do_test function · 610d2d03
      Abduallah Mohamed authored
      Summary: The `do_test` method can be used to perform testing outside the training process. One might expect it to load model weights before testing, as the `do_train` method does, but it does not. This diff adds a comment to clarify this.
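
      For illustration, a hedged sketch of what a caller is expected to do (the exact `do_test` call shape varies by runner; `runner`, `cfg`, and `model` are assumed to exist):

      ```
      from detectron2.checkpoint import DetectionCheckpointer

      # do_test does NOT load checkpoint weights for you (unlike the do_train path),
      # so load them explicitly before running evaluation.
      DetectionCheckpointer(model).load(cfg.MODEL.WEIGHTS)
      results = runner.do_test(cfg, model)
      ```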
      
      Reviewed By: ppwwyyxx
      
      Differential Revision: D29082338
      
      fbshipit-source-id: 6ec7d7f7f243503414fa904f4eb8856e62e9ed6d
    • avoid warnings of NCCL · 30d5ca55
      Yuxin Wu authored
      Summary:
      Pull Request resolved: https://github.com/facebookresearch/detectron2/pull/3322
      
      avoid warnings like the following:
      ```
      [W ProcessGroupNCCL.cpp:1569] Rank 0 using best-guess GPU 0 to perform barrier as devices used by
      this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is
      incorrect. Specify device_ids in barrier() to force use of a particular device.
      ```
      
      This may fix the hang reported in https://github.com/facebookresearch/detectron2/issues/3319.
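
      A minimal sketch of the fix idea, assuming a per-process `local_rank` is available:

      ```
      import torch
      import torch.distributed as dist

      def barrier_on_local_device(local_rank: int):
          # Pin this process to its GPU and tell NCCL which device to use for the
          # barrier, instead of letting it fall back to the best-guess GPU 0.
          torch.cuda.set_device(local_rank)
          if dist.get_backend() == dist.Backend.NCCL:
              dist.barrier(device_ids=[local_rank])
          else:
              dist.barrier()
      ```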
      
      Reviewed By: vaibhava0
      
      Differential Revision: D30077957
      
      fbshipit-source-id: b8827e66c5eecc06b650acde2e7ff44106327f69
  5. 04 Aug, 2021 1 commit
  6. 03 Aug, 2021 3 commits
  7. 01 Aug, 2021 1 commit
    • stabilize deformable DETR training · a4f06b88
      Zhicheng Yan authored
      Summary:
      Deformable DETR training can be unstable due to iterative box refinement in the transformer decoder. To stabilize training, this diff introduces two changes:
      - Remove the unnecessary use of inverse sigmoid. It is possible to avoid inverse sigmoid entirely when box refinement is turned on.
      - In the `DeformableTransformer` class, detach `init_reference_out` before passing it into the decoder to update memory and compute per-decoder-layer reference points.
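
      A simplified sketch of the refinement loop after the change (hypothetical names; the real decoder also handles attention, masks, etc.):

      ```
      def refine_reference_points(decoder_layers, memory, reference_points, bbox_embeds):
          # Iterative box refinement without the inverse_sigmoid round trip: each
          # layer's delta is applied to a detached copy of the previous reference
          # points, so gradients do not flow back through earlier layers' boxes.
          outputs = []
          for layer, bbox_embed in zip(decoder_layers, bbox_embeds):
              reference_points = reference_points.detach()
              hidden = layer(memory, reference_points)
              delta = bbox_embed(hidden)
              reference_points = (reference_points + delta).clamp(min=0.0, max=1.0)
              outputs.append((hidden, reference_points))
          return outputs
      ```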
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29903599
      
      fbshipit-source-id: a374ba161be0d7bcdfb42553044c4c6700e92623
  8. 29 Jul, 2021 1 commit
  9. 21 Jul, 2021 1 commit
    • fix bug in valid_bbox check · b4d9aad9
      Xi Yin authored
      Summary: If the height/width is None, the original version crashes, so this diff adds an additional check to handle that case.
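
      A hedged sketch of the check (illustrative only; the actual helper and field names in d2go may differ):

      ```
      def has_valid_bbox(annotation, height=None, width=None):
          # If the image size is unknown, skip the size-dependent check instead of
          # crashing on a None comparison.
          if height is None or width is None:
              return True
          x, y, w, h = annotation["bbox"]  # assumes XYWH_ABS here
          return 0 <= x < width and 0 <= y < height and w > 0 and h > 0
      ```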
      
      Reviewed By: ppwwyyxx
      
      Differential Revision: D29807853
      
      fbshipit-source-id: b2b1a7edb52b7911da79a11329d4cf93f343c279
  10. 14 Jul, 2021 1 commit
  11. 09 Jul, 2021 2 commits
    • Add tests for exporter / boltnn export via torch delegate · d0c38c43
      Mircea Cimpoi authored
      Summary:
      Add a test for the previous diff.
      The BoltNN backend is only supported on device, so this test only checks that the conversion runs and that the output file is present.
      
      Differential Revision: D29589245
      
      fbshipit-source-id: ba66a733295304531d177086ce6459a50cfbaa07
    • Add BoltNN conversion to d2go exporter · ecf832da
      Mircea Cimpoi authored
      Summary:
      Added predictor_type `boltnn_int8` to export to BoltNN via torch delegate.
      
      - `int8` needs to be in the name, otherwise post-training quantization won't happen;
      
      ```
      cfg.QUANTIZATION.BACKEND = "qnnpack"
      // cfg.QUANTIZATION.CUSTOM_QSCHEME = "per_tensor_affine"
      ```
      
      It seems that `QUANTIZATION.CUSTOM_QSCHEME = "per_tensor_affine"` is not needed; it is likely covered by the "qnnpack" backend.
      
      Reviewed By: wat3rBro
      
      Differential Revision: D29106043
      
      fbshipit-source-id: 865ac5af86919fe7b4530b48433a1bd11e295bf4
  12. 08 Jul, 2021 3 commits
    • fix a bug in D2GoDatasetMapper · abf2f327
      Zhicheng Yan authored
      Summary:
      Pull Request resolved: https://github.com/facebookresearch/d2go/pull/101
      
      In `D2GoDatasetMapper`, when the crop transform is applied to the image, `inputs` should be updated to use the cropped image before the other transforms are applied.
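
      A simplified sketch of the fix (hypothetical names; the real mapper builds its transform list from the config):

      ```
      def apply_transforms(inputs, crop_tfm, other_tfms):
          image = inputs["image"]
          if crop_tfm is not None:
              image = crop_tfm.apply_image(image)
              # The fix: write the cropped image back so later transforms (and any
              # annotation handling keyed off `inputs`) see the cropped version.
              inputs["image"] = image
          for tfm in other_tfms:
              image = tfm.apply_image(image)
          inputs["image"] = image
          return inputs
      ```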
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29551488
      
      fbshipit-source-id: 48917ffc91c8a80286d61ba3ae8391541ec2c930
    • remove redundant build_optimizer() · b1e2cc56
      Zhicheng Yan authored
      Summary:
      Pull Request resolved: https://github.com/facebookresearch/d2go/pull/96
      
      In `DETRRunner`, the `build_optimizer` method customizes the following logic, which is redundant with the parent class implementation and can be removed.
      - Discount the LR for certain modules, such as those named `reference_points`, `backbone`, and `sampling_offsets`.
        - This can be achieved with `SOLVER.LR_MULTIPLIER_OVERWRITE` after we update `get_default_optimizer_params` in `mobile-vision/d2go/d2go/optimizer/build.py` (see the config sketch after this list).
      - Full-model gradient clipping.
        - This is also implemented in `mobile-vision/d2go/d2go/optimizer/build.py`.
      
      The custom method also has a minor issue:
      - It ignores `SOLVER.WEIGHT_DECAY_NORM`, which can set a different weight decay for the affine parameters in norm modules.
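
      For reference, a hedged config sketch of the LR-discount replacement, in the style of the other config snippets in this log (the exact schema of `SOLVER.LR_MULTIPLIER_OVERWRITE` and the multiplier values are assumed here):

      ```
      # Hypothetical values; module-name patterns map to LR multipliers.
      cfg.SOLVER.LR_MULTIPLIER_OVERWRITE = [
          {"backbone": 0.1, "reference_points": 0.1, "sampling_offsets": 0.1}
      ]
      ```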
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29420642
      
      fbshipit-source-id: deeb9348c9d282231c540dde6161acedd8e3a119
    • fix extended coco load missing comma · 4f3f3401
      Sam Tsai authored
      Summary: Fix a missing comma in the extended COCO loader, which caused the bbox_mode and keypoints fields to be ignored.
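
      As an illustration of the bug class (not the actual d2go code): in Python, a missing comma between adjacent string literals silently concatenates them, so the affected field names never match anything.

      ```
      # Buggy: "bbox_mode" "keypoints" becomes the single string "bbox_modekeypoints".
      fields = ["iscrowd", "bbox", "category_id", "bbox_mode" "keypoints"]
      # Fixed:
      fields = ["iscrowd", "bbox", "category_id", "bbox_mode", "keypoints"]
      ```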
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29608815
      
      fbshipit-source-id: 8c737df1dfef7f88494f7de25e06b0c37742ac30
  13. 07 Jul, 2021 1 commit
  14. 06 Jul, 2021 1 commit
    • Add the fields which will be used in point-based modeling. · 80c18641
      Cheng-Yang Fu authored
      Summary:
      Add the fields which will be used in point-based modeling.
      - `point_coords`: the point coordinates in the image.
      - `point_labels`: indicates whether each point is a foreground or background point.
      
      Differential Revision: D29532103
      
      fbshipit-source-id: 9af6c9b049e1d05fd0d77909b09de1feec391ce9
  15. 02 Jul, 2021 1 commit
    • revert D29048363 · e69e0ffe
      Zhicheng Yan authored
      Summary:
      In D29048363 (https://github.com/facebookresearch/d2go/commit/c480d4e4e213a850cced7758f7b62c20caad8820) we moved the detaching of `reference_points` earlier in the hope of allowing more gradient flow to update the weights in `self.bbox_embed`.
      In this diff, we revert that change because i) it does not improve box AP and ii) it may potentially cause unstable optimization when iterative box refinement is turned on.
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29530735
      
      fbshipit-source-id: 3217c863343836e129d53e07c0eedb2db8164fe6
  16. 01 Jul, 2021 1 commit
  17. 30 Jun, 2021 3 commits
  18. 29 Jun, 2021 3 commits
  19. 27 Jun, 2021 2 commits
    • Move EMA weights to current device before training · 9d9f438b
      Kai Zhang authored
      Summary:
      Currently we move the EMA weights to the expected device right after loading them from the checkpoint.
      However, by the time the on_load_checkpoint hook is called, the current GPU device has not yet been assigned. This can leave the EMA weights on cuda:0 while the model is on cuda:1.
      This diff moves the EMA weights to the device in `on_pretrain_routine_end` instead.
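
      A hedged sketch of the hook placement (PyTorch Lightning 1.x callback API; `ema_state` and its `to()` method are assumed):

      ```
      from pytorch_lightning import Callback

      class EMACallback(Callback):
          def __init__(self, ema_state):
              self.ema_state = ema_state

          def on_pretrain_routine_end(self, trainer, pl_module):
              # By this point the trainer has assigned the process's device, so the
              # EMA weights can be moved to match the model (e.g. cuda:1, not cuda:0).
              self.ema_state.to(pl_module.device)
      ```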
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D28429843
      
      fbshipit-source-id: d864fb3687eb6958872300c5ec0af7ce90591f83
    • enable flop printing & logging at the beginning of train & test · 5509a138
      Yuxin Wu authored
      Reviewed By: zhanghang1989
      
      Differential Revision: D29379832
      
      fbshipit-source-id: 9283a8796a1dbee81b51611407c22f7d5a2069dc
  20. 26 Jun, 2021 1 commit
    • Fix quantization test failure · 1894f8a3
      Kai Zhang authored
      Summary:
      # Context
      In the post-training quantization callback, we make a deepcopy of the Lightning module before validation starts and prepare the copy with the FX quantization API. The callback keeps the prepared model inside it.
      
      # The problem
      By the second time we run the validation epoch, we try to copy the Lightning module, which holds a reference to the trainer, which holds a reference to the quantization callback, which holds the prepared model, which is not deep-copyable.
      
      # Mitigation
      Delete the trainer reference before making the deepcopy.
      We are already doing this in stl/callbacks/quantization, but the change was not ported into D2Go.
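
      A hedged sketch of the mitigation (names assumed; Lightning versions differ in how the trainer reference is stored):

      ```
      import copy

      def copy_module_without_trainer(pl_module):
          # Drop the trainer reference so the deepcopy does not drag along the
          # quantization callback and its non-copyable prepared model.
          trainer = getattr(pl_module, "trainer", None)
          pl_module.trainer = None
          try:
              return copy.deepcopy(pl_module)
          finally:
              pl_module.trainer = trainer  # restore the original reference
      ```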
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29409085
      
      fbshipit-source-id: 24550124181673b2e567b2a04563bcdfb440e145
  21. 25 Jun, 2021 3 commits
    • Freeze matched bn layers · 4169abc1
      Haricharan Lakshman authored
      Summary:
      Convert the batchnorm layers that match the specified regular expressions to FrozenBatchNorm2d.
      
      If the module is an instance of batchnorm and matches the regular expressions, return a new FrozenBatchNorm2d module.
      Otherwise, convert the matching batchnorm child modules to FrozenBatchNorm2d in place and return the main module.
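
      A hedged sketch of the behavior described above, using detectron2's `FrozenBatchNorm2d` (the function name and traversal details here are assumed, not the actual d2go code):

      ```
      import re
      import torch.nn as nn
      from detectron2.layers import FrozenBatchNorm2d

      def freeze_matched_bn(module, patterns, prefix=""):
          # If the module itself is a matching batchnorm, return a frozen replacement.
          if isinstance(module, (nn.BatchNorm2d, nn.SyncBatchNorm)) and any(
              re.search(p, prefix) for p in patterns
          ):
              return FrozenBatchNorm2d.convert_frozen_batchnorm(module)
          # Otherwise, convert matching batchnorm children in place and return module.
          for name, child in module.named_children():
              full_name = f"{prefix}.{name}" if prefix else name
              new_child = freeze_matched_bn(child, patterns, full_name)
              if new_child is not child:
                  setattr(module, name, new_child)
          return module
      ```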
      
      Reviewed By: ppwwyyxx
      
      Differential Revision: D29286500
      
      fbshipit-source-id: 3a20f5eeff59ddff50c42fe297eedf0ce2b909bc
    • read "bbox_mode" from annotation when filtering out images with invalid bbox · 77ef0db7
      Luming Ma authored
      Summary: Some annotations use XYXY_ABS as the bbox mode, so many images were incorrectly filtered out under the assumption of XYWH_ABS. This diff reads bbox_mode from the annotation and converts the bbox to XYWH_ABS before checking for invalid bboxes.
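
      A hedged sketch of the check using detectron2's `BoxMode` (field handling and the validity criterion are assumed here):

      ```
      from detectron2.structures import BoxMode

      def bbox_is_valid(annotation):
          # Convert using the annotation's own bbox_mode instead of assuming XYWH_ABS.
          mode = annotation.get("bbox_mode", BoxMode.XYWH_ABS)
          x, y, w, h = BoxMode.convert(annotation["bbox"], mode, BoxMode.XYWH_ABS)
          return w > 0 and h > 0
      ```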
      
      Differential Revision: D29365700
      
      fbshipit-source-id: 355346b6826f401f504691090631997e169ead4a
    • use src dataset name instead of the derived class name · d4aedb83
      Sam Tsai authored
      Summary: "@ [0-9]classes" is appended to datasets to mark whether it is a derived class of the original one and saved as a config. When reloading the config, the derived class name will be used as the source instead of the original source. Adding a check to remove the derived suffix.
      
      Reviewed By: wat3rBro
      
      Differential Revision: D29315132
      
      fbshipit-source-id: 0cc204d305d2da6c9f1817aaf631270bd874f90d
  22. 24 Jun, 2021 1 commit
    • stabilize the training of deformable DETR with box refinement · c480d4e4
      Zhicheng Yan authored
      Summary:
      Major changes
      - As described in detail in appendix A.4 of the deformable DETR paper (https://arxiv.org/abs/2010.04159), gradient back-propagation is blocked at inverse_sigmoid(bounding box x/y/w/h from the last decoder layer). This can be implemented by detaching the tensor from the compute graph in PyTorch. However, we currently detach the wrong tensor, which prevents updating the layers that predict delta x/y/w/h. Fix this bug.
      - Add more comments to annotate data types and tensor shapes in the code. This should NOT affect the actual implementation.
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29048363
      
      fbshipit-source-id: c5b5e89793c86d530b077a7b999769881f441b69
  23. 23 Jun, 2021 1 commit
  24. 21 Jun, 2021 1 commit
    • additional flop counting using fvcore's flop counter · bc9d5070
      Yuxin Wu authored
      Summary:
      1. Save three versions of the flop count, using both mobile_cv's flop counter and fvcore's flop counter.
      2. Print only a short summary table in the terminal, and save the rest to files.
      
      The `print_flops` function does not seem to be used anywhere, so this diff simply replaces it.
      
      TODO: enable this feature automatically for train/eval workflows in the next diff
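
      A hedged sketch of the fvcore side (the actual diff also records mobile_cv's counter and writes the detailed breakdowns to files; the helper name is hypothetical):

      ```
      from fvcore.nn import FlopCountAnalysis, flop_count_table

      def log_flops(model, sample_inputs):
          flops = FlopCountAnalysis(model, sample_inputs)
          # Short table in the terminal; deeper per-module breakdowns go to files.
          print(f"Total GFlops: {flops.total() / 1e9:.2f}")
          print(flop_count_table(flops, max_depth=2))
      ```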
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29182412
      
      fbshipit-source-id: bfa1dfad41b99fcda06b96c4732237b5e753f1bb