1. 09 Sep, 2021 1 commit
  2. 02 Sep, 2021 1 commit
    • clamp reference point max to 1.0 to avoid NaN in regressed bbox · 0a38f8c8
      Zhicheng Yan authored
      Summary:
      When training DF-DETR with a Swin-Transformer backbone, which uses a large size_divisibility of 224 (= 32 * 7) and thus potentially more zero-padding, we find that the regressed boxes can contain NaN values and fail the assertion here (https://fburl.com/code/p27ztcce).
      
      This issue has two potential causes.
      - Fix 1. In the DF-DETR encoder, the reference points prepared by `get_reference_points()` can contain normalized x,y coordinates larger than 1 due to rounding during mask interpolation across feature scales (specific examples can be given upon request). We therefore clamp the x,y coordinates to a maximum of 1.0; see the sketch after this list.
      
      - Fix 2. The MLP used in the bbox_embed heads contains 3 FC layers, which might be too many. We introduce an argument `BBOX_EMBED_NUM_LAYERS` so users can configure the number of FC layers. This change is backward-compatible.
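      
      A minimal sketch of Fix 1, assuming the reference-point construction from the upstream Deformable DETR encoder (names follow that code; the clamp placement is the point of interest):
      
      ```python
      import torch
      
      def get_reference_points(spatial_shapes, valid_ratios, device):
          # One normalized (x, y) reference point per spatial location, per
          # feature level, scaled by the valid (unpadded) extent of each image.
          reference_points_list = []
          for lvl, (H, W) in enumerate(spatial_shapes):
              ref_y, ref_x = torch.meshgrid(
                  torch.linspace(0.5, H - 0.5, H, dtype=torch.float32, device=device),
                  torch.linspace(0.5, W - 0.5, W, dtype=torch.float32, device=device),
              )
              ref_y = ref_y.reshape(-1)[None] / (valid_ratios[:, None, lvl, 1] * H)
              ref_x = ref_x.reshape(-1)[None] / (valid_ratios[:, None, lvl, 0] * W)
              ref = torch.stack((ref_x, ref_y), -1)
              # Fix 1: rounding in the interpolated masks can push the normalized
              # coordinates slightly above 1, so clamp the max to 1.0.
              reference_points_list.append(ref.clamp(max=1.0))
          reference_points = torch.cat(reference_points_list, 1)
          return reference_points[:, :, None] * valid_ratios[:, None]
      ```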
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D30661167
      
      fbshipit-source-id: c7e94983bf1ec07426fdf1b9d363e5163637f21a
  3. 25 Aug, 2021 1 commit
    • fix two-stage DF-DETR · aea87f6c
      Zhicheng Yan authored
      Summary:
      Pull Request resolved: https://github.com/facebookresearch/d2go/pull/106
      
      # 2-stage DF-DETR
      
      DF-DETR supports 2-stage detection. In the 1st stage, we detect class-agnostic boxes using the feature pyramid (a.k.a. `memory` in the code) computed by the encoder.
      
      The current implementation has a few flaws:
      - In `setcriterion.py`, when computing the loss for the encoder's 1st-stage predictions, `num_boxes` should be reduced across GPUs and clamped to be at least 1 to avoid division by zero. The current implementation produces NaN when `num_boxes` is zero (e.g., no box annotations in the cropped input image); see the sketch below.
      - In `gen_encoder_output_proposals()`, it manually fills in `float("inf")` at invalid spatial positions outside the actual image size. However, there is no guarantee that those positions won't be selected among the top-scoring positions, and `float("inf")` can easily drive the affected parameters to NaN.
      - The encoder's `class_embed` should have 1 channel rather than num_class channels, because we only need to predict the probability of a box being foreground.
      
      This diff fixes the issues above.
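      
      For the first flaw, a minimal sketch of the normalization fix, assuming the standard DETR criterion setup (the helper name here is illustrative):
      
      ```python
      import torch
      import torch.distributed as dist
      
      def normalized_num_boxes(targets, device):
          # Total number of target boxes across the local batch.
          num_boxes = sum(len(t["labels"]) for t in targets)
          num_boxes = torch.as_tensor([float(num_boxes)], device=device)
          # Reduce across GPUs so every rank divides by the same count.
          if dist.is_available() and dist.is_initialized():
              dist.all_reduce(num_boxes)
              num_boxes = num_boxes / dist.get_world_size()
          # Clamp so batches with no annotations cannot divide by zero.
          return torch.clamp(num_boxes, min=1).item()
      ```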
      
      # Gradient blocking in decoder
      
      Currently, the gradient of the reference points is blocked at each decoder layer to improve numerical stability during training.
      In this diff, we add an option `MODEL.DETR.DECODER_BLOCK_GRAD`; when it is False, we do NOT block the gradient. Empirically, we find this leads to better box AP.
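      
      A hedged sketch of the toggle, with an illustrative per-layer refinement step (the actual decoder arithmetic is abbreviated):
      
      ```python
      import torch
      
      def refine_reference_points(ref_points: torch.Tensor,
                                  delta: torch.Tensor,
                                  block_grad: bool) -> torch.Tensor:
          # Illustrative refinement: update in logit space, squash back to
          # normalized [0, 1] coordinates.
          new_ref = (torch.logit(ref_points, eps=1e-5) + delta).sigmoid()
          # MODEL.DETR.DECODER_BLOCK_GRAD: when True (the previous behavior),
          # stop gradients from flowing across decoder layers.
          return new_ref.detach() if block_grad else new_ref
      ```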
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D30325396
      
      fbshipit-source-id: 7d7add1e05888adda6e46cc6886117170daa22d4
  4. 11 Aug, 2021 1 commit
  5. 03 Aug, 2021 1 commit
  6. 01 Aug, 2021 1 commit
    • stabilize deformable DETR training · a4f06b88
      Zhicheng Yan authored
      Summary:
      Deformable DETR training can be unstable due to iterative box refinement in the transformer decoder. To stabilize the training, we introduce two changes:
      - Remove the unnecessary use of inverse sigmoid. It is possible to completely avoid inverse sigmoid when box refinement is turned on.
      - In the `DeformableTransformer` class, detach `init_reference_out` before passing it into the decoder, which updates the memory and computes per-decoder-layer reference points (sketched below).
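      
      A hedged sketch of the second change; the function boundary is illustrative, and names follow the upstream Deformable DETR code:
      
      ```python
      import torch
      
      def prepare_initial_references(query_embed: torch.Tensor,
                                     reference_point_head: torch.nn.Module):
          # Initial normalized reference points predicted from the queries.
          init_reference_out = reference_point_head(query_embed).sigmoid()
          # Detach before handing off to the decoder, so per-layer box
          # refinement cannot backpropagate into (and destabilize) the
          # initial reference-point predictor.
          decoder_reference_points = init_reference_out.detach()
          return decoder_reference_points, init_reference_out
      ```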
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29903599
      
      fbshipit-source-id: a374ba161be0d7bcdfb42553044c4c6700e92623
  7. 29 Jul, 2021 1 commit
  8. 08 Jul, 2021 1 commit
    • remove redundant build_optimizer() · b1e2cc56
      Zhicheng Yan authored
      Summary:
      Pull Request resolved: https://github.com/facebookresearch/d2go/pull/96
      
      In `DETRRunner`, the method `build_optimizer` customized the following logic, which is redundant with the parent class implementation and can be removed (a generic sketch of what it did follows this list):
      - Discounting the LR for certain modules, such as those named `reference_points`, `backbone`, and `sampling_offsets`.
        - This can be achieved via `SOLVER.LR_MULTIPLIER_OVERWRITE` after we update `get_default_optimizer_params` in `mobile-vision/d2go/d2go/optimizer/build.py`.
      - Full-model gradient clipping.
        - This is also implemented in `mobile-vision/d2go/d2go/optimizer/build.py`.
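      
      For reference, a generic sketch of the removed per-module LR discounting in plain PyTorch (the 0.1 multiplier and function boundary are illustrative; this is not the d2go implementation):
      
      ```python
      import torch
      
      def build_discounted_optimizer(model: torch.nn.Module,
                                     base_lr: float = 1e-4,
                                     discount: float = 0.1):
          # Parameters whose names match these keys get a discounted LR,
          # mirroring what the removed build_optimizer hard-coded.
          discounted_keys = ("backbone", "reference_points", "sampling_offsets")
          normal, discounted = [], []
          for name, param in model.named_parameters():
              if not param.requires_grad:
                  continue
              group = discounted if any(k in name for k in discounted_keys) else normal
              group.append(param)
          return torch.optim.AdamW([
              {"params": normal, "lr": base_lr},
              {"params": discounted, "lr": base_lr * discount},
          ])
      ```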
      
      It also has a minor issue:
      - It ignores `SOLVER.WEIGHT_DECAY_NORM`, which can set a different weight decay for the affine parameters of normalization modules.
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29420642
      
      fbshipit-source-id: deeb9348c9d282231c540dde6161acedd8e3a119
  9. 02 Jul, 2021 1 commit
    • revert D29048363 · e69e0ffe
      Zhicheng Yan authored
      Summary:
      In D29048363 (https://github.com/facebookresearch/d2go/commit/c480d4e4e213a850cced7758f7b62c20caad8820) we moved the detaching of `reference_points` earlier in the hope of allowing more gradient flow to update the weights in `self.bbox_embed`.
      In this diff, we revert that change because i) it does not improve box AP, and ii) it may cause unstable optimization when iterative box refinement is turned on.
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29530735
      
      fbshipit-source-id: 3217c863343836e129d53e07c0eedb2db8164fe6
  10. 30 Jun, 2021 1 commit
  11. 24 Jun, 2021 1 commit
    • stabilize the training of deformable DETR with box refinement · c480d4e4
      Zhicheng Yan authored
      Summary:
      Major changes:
      - As described in detail in appendix A.4 of the deformable DETR paper (https://arxiv.org/abs/2010.04159), gradient back-propagation is blocked at inverse_sigmoid(bounding box x/y/w/h from the last decoder layer). This can be implemented by detaching the tensor from the compute graph in PyTorch. However, we currently detach the wrong tensor, which prevents updating the layers that predict delta x/y/w/h. This diff fixes that bug; see the sketch below.
      - Add more comments annotating data types and tensor shapes in the code. This should NOT affect the actual implementation.
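      
      A minimal sketch of the intended detach placement (the function boundary is illustrative; the key point is that the delta-prediction path keeps its gradient):
      
      ```python
      import torch
      
      def inverse_sigmoid(x: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
          # Numerically safe logit, as in Deformable DETR.
          x = x.clamp(min=0, max=1)
          return torch.log(x.clamp(min=eps) / (1 - x).clamp(min=eps))
      
      def refine_and_block(ref_points: torch.Tensor,
                           delta: torch.Tensor) -> torch.Tensor:
          # delta (the predicted x/y/w/h offsets) is added BEFORE any detach,
          # so the layers producing it still receive gradient from this
          # layer's loss.
          new_ref = (inverse_sigmoid(ref_points) + delta).sigmoid()
          # Block back-propagation only across decoder layers (appendix A.4).
          return new_ref.detach()
      ```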
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29048363
      
      fbshipit-source-id: c5b5e89793c86d530b077a7b999769881f441b69
  12. 20 Jun, 2021 1 commit
    • Add unittest for DETR runner · 54b352d9
      Albert Pumarola authored
      Summary: Add runner-creation and training unit tests to the OSS runner.
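      
      A hedged sketch of what a runner-creation test might look like; the fully-qualified runner name and the assertion are assumptions, not the actual test from this diff:
      
      ```python
      import unittest
      
      from d2go.runner import create_runner  # d2go's runner factory
      
      class TestDETRRunner(unittest.TestCase):
          def test_create_runner(self):
              # Hypothetical fully-qualified class name, for illustration only.
              runner = create_runner("d2go.projects.detr.runner.DETRRunner")
              cfg = runner.get_default_cfg()
              # The runner should come up with a usable default config.
              self.assertTrue(cfg.MODEL.META_ARCHITECTURE)
      
      if __name__ == "__main__":
          unittest.main()
      ```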
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29254417
      
      fbshipit-source-id: f7c52b90b2bc7afa83a204895be149664c675e52
  13. 12 Jun, 2021 1 commit
  14. 06 Apr, 2021 1 commit
    • Fix version test in detr · efe74d2a
      Hang Zhang authored
      Summary: TorchVision recently upgraded to version 0.10.0, which breaks the version check in detr.
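      
      The likely failure mode (an assumption about this particular check, but common in DETR-derived code) is slicing the version string, so "0.10.0" compares as 0.1; a robust sketch using proper version parsing:
      
      ```python
      # Fragile: float("0.10.0"[:3]) == 0.1, so the old path is taken at 0.10.0.
      # Robust alternative:
      from packaging import version
      import torchvision
      
      if version.parse(torchvision.__version__) < version.parse("0.7"):
          # ... fall back to the pre-0.7 code path ...
          pass
      ```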
      
      Reviewed By: wat3rBro
      
      Differential Revision: D27575085
      
      fbshipit-source-id: 75f459fe7a711161e908609fcf2f2d28a01a6c74
  15. 03 Mar, 2021 1 commit