1. 05 Aug, 2021 2 commits
    • Abduallah Mohamed's avatar
      Clarifying the use of do_test function · 610d2d03
      Abduallah Mohamed authored
      Summary: The `do_test` method can be used to perform testing outside the training process. One might assume that, like the `do_train` method, it loads the model weights before testing, but it does not. This diff adds a comment that clarifies this confusion.
      
      Reviewed By: ppwwyyxx
      
      Differential Revision: D29082338
      
      fbshipit-source-id: 6ec7d7f7f243503414fa904f4eb8856e62e9ed6d
      610d2d03
    • Yuxin Wu's avatar
      avoid warnings of NCCL · 30d5ca55
      Yuxin Wu authored
      Summary:
      Pull Request resolved: https://github.com/facebookresearch/detectron2/pull/3322
      
      avoid warnings like the following:
      ```
      [W ProcessGroupNCCL.cpp:1569] Rank 0 using best-guess GPU 0 to perform barrier as devices used by
      this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is
      incorrect. Specify device_ids in barrier() to force use of a particular device.
      ```
      
      This may also fix the hang in https://github.com/facebookresearch/detectron2/issues/3319
      
      Reviewed By: vaibhava0
      
      Differential Revision: D30077957
      
      fbshipit-source-id: b8827e66c5eecc06b650acde2e7ff44106327f69
      30d5ca55
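A minimal sketch of the idea behind the fix (the helper below is hypothetical, not the actual detectron2 code): pass `device_ids` to `barrier()` so NCCL is told which GPU this rank owns instead of guessing.

```python
# Hypothetical helper illustrating the fix: build kwargs for dist.barrier()
# so that NCCL does not have to best-guess the device for this rank.
def barrier_kwargs(local_rank, cuda_available):
    if cuda_available:
        # Pin the barrier to this rank's GPU; silences the
        # "using best-guess GPU" warning shown above.
        return {"device_ids": [local_rank]}
    # CPU backends such as gloo do not accept device_ids.
    return {}

# Intended usage (requires an initialized process group):
#   import torch, torch.distributed as dist
#   dist.barrier(**barrier_kwargs(local_rank, torch.cuda.is_available()))
```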
  2. 04 Aug, 2021 1 commit
  3. 03 Aug, 2021 3 commits
  4. 01 Aug, 2021 1 commit
    • Zhicheng Yan's avatar
      stabilize deformable DETR training · a4f06b88
      Zhicheng Yan authored
      Summary:
      Deformable DETR training can be unstable due to iterative box refinement in the transformer decoder. To stabilize the training, introduce two changes
      - Remove the unnecessary use of inverse sigmoid.
      It is possible to completely avoid using inverse sigmoid when box refinement is turned on.
      - In the `DeformableTransformer` class, detach `init_reference_out` before passing it into the decoder to update memory and compute per-decoder-layer reference points.
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29903599
      
      fbshipit-source-id: a374ba161be0d7bcdfb42553044c4c6700e92623
      a4f06b88
  5. 29 Jul, 2021 1 commit
  6. 21 Jul, 2021 1 commit
    • Xi Yin's avatar
      fix bug in valid_bbox check · b4d9aad9
      Xi Yin authored
      Summary: In case the height/width is None, the original version crashes. This diff adds an additional check to bypass the issue.
      
      Reviewed By: ppwwyyxx
      
      Differential Revision: D29807853
      
      fbshipit-source-id: b2b1a7edb52b7911da79a11329d4cf93f343c279
      b4d9aad9
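Together with the later key-presence fix (de0829f1), the combined guard might look like the following sketch (hypothetical helper; the real check lives in the dataset-filtering code):

```python
def has_valid_size(record):
    """Return True only when "width" and "height" are present, non-None,
    and positive; otherwise the bbox validity check should be bypassed."""
    w = record.get("width")
    h = record.get("height")
    if w is None or h is None:
        # Key missing or explicitly None: cannot validate boxes, bypass.
        return False
    return w > 0 and h > 0
```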
  7. 14 Jul, 2021 1 commit
  8. 09 Jul, 2021 2 commits
    • Mircea Cimpoi's avatar
      Add tests for exporter / boltnn export via torch delegate · d0c38c43
      Mircea Cimpoi authored
      Summary:
      Adding test for previous diff.
      The BoltNN backend is only supported on device, so this test only checks that the conversion takes place and that the output file is present.
      
      Differential Revision: D29589245
      
      fbshipit-source-id: ba66a733295304531d177086ce6459a50cfbaa07
      d0c38c43
    • Mircea Cimpoi's avatar
      Add BoltNN conversion to d2go exporter · ecf832da
      Mircea Cimpoi authored
      Summary:
      Added predictor_type `boltnn_int8` to export to BoltNN via torch delegate.
      
      - `int8` needs to be in the name, otherwise post-training quantization won't happen.
      
      ```
      cfg.QUANTIZATION.BACKEND = "qnnpack"
      // cfg.QUANTIZATION.CUSTOM_QSCHEME = "per_tensor_affine"
      ```
      
      It seems that `QUANTIZATION.CUSTOM_QSCHEME per_tensor_affine` is not needed; it is likely covered by `"qnnpack"`.
      
      Reviewed By: wat3rBro
      
      Differential Revision: D29106043
      
      fbshipit-source-id: 865ac5af86919fe7b4530b48433a1bd11e295bf4
      ecf832da
  9. 08 Jul, 2021 3 commits
    • Zhicheng Yan's avatar
      fix a bug in D2GoDatasetMapper · abf2f327
      Zhicheng Yan authored
      Summary:
      Pull Request resolved: https://github.com/facebookresearch/d2go/pull/101
      
      In `D2GoDatasetMapper`, when a crop transform is applied to the image, `inputs` should be updated to use the cropped image before other transforms are applied later.
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29551488
      
      fbshipit-source-id: 48917ffc91c8a80286d61ba3ae8391541ec2c930
      abf2f327
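The intended ordering can be sketched as follows (toy code with a plain nested-list "image"; the real mapper uses detectron2 transforms):

```python
def apply_transforms(dataset_dict, crop_tfm, other_tfms):
    image = dataset_dict["image"]
    if crop_tfm is not None:
        image = crop_tfm(image)
        # The fix: write the cropped image back before the later
        # transforms, so everything downstream sees the cropped input.
        dataset_dict["image"] = image
    for tfm in other_tfms:
        image = tfm(image)
    dataset_dict["image"] = image
    return dataset_dict
```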
    • Zhicheng Yan's avatar
      remove redundant build_optimizer() · b1e2cc56
      Zhicheng Yan authored
      Summary:
      Pull Request resolved: https://github.com/facebookresearch/d2go/pull/96
      
      In `DETRRunner`, the `build_optimizer` method customizes the following logic, which is redundant with the parent-class implementation and can be removed.
      - Discount the LR for certain modules, such as those named `reference_points`, `backbone`, and `sampling_offsets`.
        - This can be achieved via `SOLVER.LR_MULTIPLIER_OVERWRITE` after we update `get_default_optimizer_params` in `mobile-vision/d2go/d2go/optimizer/build.py`.
      - Full-model gradient clipping
        - This is also implemented in `mobile-vision/d2go/d2go/optimizer/build.py`
      
      It also has a minor issue:
      - It ignores `SOLVER.WEIGHT_DECAY_NORM`, which can set a different weight decay for affine parameters in norm modules.
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29420642
      
      fbshipit-source-id: deeb9348c9d282231c540dde6161acedd8e3a119
      b1e2cc56
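The idea behind `SOLVER.LR_MULTIPLIER_OVERWRITE` can be sketched as building per-parameter groups whose LR is scaled when the parameter name matches a key (hypothetical helper; the real logic is in `get_default_optimizer_params`):

```python
def build_param_groups(named_params, base_lr, lr_multiplier_overwrite):
    """named_params: iterable of (name, param) pairs.
    lr_multiplier_overwrite: dict mapping a name substring
    (e.g. "backbone") to an LR multiplier."""
    groups = []
    for name, param in named_params:
        lr = base_lr
        for key, mult in lr_multiplier_overwrite.items():
            if key in name:
                lr = base_lr * mult  # discounted LR for matching modules
                break
        groups.append({"params": [param], "lr": lr})
    return groups

# The resulting groups could then be fed to e.g. torch.optim.AdamW(groups).
```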
    • Sam Tsai's avatar
      fix extended coco load missing comma · 4f3f3401
      Sam Tsai authored
      Summary: Fix a missing comma in the extended COCO loader, which caused the bbox_mode and keypoints fields to be ignored.
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29608815
      
      fbshipit-source-id: 8c737df1dfef7f88494f7de25e06b0c37742ac30
      4f3f3401
  10. 07 Jul, 2021 1 commit
  11. 06 Jul, 2021 1 commit
    • Cheng-Yang Fu's avatar
      Add the fields which will be used in point-based modeling. · 80c18641
      Cheng-Yang Fu authored
      Summary:
      Add the fields which will be used in point-based modeling.
      - `point_coords`: the point coordinates in the image.
      - `point_labels`: indicates whether each point is foreground or background.
      
      Differential Revision: D29532103
      
      fbshipit-source-id: 9af6c9b049e1d05fd0d77909b09de1feec391ce9
      80c18641
  12. 02 Jul, 2021 1 commit
    • Zhicheng Yan's avatar
      revert D29048363 · e69e0ffe
      Zhicheng Yan authored
      Summary:
      In D29048363 (https://github.com/facebookresearch/d2go/commit/c480d4e4e213a850cced7758f7b62c20caad8820) we moved the detaching of `reference_points` earlier in the hope of allowing more gradient flow to update the weights in `self.bbox_embed`.
      In this diff, we revert the change because i) it does not improve box AP, and ii) it may potentially cause unstable optimization when iterative box refinement is turned on.
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29530735
      
      fbshipit-source-id: 3217c863343836e129d53e07c0eedb2db8164fe6
      e69e0ffe
  13. 01 Jul, 2021 1 commit
  14. 30 Jun, 2021 3 commits
  15. 29 Jun, 2021 3 commits
  16. 27 Jun, 2021 2 commits
    • Kai Zhang's avatar
      Move EMA weights to current device before training · 9d9f438b
      Kai Zhang authored
      Summary:
      Currently we move the EMA weights to the expected device right after loading them from the checkpoint.
      However, by the time the on_load_checkpoint hook is called, the current GPU device has not yet been assigned. This could leave the EMA weights on cuda:0 while the model is on cuda:1.
      This diff moves the EMA weights to the device in `on_pretrain_routine_end` instead.
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D28429843
      
      fbshipit-source-id: d864fb3687eb6958872300c5ec0af7ce90591f83
      9d9f438b
    • Yuxin Wu's avatar
      enable flop printing & logging at the beginning of train & test · 5509a138
      Yuxin Wu authored
      Reviewed By: zhanghang1989
      
      Differential Revision: D29379832
      
      fbshipit-source-id: 9283a8796a1dbee81b51611407c22f7d5a2069dc
      5509a138
  17. 26 Jun, 2021 1 commit
    • Kai Zhang's avatar
      Fix quantization test failure · 1894f8a3
      Kai Zhang authored
      Summary:
      # Context
      In post training quantization callback, we make a deepcopy of the Lightning module before validation start and prepare the copy with FX quantization API. The callback keeps the prepared model inside it.
      
      # The problem
      The second time we run the validation epoch, we try to make a copy of the Lightning module, which has a reference to the trainer, which has a reference to the quantization callback, which holds a prepared model, which is not deepcopiable.
      
      # Mitigation
      Delete the trainer before making a deepcopy.
      We're already doing that in stl/callbacks/quantization, but the changes were not ported into D2Go.
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29409085
      
      fbshipit-source-id: 24550124181673b2e567b2a04563bcdfb440e145
      1894f8a3
  18. 25 Jun, 2021 3 commits
    • Haricharan Lakshman's avatar
      Freeze matched bn layers · 4169abc1
      Haricharan Lakshman authored
      Summary:
      Convert the batchnorm layers that match the specified regular expressions to FrozenBatchNorm2d.
      
      If the module itself is an instance of batchnorm and matches the regexes, return a new FrozenBatchNorm2d module.
      
      Otherwise, convert the matching batchnorm child modules to FrozenBatchNorm2d in place
      and return the main module.
      
      Reviewed By: ppwwyyxx
      
      Differential Revision: D29286500
      
      fbshipit-source-id: 3a20f5eeff59ddff50c42fe297eedf0ce2b909bc
      4169abc1
    • Luming Ma's avatar
      read "bbox_mode" from annotation when filtering out images with invalid bbox · 77ef0db7
      Luming Ma authored
      Summary: Some annotations use XYXY_ABS as the bbox mode, so many images were incorrectly filtered out under the assumption of XYWH_ABS mode. This diff reads bbox_mode from the annotation and converts the bbox to XYWH_ABS before checking for invalid bboxes.
      
      Differential Revision: D29365700
      
      fbshipit-source-id: 355346b6826f401f504691090631997e169ead4a
      77ef0db7
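A sketch of the corrected check (the string modes below stand in for detectron2's `BoxMode` enum, and the real code would use `BoxMode.convert`):

```python
def xyxy_to_xywh(box):
    x0, y0, x1, y1 = box
    return [x0, y0, x1 - x0, y1 - y0]

def is_valid_bbox(anno):
    box = anno["bbox"]
    # Convert to XYWH_ABS first instead of assuming it.
    if anno.get("bbox_mode") == "XYXY_ABS":
        box = xyxy_to_xywh(box)
    _, _, w, h = box
    return w > 0 and h > 0
```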
    • Sam Tsai's avatar
      use src dataset name instead of the derived class name · d4aedb83
      Sam Tsai authored
      Summary: An "@ [0-9]classes" suffix is appended to dataset names to mark them as derived from the original dataset, and this is saved in the config. When the config is reloaded, the derived dataset name would be used as the source instead of the original one. This diff adds a check to remove the derived suffix.
      
      Reviewed By: wat3rBro
      
      Differential Revision: D29315132
      
      fbshipit-source-id: 0cc204d305d2da6c9f1817aaf631270bd874f90d
      d4aedb83
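The suffix removal might be sketched as a regex strip (the exact pattern is an assumption based on the summary above):

```python
import re

def strip_derived_suffix(name):
    # Remove a trailing "@<N>classes" marker, e.g.
    # "coco_2017_train@5classes" -> "coco_2017_train".
    return re.sub(r"@\s*\d+classes$", "", name)
```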
  19. 24 Jun, 2021 1 commit
    • Zhicheng Yan's avatar
      stabilize the training of deformable DETR with box refinement · c480d4e4
      Zhicheng Yan authored
      Summary:
      Major changes
      - As described in detail in appendix A.4 of the deformable DETR paper (https://arxiv.org/abs/2010.04159), gradient back-propagation is blocked at inverse_sigmoid(bounding box x/y/w/h from the last decoder layer). This can be implemented by detaching the tensor from the compute graph in PyTorch. However, we currently detach the wrong tensor, preventing updates to the layers that predict delta x/y/w/h. This diff fixes that bug.
      - Add more comments to annotate data types and tensor shapes in the code. This should NOT affect the actual implementation.
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29048363
      
      fbshipit-source-id: c5b5e89793c86d530b077a7b999769881f441b69
      c480d4e4
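For reference, inverse_sigmoid is the logit function, clamped for numerical stability (scalar sketch; the Deformable DETR version operates on tensors, and the detach discussed above is applied to its input box):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def inverse_sigmoid(p, eps=1e-5):
    # Clamp away from 0 and 1 so the log stays finite.
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))
```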
  20. 23 Jun, 2021 1 commit
  21. 21 Jun, 2021 1 commit
    • Yuxin Wu's avatar
      additional flop counting using fvcore's flop counter · bc9d5070
      Yuxin Wu authored
      Summary:
      1. save 3 versions of flop count, using both mobile_cv's flop counter and fvcore's flop counter
      2. print only a simple short table in terminal, but save others to files
      
      The `print_flops` function does not seem to be used anywhere, so this diff simply replaces it.
      
      TODO: enable this feature automatically for train/eval workflows in the next diff
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29182412
      
      fbshipit-source-id: bfa1dfad41b99fcda06b96c4732237b5e753f1bb
      bc9d5070
  22. 20 Jun, 2021 1 commit
    • Albert Pumarola's avatar
      Add unittest for DETR runner · 54b352d9
      Albert Pumarola authored
      Summary: Add creation and training unit tests for the OSS runner.
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D29254417
      
      fbshipit-source-id: f7c52b90b2bc7afa83a204895be149664c675e52
      54b352d9
  23. 19 Jun, 2021 2 commits
    • Yanghan Wang's avatar
      enable hive loader for person segmentation · 58f0ae3d
      Yanghan Wang authored
      Reviewed By: leitian
      
      Differential Revision: D28363172
      
      fbshipit-source-id: e69a71e6525dc9b76171b0cdc5f55ee8d188d6cc
      58f0ae3d
    • Fu-Chen Chen's avatar
      fix bug when checking for invalid bounding boxes · de0829f1
      Fu-Chen Chen authored
      Summary:
      The dict `record` might not have the keys `"width"` or `"height"`.
      This diff checks that `"width"` and `"height"` are present in `record` before getting their values.
      
      Reviewed By: sstsai-adl
      
      Differential Revision: D29243341
      
      fbshipit-source-id: a1e0e343dd1afcced834c3732e64bb6f372fbd1a
      de0829f1
  24. 16 Jun, 2021 3 commits
    • Luis Perez's avatar
      Synchronize PyTorchLightning/pytorch-lightning (revision f7459f53@master) to... · 670b4c4a
      Luis Perez authored
      Synchronize PyTorchLightning/pytorch-lightning (revision f7459f53@master) to github/third-party/PyTorchLightning/pytorch-lightning
      
      Summary:
      ## OSS
      Note these issues are being solved in OSS here: https://github.com/PyTorchLightning/pytorch-lightning/pull/7994/files#
      
      ## Manual
      - `speed_monitor.py` - `Result.unpack_batch_size` has been removed, moved to new implementation.
      - `fully_sharded.py` - There was a refactor for plugins, so updated corresponding function to keep reduced memory usage.
      - `hive_writing_classy.py`, `hive_writing_faim.py`, `hive_writing_xrayvideo.py` - Same as `speed_monitor.py`.
      - [Temporary] Uncommented misconfiguration exception. See https://github.com/PyTorchLightning/pytorch-lightning/pull/7882#pullrequestreview-683282719.
      - Update `TestModel` to detach appropriately.
      - Manually `detach` metrics stored in ResultStore.
      
      ## Automatic
      ### New commit log messages
        f7459f53 DeepSpeed Infinity Update (#7234)
        03e7bdf8 Improve `LightningModule` hook tests (#7944)
        3a0ed02b Properly handle parent modules w/ parameters in `BaseFinetuning` callback (#7931)
        ce93d8bc Handle errors due to uninitailized parameters (#7642)
        cca0e753 remove parsing comments (#7958)
        898fb56b added on_test_start() documentation (#7962)
        22d82661 Seed all workers when using DDP (#7942)
        436fc53c Improve `LightningDataModule` hook test and fix `dataloader_idx` argument (#7941)
        6b7b4047 deprecate hpc_load() and integrate it with restore() (#7955)
        20a5e09e fix myst-parser warning blocking docs ci (#7967)
        f15ea601 update chlog + legacy chpt (#7954)
        59d0c656 Add dataclass support to `apply_to_collection` (#7935)
        cdd01f32 LightningCLI support for argument links applied on instantiation (#7895)
        6856cced Remove rank_zero_only on DataModule prepare_data (#7945)
        96433d03 IPU Integration 5/5 (#7867)
        42c7f272 refactor checkpoint loading for training type plugins (#7928)
        ac4eb0a0 `is_overridden` improvements (#7918)
        9e932f4d Delete `on_after_backward` unused argument (#7925)
        8b738693 Deprecate the default `EarlyStopping` callback monitor value (#7907)
        c1eac483 split `restore_training_state` into logical parts [2 / 2] (#7900)
        d209b689 split `restore_training_state` into logical parts [1 / 2] (#7901)
        111287b4 add pre-commit hooks (#7906)
        839019a3 Remove legacy teardown check in train loop (#7917)
        b45a89a2 Clean-up after logger connector redesign 2/2 (#7631)
        07b69231 Remove fn check for ipu output (#7915)
        580a3b5e Remove dead code (#7910)
        df812398 Clean-up after logger connector redesign 1/2 (#7909)
        ec4f8856 Enable logger connector re-design (#7891)
        15be9865 add logger to __all__ (#6854)
        6fee9262 Deprecate `LightningDataModule` lifecycle properties (#7657)
        764d2c77 refactor CheckpointConnector.restore_weights  (#7862)
        7f4ef6d1 Fix logs overwriting issue for remote fs (#7889)
        c310ce66 Logger connector re-design `_Metadata.reduce_fx` fixes. (#7890)
        b214442e New logger connector code (#7882)
      
      Reviewed By: yifuwang
      
      Differential Revision: D29105294
      
      fbshipit-source-id: 990b2a4a7333908d676de193f5ec930cb50b8a19
      670b4c4a
    • Kai Zhang's avatar
      Log D2Go model instantiation events · 14b25e8d
      Kai Zhang authored
      Summary: This diff logs D2Go model instantiation events to the table scuba_caffe2_pytorch_usage_stats, so that we can track model usage in fblearner, bento, local scripts, etc.
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D28986723
      
      fbshipit-source-id: 3e865354e5884c9e82bd1b08819cc10d349f93bd
      14b25e8d
    • Sam Tsai's avatar
      add segmentation points and use circular kp pattern · dcdf3dcf
      Sam Tsai authored
      Summary:
      1. Circular pattern segmentation points
      2. Use circular pattern for kp patterns
      
      Reviewed By: wat3rBro
      
      Differential Revision: D29069224
      
      fbshipit-source-id: c4c01d6d93de5abbdfceae07f1cd48fb56e05f57
      dcdf3dcf