1. 20 Oct, 2021 5 commits
    • print flow id in logs · 74a031b8
      Yuxin Wu authored
      Summary: helps debugging
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D31806396
      
      fbshipit-source-id: 870308990c4c0c71453d107628b8adcb9edcf391
    • toy example of training model for turing · ee9602a1
      Yanghan Wang authored
      Summary:
      Add toy example to illustrate the Turing workflow.
      - modify the model building, add a convert-to-Helios step. Note that we need to hide this from OSS, so create an FB version of the runner in order to modify `build_model` and `get_default_cfg`.
      - make the `D2GoCompatibleMNISTRunner` up-to-date, and use the "tutorial" meta-arch for writing the unit test since it's the simplest model. Note that even though `TutorialNet` is very simple, there is still a constraint: the FC has to run on a 4D tensor with 1x1 spatial dimension because it is mapped to a 1x1 Conv by Helios; modify `TutorialNet` to make it compatible.
      
      Reviewed By: newstzpz
      
      Differential Revision: D31705305
      
      fbshipit-source-id: 77949dfbf08252be5495e9273210274c8ad86abb
    • use fb. in import path - modeling/backbone/fb · 274d3b49
      Yanghan Wang authored
      Summary: see bottom diff
      
      Reviewed By: newstzpz
      
      Differential Revision: D31780235
      
      fbshipit-source-id: ec1285c4c5457a631e1eb88bebd47c9f41b47e12
    • Supported learnable qat. · f6ce583e
      Peizhao Zhang authored
      Summary:
      Supported learnable qat.
      * Added a config key `QUANTIZATION.QAT.FAKE_QUANT_METHOD` to specify the QAT method (`default` or `learnable`).
      * Added a config key `QUANTIZATION.QAT.ENABLE_LEARNABLE_OBSERVER_ITER` to specify the start iteration for learnable observers (before that, static observers are used).
      * Custom quantization code needs to call `d2go.utils.qat_utils.get_qat_qconfig()` to get the proper qconfig for learnable QAT. An exception will be raised if the QAT method is learnable but no learnable observers are used in the model.
      * Set the weight decay for scale/zero_point to 0 in the optimizer automatically.
      * The way to use learnable QAT: enable static observers -> enable fake quant -> enable learnable observers -> freeze BN.
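The enabling order above can be sketched as an iteration-based schedule. A minimal sketch, assuming illustrative iteration thresholds and a hypothetical helper name (this is not the d2go API):

```python
# Hypothetical sketch of the QAT phase schedule described above; the
# threshold arguments and function name are illustrative, not d2go API.

def qat_phase(it, fake_quant_iter=1000, learnable_observer_iter=2000,
              freeze_bn_iter=3000):
    """Return which QAT switches are active at training iteration `it`."""
    return {
        # Static observers run from the start until learnable ones take over.
        "static_observer": it < learnable_observer_iter,
        "fake_quant": it >= fake_quant_iter,
        "learnable_observer": it >= learnable_observer_iter,
        "bn_frozen": it >= freeze_bn_iter,
    }
```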
      
      Differential Revision: D31370822
      
      fbshipit-source-id: a5a5044a539d0d7fe1cc6b36e6821fc411ce752a
    • Refactored qat related code. · ef9c20cc
      Peizhao Zhang authored
      Summary:
      Refactored qat related code.
      * Moved `_prepare_model_for_qat` related code to a function.
      * Moved `_setup_non_qat_to_qat_state_dict_map` related code to a function.
      * Moved QATHook related code to the quantization file and implemented as a class.
      
      Differential Revision: D31370819
      
      fbshipit-source-id: 836550b2c8d68cd93a84d5877ad9cef6f0f0eb39
  2. 16 Oct, 2021 1 commit
  3. 15 Oct, 2021 2 commits
    • Supported specifying customized parameter groups from model. · 87ce583c
      Peizhao Zhang authored
      Summary:
      Supported specifying customized parameter groups from model.
      * Allow the model to specify customized parameter groups by implementing a function `model.get_optimizer_param_groups(cfg)`.
      * Supported models wrapped in DDP.
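The lookup can be sketched as below; the builder-side function name and the `.module` unwrapping for DDP are assumptions for illustration, not the exact d2go code:

```python
# Sketch: prefer the model's own get_optimizer_param_groups(cfg) hook,
# unwrapping a DDP-style wrapper (which exposes the inner model as .module).

def get_param_groups(model, cfg):
    if hasattr(model, "module"):  # DDP wrapper (assumed convention)
        model = model.module
    if hasattr(model, "get_optimizer_param_groups"):
        return model.get_optimizer_param_groups(cfg)
    # Fallback: a single group containing all parameters.
    return [{"params": list(model.parameters())}]
```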
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D31289315
      
      fbshipit-source-id: c91ba8014508e9fd5f172601b9c1c83c188338fd
    • Refactor for get_optimizer_param_groups. · 2dc3bc02
      Peizhao Zhang authored
      Summary:
      Refactor for get_optimizer_param_groups.
      * Split `get_default_optimizer_params()` into multiple functions:
        * `get_optimizer_param_groups_default()`
        * `get_optimizer_param_groups_lr()`
        * `get_optimizer_param_groups_weight_decay()`
      * Regroup the parameters to create the minimal number of groups.
      * Print all parameter groups when the optimizer is created.
          Param group 0: {amsgrad: False, betas: (0.9, 0.999), eps: 1e-08, lr: 10.0, params: 1, weight_decay: 1.0}
          Param group 1: {amsgrad: False, betas: (0.9, 0.999), eps: 1e-08, lr: 1.0, params: 1, weight_decay: 1.0}
          Param group 2: {amsgrad: False, betas: (0.9, 0.999), eps: 1e-08, lr: 1.0, params: 2, weight_decay: 0.0}
      * Add some unit tests.
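The regrouping step can be sketched as bucketing parameters by identical optimizer options; the function below is illustrative, not one of the functions named above:

```python
from collections import defaultdict

# Merge parameters that share identical options (lr, weight_decay, ...)
# into one group, so the optimizer sees the minimal number of groups.

def regroup(per_param_options):
    """per_param_options: list of (param_name, options_dict) pairs."""
    buckets = defaultdict(list)
    for param, opts in per_param_options:
        buckets[tuple(sorted(opts.items()))].append(param)
    return [{**dict(key), "params": params} for key, params in buckets.items()]
```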
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D31287783
      
      fbshipit-source-id: e87df0ae0e67343bb2130db945d8faced44d7411
  4. 14 Oct, 2021 1 commit
    • update benchmark_storage with instructions · 46f16a5e
      Yuxin Wu authored
      Summary: Also modify launch() because it should not assume it's always called with a CfgNode object.
      
      Differential Revision: D31494215
      
      fbshipit-source-id: 8f07e9cb64969f8a14641956f7ef7c7160748bd9
  5. 13 Oct, 2021 2 commits
  6. 09 Oct, 2021 1 commit
    • fix real data driving generation in _generate() · 3b23dd39
      Tao Xu authored
      Summary: Fix a failure in real-image-driven generation.
      
      Reviewed By: yc-fb
      
      Differential Revision: D31362721
      
      fbshipit-source-id: b222745aada1bd6680ca931d49a70d8b428828a6
  7. 07 Oct, 2021 2 commits
    • only evaluate EMA model on non-predictor models · d99428a1
      Yanghan Wang authored
      Summary:
      EMA is only applicable when testing non-predictor-based models; this diff simply adds a check so that EMA models are not evaluated.
      
      Side note: `do_test` should probably just handle a single model; in the case of EMA, we could let `do_train` return two models, with and without EMA, and call `do_test` on each of them. Then the temporary fix in this diff would not be needed at all.
      
      Reviewed By: wrlife
      
      Differential Revision: D31450572
      
      fbshipit-source-id: 8696922a9fd194f91315d2f3480dc8bfd8f36a3d
    • remove SOLVER.STEPS from configs · 79ea94d5
      Yuxin Wu authored
      Summary:
      the LR scheduler is cosine, so this config has no effect.
      Remove it to avoid confusion.
      
      Reviewed By: sstsai-adl
      
      Differential Revision: D31444047
      
      fbshipit-source-id: b40e0d7d923c3b55dfe23353050ea0238b3afd16
  8. 06 Oct, 2021 1 commit
  9. 01 Oct, 2021 2 commits
  10. 27 Sep, 2021 2 commits
    • support scripting for torchscript ExportMethod · a9dce74e
      Yanghan Wang authored
      Summary:
      Pull Request resolved: https://github.com/facebookresearch/d2go/pull/118
      
      This diff adds proper support for using scripting when exporting models.
      
      Rename tracing-related code:
      - Previously `trace_and_save_torchscript` was the primary function for exporting models; replace it with `export_optimize_and_save_torchscript`.
      - Also rename `D2TorchscriptTracingExport` to `TracingAdaptedTorchscriptExport` since it's not only for tracing now.
      
      Introduce `jit_mode`:
      - Add `jit_mode` option as the `export_kwargs` of ExportMethod.
      - Add `scripting` and `tracing` trigger words to overwrite `jit_mode`. Please note that `tracing` now applies to all models, which is different from the previous meaning (using `TracingAdapter` for RCNN).
      - Therefore there are two ways of using scripting mode: 1) setting `jit_mode` in `prepare_for_export`; 2) using the `scripting` trigger word. Add unit tests as examples to illustrate the two ways.
      - Don't use `TracingAdapter` when scripting since it's not scriptable.
      
      Consolidate triggering words logic.
      - Group the logic of handling trigger words (e.g. `_mobile`, `_int8`, `scripting`, `tracing`) into a single decorator `update_export_kwargs_from_export_method` for better structure and readability.
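A rough sketch of what the consolidated handling does; the decorator name comes from the diff, but the key names and parsing rules below are assumptions:

```python
# Hypothetical core of update_export_kwargs_from_export_method: derive
# export_kwargs overrides from trigger words in the export method name.

def parse_trigger_words(export_method, export_kwargs):
    kwargs = dict(export_kwargs)
    if "_int8" in export_method:
        kwargs["quantized"] = True  # assumed key name
    if "scripting" in export_method:
        kwargs["jit_mode"] = "script"
    elif "tracing" in export_method:
        kwargs["jit_mode"] = "trace"
    return kwargs
```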
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D31181624
      
      fbshipit-source-id: 5fbb0d4fa4c29ffa4a761af8ea8f93b4bad4cef9
    • don't register @legacy as part of export method name · 8adb146e
      Yanghan Wang authored
      Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/119
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D31181216
      
      fbshipit-source-id: 428116f4f4144e20410222825a9a00f75253ef4a
  11. 24 Sep, 2021 5 commits
  12. 22 Sep, 2021 1 commit
    • fix optimizer setting in pytorch lightning · ea6e9f7f
      Lei Tian authored
      Summary: fix optimizer setting in pytorch lightning
      
      Reviewed By: wat3rBro
      
      Differential Revision: D30988441
      
      fbshipit-source-id: fcd2f4c77a87a790d7e99b0e3c833c291fd66e77
  13. 21 Sep, 2021 2 commits
  14. 20 Sep, 2021 2 commits
    • merge internal data build files · 07c4e54c
      Yanghan Wang authored
      Reviewed By: ppwwyyxx
      
      Differential Revision: D31035247
      
      fbshipit-source-id: 7340e6f6bb813e284416e37060d0d511c5c79e03
    • Check if new_ds_name registered to MetadataCatalog before removing · f4fcff31
      Shiyu Dong authored
      Summary:
      As title: sometimes `new_ds_name` is not registered, so calling `remove()` crashes the program; add a check.
      A side effect is that if the name is not registered, `get()` will register it first and then `remove()` will remove it from the registry.
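The guard can be sketched with a plain dict standing in for the catalog (the real `MetadataCatalog` API differs):

```python
# Only remove a dataset name that is actually registered; looking it up via
# get() first would implicitly register it, which is the side effect noted above.

def remove_if_registered(catalog, name):
    if name in catalog:
        del catalog[name]
        return True
    return False
```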
      
      Reviewed By: ppwwyyxx
      
      Differential Revision: D31049303
      
      fbshipit-source-id: 149168fb89fd3b661b60717ff2aafa7a9bd52849
  15. 18 Sep, 2021 2 commits
  16. 15 Sep, 2021 2 commits
  17. 10 Sep, 2021 1 commit
  18. 09 Sep, 2021 1 commit
  19. 08 Sep, 2021 1 commit
  20. 02 Sep, 2021 2 commits
    • Increase limit on number of detections per image in {COCO,LVIS}Evaluator · 2fb273ab
      Lydia Chan authored
      Summary:
      ## Context
      - The current limit on the number of detections per image (`K`) in LVIS is 300.
      - Implementing AP_pool/AP_fixed requires removing this default limit on `K`
      - [Literature](https://arxiv.org/pdf/2102.01066.pdf) has shown that increasing `K` correlates with AP gains
      
      ## This Diff
      - Changed limit on number of detections per image (`K`) to be customizable for LVIS and COCO through `TEST.DETECTIONS_PER_IMAGE` in the config
         - For COCO:
             - Maintain the default `max_dets_per_image` to be [1, 10, 100] as from [COCOEval](https://www.internalfb.com/code/fbsource/[88bb57c3054a]/fbcode/deeplearning/projects/cocoApi/PythonAPI/pycocotools/cocoeval.py?lines=28-29)
             - Allow users to input a custom integer for `TEST.DETECTIONS_PER_IMAGE` in the config, and use  [1, 10, `TEST.DETECTIONS_PER_IMAGE`] for COCOEval
         - For LVIS:
             - Maintain the default `max_dets_per_image` to be 300 as from [LVISEval](https://www.internalfb.com/code/fbsource/[f6b86d023721]/fbcode/deeplearning/projects/lvisApi/lvis/eval.py?lines=528-529)
             - Allow users to input a custom integer for `TEST.DETECTIONS_PER_IMAGE` in the config, and use this in LVISEval
      - Added `COCOevalMaxDets` for evaluating AP with the custom limit on number of detections per image (since default `COCOeval` uses 100 as limit on detections per image for evaluating AP)
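The per-dataset limits described above could be assembled roughly as follows (function names are hypothetical):

```python
# COCOeval's default thresholds are [1, 10, 100]; a custom
# TEST.DETECTIONS_PER_IMAGE replaces the last entry. LVISEval takes a single
# limit, defaulting to 300.

def coco_max_dets(detections_per_image=None):
    if detections_per_image is None:
        return [1, 10, 100]
    return [1, 10, detections_per_image]

def lvis_max_dets(detections_per_image=None):
    return 300 if detections_per_image is None else detections_per_image
```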
      
      ## Inference Runs using this Diff
      - Performed inference using `K = {300, 1000, 10000, 100000}`
      - Launched fblearner flows for object detector baseline models with N1055536 (LVIS) and N1055756 (COCO)
        - Recorded [results of running inference](https://docs.google.com/spreadsheets/d/1rgdjN2KvxcYfKCkGUC4tMw0XQJ5oZL0dwjOIh84YRg8/edit?usp=sharing)
      
      Reviewed By: ppwwyyxx
      
      Differential Revision: D30077359
      
      fbshipit-source-id: 372eb5e0d7c228fb77fe23bf80d53597ec66287b
    • clamp reference point max to 1.0 to avoid NaN in regressed bbox · 0a38f8c8
      Zhicheng Yan authored
      Summary:
      For training DF-DETR with a swin-transformer backbone, which uses a large size_divisibility of 224 (= 32 * 7) and thus potentially has more zero-padding, we find the regressed box can contain NaN values and fail the assertion here (https://fburl.com/code/p27ztcce).
      
      This issue might be caused by two potential reasons.
      - Fix 1. In DF-DETR encoder, the reference points prepared by `get_reference_points()` can contain normalized x,y coordinates larger than 1 due to the rounding issues during mask interpolation across feature scales (specific examples can be given upon request LoL). Thus, we clamp max of x,y coordinates to 1.0.
      
      - Fix 2. The MLP used in the bbox_embed heads contains 3 FC layers, which might be too many. We introduce an argument `BBOX_EMBED_NUM_LAYERS` to allow users to configure the number of FC layers. This change is backward-compatible.
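Fix 1 amounts to a coordinate clamp; below is a pure-Python stand-in (the actual implementation would clamp a tensor, e.g. with `clamp(max=1.0)` in PyTorch):

```python
# Clamp normalized reference-point coordinates so rounding during mask
# interpolation cannot push them above 1.0.

def clamp_reference_points(points, max_val=1.0):
    """points: iterable of (x, y) normalized coordinates."""
    return [(min(x, max_val), min(y, max_val)) for x, y in points]
```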
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D30661167
      
      fbshipit-source-id: c7e94983bf1ec07426fdf1b9d363e5163637f21a
  21. 31 Aug, 2021 2 commits