Commits · 2ffcc5acd69fef69dd0ea7a6779d45db33583099 · OpenDAS / d2go

05 Mar, 2022 1 commit

Fix test after optimization tracking · 2ffcc5ac

Ananth Subramaniam authored Mar 04, 2022

Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/188

Reviewed By: tangbinh, wat3rBro

Differential Revision: D34658350

fbshipit-source-id: 36e8c1e8c5dab97990b1d9a5b1a58667e0e3c455

2ffcc5ac

04 Mar, 2022 4 commits

Refactor codebase to use `trainer.loggers` over `trainer.logger` when needed (#11920) · c24d9370

Binh Tang authored Mar 04, 2022

Summary:
### New commit log messages
- [7e2f9fbad Refactor codebase to use `trainer.loggers` over `trainer.logger` when needed (#11920)](https://github.com/PyTorchLightning/pytorch-lightning/pull/11920)

Reviewed By: edward-io

Differential Revision: D34583686

fbshipit-source-id: 98e557b761555c24ff296fff3ec6881d141fa777

c24d9370

delay import for diskcache · d3115faf

Yanghan Wang authored Mar 04, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/185

`DiskCachedDatasetFromList` originally lived in `d2go/data/utils.py`, so the class was always declared by default. As a result, the cleanup call (https://fburl.com/code/cu7hswhx) always ran, even when the feature was not enabled. This diff moves the class to a new place and delays the import, so the cleanup won't run unless the feature is used.
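A minimal sketch of the delayed-import pattern described above. The helper name `load_optional` is hypothetical, and the stdlib module `colorsys` is only a stand-in for the module that would hold `DiskCachedDatasetFromList`; the point is that nothing at module level of the optional module runs unless the feature flag is on.

```python
import importlib


def load_optional(module_name: str, enabled: bool):
    """Import `module_name` only when the feature is enabled.

    Hypothetical helper: when `enabled` is False the module is never
    imported, so none of its import-time side effects (for example
    registering a cleanup handler) are triggered.
    """
    if not enabled:
        return None
    return importlib.import_module(module_name)


# "colorsys" is a stand-in for the real disk-cache module.
disabled = load_optional("colorsys", enabled=False)
enabled = load_optional("colorsys", enabled=True)
print(disabled, enabled.__name__)
```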

Reviewed By: tglik

Differential Revision: D34601363

fbshipit-source-id: 734bb9b2c7957d7437ad40c4bfe60a441ec2f23a

d3115faf

add option to filter empty annotations · d369931a

Sam Tsai authored Mar 04, 2022

Summary: Add option for controlling empty annotation filtering.

Reviewed By: zhanghang1989

Differential Revision: D34365265

fbshipit-source-id: 261c6879636f19138de781098f47dee4909de9e7

d369931a

refactored extended coco · cb41f780

Sam Tsai authored Mar 04, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/179

Refactored extended COCO to fix lint errors and simplify error reporting.

Differential Revision: D34365252

fbshipit-source-id: 8bf221eba5b8c5e63ddcf5ca19d7486726aff797

cb41f780

03 Mar, 2022 1 commit

Integrate AIEnv with D2Go train_net · d8bdc633

Tsahi Glik authored Mar 02, 2022

Summary:
Add support in d2go.distributed for the `env://` init method. Use environment variables as specified in https://pytorch.org/docs/stable/distributed.html#environment-variable-initialization to initialize the distributed params.

Also change the train_net CLI function signature to accept an args list instead of only using `sys.argv`, which allows calling this function from the AIEnv launcher.
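The two changes above can be sketched as follows. The environment variable names (`MASTER_ADDR`, `MASTER_PORT`, `WORLD_SIZE`, `RANK`) are the ones listed in the linked PyTorch documentation; the function names `read_dist_env` and `cli`, and the flag names, are assumptions made for illustration and are not taken from the d2go source.

```python
import argparse
import os
from typing import Dict, List, Mapping, Optional, Union


def read_dist_env(environ: Mapping[str, str] = os.environ) -> Dict[str, Union[str, int]]:
    """Collect the parameters that the `env://` init method reads."""
    return {
        "master_addr": environ["MASTER_ADDR"],
        "master_port": int(environ["MASTER_PORT"]),
        "world_size": int(environ["WORLD_SIZE"]),
        "rank": int(environ["RANK"]),
    }


def cli(args: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse an explicit args list; `None` falls back to `sys.argv[1:]`."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config-file", default="")  # hypothetical flag
    parser.add_argument("--dist-url", default="env://")  # hypothetical flag
    return parser.parse_args(args)


# A launcher can now call the entry point directly, without touching sys.argv.
parsed = cli(["--config-file", "faster_rcnn.yaml"])
params = read_dist_env(
    {"MASTER_ADDR": "127.0.0.1", "MASTER_PORT": "29500", "WORLD_SIZE": "2", "RANK": "0"}
)
```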

Differential Revision: D34540275

fbshipit-source-id: 7f718aed4c010b0ac8347d43b5ca5b401210756c

d8bdc633

01 Mar, 2022 1 commit

Allow Users to Disable the Evaluation after the Last Training Iteration · f16cc060

Tong Xiao authored Feb 28, 2022

Summary:
`Detectron2GoRunner` by default triggers an evaluation right after the last iteration in the `runner.do_train` method. This is sometimes unnecessary, because there is a `runner.do_test` at the end of training anyway.

It can also lead to side effects. For example, it causes the training and test data loaders to be present at the same time, which led to an OOM issue in our use case.

In this diff, we add an option `eval_after_train` in the `EvalHook` to allow users to disable the evaluation after the last training iteration.
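A simplified sketch of how such a flag could behave. `SimpleEvalHook` and `run` are hypothetical stand-ins that do not depend on a trainer object; only the option name `eval_after_train` comes from the summary above.

```python
from typing import Callable


class SimpleEvalHook:
    """Hypothetical, trainer-free stand-in for an evaluation hook."""

    def __init__(
        self,
        eval_period: int,
        eval_function: Callable[[], None],
        eval_after_train: bool = True,
    ) -> None:
        self.eval_period = eval_period
        self.eval_function = eval_function
        self.eval_after_train = eval_after_train

    def after_step(self, iteration: int, max_iter: int) -> None:
        next_iter = iteration + 1
        is_final = next_iter >= max_iter
        # Periodic evaluation; the final iteration is left to after_train().
        if self.eval_period > 0 and next_iter % self.eval_period == 0 and not is_final:
            self.eval_function()

    def after_train(self) -> None:
        # The new option lets users skip the evaluation after the last iteration.
        if self.eval_after_train:
            self.eval_function()


def run(max_iter: int, hook: SimpleEvalHook) -> None:
    """Drive the hook the way a minimal training loop would."""
    for iteration in range(max_iter):
        hook.after_step(iteration, max_iter)
    hook.after_train()
```

With `eval_after_train=False`, only the periodic evaluations run, and a separate test step at the end of training remains the single final evaluation.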

Reviewed By: wat3rBro

Differential Revision: D34295685

fbshipit-source-id: 3612eb649bb50145346c56c072ae9ca91cb199f5

f16cc060

28 Feb, 2022 2 commits

fix misc tests related to quantization · 02b00614

Yanghan Wang authored Feb 28, 2022

Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/184

Reviewed By: zhanghang1989

Differential Revision: D34529248

fbshipit-source-id: f77882dae7de336da77ac9bb7c35cfc1e8d541af

02b00614

use the same lightning version on Github as in fbcode · afee4377

Yanghan Wang authored Feb 28, 2022

Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/183

Reviewed By: zhanghang1989

Differential Revision: D34492204

fbshipit-source-id: 7fd459172e83a5015ca9eee0e2018ce8b22c3096

afee4377

25 Feb, 2022 1 commit

add option to use disk cache to store underlying dataset · 87374efb

Yanghan Wang authored Feb 24, 2022

Summary:
# TLDR: To use this feature, set `D2GO_DATA.DATASETS.DISK_CACHE.ENABLED` to `True`.

To support larger datasets, one idea is to offload the `DatasetFromList` from RAM to disk to avoid OOM. `DiskCachedDatasetFromList` is a drop-in replacement for `DatasetFromList`: during `__init__`, it puts the serialized list onto the disk and only stores the mapping in RAM (the mapping could be represented by a list of addresses or even just a single number, e.g. every N items are grouped together and N is a fixed number); `__getitem__` then reads data from disk and deserializes the element. Some more details:
- Originally the RAM cost is `O(s*G*N)`, where `s` is the average data size, `G` is the number of GPUs, and `N` is the dataset size. When diskcache is enabled, depending on the type of mapping, the final RAM cost is constant or `O(N)` with a very small coefficient; the final disk cost is `O(s*N)`.
- RAM usage peaks at the preparing stage, where the cost is `O(s*N)`. If this becomes a bottleneck, we probably need to think about modifying the data loading function (registered in DatasetCatalog). We also change the data loading function to run only on the local master process; otherwise RAM would peak at `O(s*G*N)` if all processes loaded data at the same time.
- The time overhead of initialization is linear in the dataset size and is capped by disk I/O speed and the performance of the diskcache library. A benchmark shows it can handle at least 1GB per minute if writing in chunks (much worse if not), which should be fine in most use cases.
- There is also a bit of time overhead when reading the data, but this is usually negligible compared with reading files from external storage like manifold.
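The idea above can be illustrated with a small stand-in. This is not the `DiskCachedDatasetFromList` implementation: it is a sketch that assumes a plain temporary file and `pickle` in place of the `diskcache` library, and the class name `DiskBackedList` is hypothetical. It keeps only a list of byte offsets in RAM (the "list of addresses" variant of the mapping) and deserializes one element per `__getitem__`.

```python
import os
import pickle
import tempfile
from typing import Any, List, Sequence


class DiskBackedList:
    """Hypothetical sketch: serialized items live on disk, offsets live in RAM."""

    def __init__(self, items: Sequence[Any]) -> None:
        fd, self._path = tempfile.mkstemp(suffix=".pkl")
        self._offsets: List[int] = []
        with os.fdopen(fd, "wb") as f:
            for item in items:
                # Remember where each serialized item starts.
                self._offsets.append(f.tell())
                pickle.dump(item, f)

    def __len__(self) -> int:
        return len(self._offsets)

    def __getitem__(self, idx: int) -> Any:
        # Read and deserialize a single element from disk.
        with open(self._path, "rb") as f:
            f.seek(self._offsets[idx])
            return pickle.load(f)

    def close(self) -> None:
        """Remove the backing file."""
        if os.path.exists(self._path):
            os.remove(self._path)


dataset = DiskBackedList([{"image_id": i} for i in range(5)])
item = dataset[3]
dataset.close()
```

In this variant the RAM cost is one integer per element, which corresponds to the `O(N)` case with a small coefficient, while the disk cost is the size of the serialized data.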

It's not very easy to integrate this into D2/D2Go cleanly without patching the code. Several approaches:
- Integrate into D2 directly (modifying D2's `DatasetFromList` and `get_detection_dataset_dicts`): this might be the cleanest way, but D2 doesn't depend on `diskcache` and this is a bit experimental right now.
- D2Go uses its own version of [_train_loader_from_config](https://fburl.com/code/0gig5tj2) that wraps the returned `dataset`. This has two issues: 1) it's hard to make the underlying `get_detection_dataset_dicts` run only on the local master, partly because building the sampler uses `comm.shared_random_seed()`, so things can easily go out of sync; 2) it needs some duplicated code for the test loader.
- Pass new arguments along the way: this requires touching D2's code as well, and we need to carry the new arguments in a lot of places.

Lots of TODOs:
- Automatically enable this when dataset is larger than certain threshold (need to figure out how to do this in multiple GPUs, some communication is needed if only local master is reading the dataset).
- better cleanups
- figure out the best way of integrating this (patching is a bit hacky) into D2/D2Go.
- run more benchmarks
- add unit test (maybe also enable integration tests using 2 nodes 2 GPUs for distributed settings)

Reviewed By: sstsai-adl

Differential Revision: D27451187

fbshipit-source-id: 7d329e1a3c3f9ec1fb9ada0298a52a33f2730e15

87374efb

24 Feb, 2022 1 commit

exclude d2go&project lib from .gitignore · fb0164c3

Yanghan Wang authored Feb 24, 2022

Summary: It's possible to have `lib` under core `mobile-vision/d2go/{d2go,projects}`, so exclude them from `.gitignore`.

Reviewed By: zhanghang1989

Differential Revision: D34288538

fbshipit-source-id: 7094cdf4f52263fbf6ff6707d487bc3328fbbd8b

fb0164c3

23 Feb, 2022 3 commits

Replace deprecated DDP accelerator with ddp_find_unused_parameters_false · eb54efa2

Binh Tang authored Feb 23, 2022

Summary: We proactively remove references to the deprecated DDP accelerator to prepare for the breaking changes following the release of PyTorch Lightning 1.6 (see T112240890).

Differential Revision: D34295318

fbshipit-source-id: 7b2245ca9c7c2900f510722b33af8d8eeda49919

eb54efa2

support using specified registration function for adhoc datasets · 7778f667

Sam Tsai authored Feb 23, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/mobile-vision/pull/61

Pull Request resolved: https://github.com/facebookresearch/d2go/pull/177

Adhoc datasets currently use the default register functions. Changed to check whether the dataset was registered in a lookup table for injected COCO and, if so, to use that registration function instead.
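A minimal sketch of the lookup described above. All names here (`REGISTER_FUNCS`, `default_register`, `register_adhoc`) are hypothetical and chosen for illustration; the real registration functions and table are not shown in this commit message.

```python
from typing import Callable, Dict

# Hypothetical lookup table: source dataset name -> registration function.
REGISTER_FUNCS: Dict[str, Callable[[str], str]] = {}


def default_register(name: str) -> str:
    """Stand-in for the default registration function."""
    return f"default:{name}"


def register_adhoc(src_name: str, new_name: str) -> str:
    """Use the function recorded for `src_name`, else fall back to the default."""
    register_func = REGISTER_FUNCS.get(src_name, default_register)
    return register_func(new_name)


# A dataset injected with a custom registration function keeps using it.
REGISTER_FUNCS["my_injected_coco"] = lambda name: f"custom:{name}"
custom_result = register_adhoc("my_injected_coco", "my_injected_coco_adhoc")
default_result = register_adhoc("plain_coco", "plain_coco_adhoc")
```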

Differential Revision: D33489049

fbshipit-source-id: bcb12bba49749a875ea80ae61f4eecc4a5d1e31a

7778f667

lightning - deprecating distributed backend, switching to use_ddp · 00409af8

Sam Tsai authored Feb 22, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/180

Distributed backend is deprecated. Switching to use "use_ddp" instead.

Reviewed By: kazhang

Differential Revision: D34394993

fbshipit-source-id: a5bfb22f8952d20c9a8d86322cd740534c25c689

00409af8

14 Feb, 2022 1 commit

D2Go Fail Fast: Move exception coming from not implemented "compare accuracy" feature to the top. · eee4dfc1

Tugrul Savran authored Feb 14, 2022

Summary:
Currently, the exporter method takes a `compare_accuracy` parameter, which raises an exception if it is set to True, but only after all the compute (exporting etc.) has finished.

This looks like an antipattern and wastes compute.

Therefore, I am proposing to raise the exception at the very beginning of method call to let the client know in advance that this argument's functionality isn't implemented yet.
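The fail-fast ordering can be sketched as below. The function `export_model`, its `work_log` argument, and the choice of `NotImplementedError` are assumptions for illustration; the summary only says that an exception is raised.

```python
from typing import Any, List


def export_model(model: Any, work_log: List[str], compare_accuracy: bool = False) -> Any:
    """Hypothetical exporter that validates its arguments before doing work."""
    # Fail fast: reject the unsupported option before any expensive compute.
    if compare_accuracy:
        raise NotImplementedError("compare_accuracy is not implemented yet")

    # Placeholder for the expensive export steps.
    work_log.append("exported")
    return model


log: List[str] = []
try:
    export_model("model", log, compare_accuracy=True)
    raised = False
except NotImplementedError:
    raised = True
```

Because the check comes first, `log` stays empty when the call is rejected, so no compute is wasted.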

NOTE: We might also choose to get rid of the entire parameter. I am open to suggestions.

Differential Revision: D34186578

fbshipit-source-id: d7fbe7589dfe2d2f688b870885ca61e6829c9329

eee4dfc1

11 Feb, 2022 1 commit

add inference path · 614336e4

Yanghan Wang authored Feb 11, 2022

Reviewed By: Maninae

Differential Revision: D34097529

fbshipit-source-id: e3c860bb2374e694fd6ae54651a479c2398b2462

614336e4

10 Feb, 2022 1 commit

set is_qat properly when fusing model · ac7be4fa

Yanghan Wang authored Feb 09, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/175

D33833203 adds an `is_qat` argument to the fuser method; more details in https://fb.workplace.com/groups/2322282031156145/permalink/5026297484087906/. As a result, MV's `fuse_utils.fuse_model` becomes two functions: the original one is for non-QAT; a new one, `fuse_utils.fuse_model_qat`, is for QAT.

For D2Go, in most cases `is_qat` can be inferred from `cfg.QUANTIZATION.QAT.ENABLED`; therefore we can extend `fuse_model` to also take `is_qat` as a parameter and set it accordingly.

This diff updates all the call sites that are covered by unit tests. Those call sites include:
- default quantization APIs in d2go/modeling/quantization.py
- customized quantization APIs from individual meta-arch
- unit test itself
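A rough sketch of inferring `is_qat` from the config at a call site. The functions below are hypothetical stand-ins: `fuse_model` here only reports which fuser path would be chosen, and the config is modeled with `SimpleNamespace` rather than the real config class.

```python
from types import SimpleNamespace
from typing import Any, Tuple


def fuse_model(model: Any, is_qat: bool = False) -> Tuple[str, Any]:
    """Stand-in dispatcher: returns the name of the fuser path that would run."""
    fuser_name = "fuse_model_qat" if is_qat else "fuse_model"
    return fuser_name, model


def fuse_from_cfg(cfg: Any, model: Any) -> Tuple[str, Any]:
    """Infer `is_qat` from the config, as the summary above describes."""
    return fuse_model(model, is_qat=cfg.QUANTIZATION.QAT.ENABLED)


def make_cfg(qat_enabled: bool) -> SimpleNamespace:
    """Build a tiny config-like object exposing QUANTIZATION.QAT.ENABLED."""
    return SimpleNamespace(
        QUANTIZATION=SimpleNamespace(QAT=SimpleNamespace(ENABLED=qat_enabled))
    )
```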

Reviewed By: tglik, jerryzh168

Differential Revision: D34112650

fbshipit-source-id: 026c309f603bee71d887e39aa4efee6477db731b

ac7be4fa

07 Feb, 2022 1 commit

DETR Model Export · 5aadaaa4

Hang Zhang authored Feb 07, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/169

Make d2go DETR exportable (torchscript compatible)
Move generating masks to preprocessing

Reviewed By: sstsai-adl

Differential Revision: D33798073

fbshipit-source-id: d629b0c9cbdb67060982be717c7138a0e7e9adbc

5aadaaa4

03 Feb, 2022 1 commit

Decouple utilities from `LightningLoggerBase` (#11484) · 6791682f

Ning Li (Seattle) authored Feb 03, 2022

Summary:
### New commit log messages
- [115a5d08e Decouple utilities from `LightningLoggerBase` (#11484)](https://github.com/PyTorchLightning/pytorch-lightning/pull/11484)

Reviewed By: tangbinh, wat3rBro

Differential Revision: D33960185

fbshipit-source-id: 6be72ad49f8433be6f238b36aa82d3f1b655e6f0

6791682f

02 Feb, 2022 1 commit

fbcode/mobile-vision · 2f0e1c92

Steven Troxler authored Feb 02, 2022

Summary:
Convert type comments in fbcode/mobile-vision

Produced by running:
```
python -m  libcst.tool codemod convert_type_comments.ConvertTypeComment fbcode/mobile-vision
```
from fbsource.

See
https://fb.workplace.com/groups/pythonfoundation/permalink/3106231549690303/

Reviewed By: grievejia

Differential Revision: D33897026

fbshipit-source-id: e7666555e47a9abc769975f6db6b2e6eda792d72

2f0e1c92

29 Jan, 2022 1 commit

Parse SuperNet search space from config · a1b1bf94

Tsahi Glik authored Jan 28, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/168

Add a hook in D2Go config for custom parsing so we can support custom objects in D2Go config, like the search space objects.
Then add SuperNet custom config processing to parse the search space from arch_def when supernet is enabled, so it can be used in D2Go SuperNet training.

This is an alternative approach to D33191150. In this approach we parse the entire architecture as a search space which will not have the limitations that we have in parsing only the dynamic blocks parts.

Reviewed By: zhanghang1989

Differential Revision: D33793423

fbshipit-source-id: 8acf5c5afb3c5c0005bdb0ca16847026e1b45e2c

a1b1bf94

27 Jan, 2022 3 commits

Remove Pre-norm option, since it is not used · 6994f168

Hang Zhang authored Jan 26, 2022

Summary: As in the title

Reviewed By: XiaoliangDai

Differential Revision: D33413849

fbshipit-source-id: b891849c175edc7b8916bff2fcc40c76c4658f14

6994f168

Enable Learnable Query TGT · 9200cbe8

Hang Zhang authored Jan 26, 2022

Summary: Learnable query doesn't improve the results, but it helps DETR with reference points in D33420993

Reviewed By: XiaoliangDai

Differential Revision: D33401417

fbshipit-source-id: 5296f2f969c04df18df292d61a7cf57107bc9b74

9200cbe8

Refactor Code Base · 4985ef73

Hang Zhang authored Jan 26, 2022

Summary: Add DETR_MODEL_REGISTRY registry to better support different variants of DETR (in a later diff).

Reviewed By: newstzpz

Differential Revision: D32874194

fbshipit-source-id: f8e9a61417ec66bec9f2d98631260a2f4e2af4cf

4985ef73

20 Jan, 2022 1 commit

fix pickling issue in EnlargeBoundingBox · 189d83d7

Sam Tsai authored Jan 20, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/166

Pickling of transform functions seems to have changed (did not dig into it) in December, breaking the support for this augmentation. This error happens when training with multiple dataloaders. Using partial functions instead.
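The general issue can be reproduced with the standard library alone. In the sketch below, `scale_box` is a hypothetical stand-in for the box-enlarging transform; it shows that a locally defined lambda cannot be pickled, while the same behavior expressed with `functools.partial` over a module-level function avoids the closure.

```python
import pickle
from functools import partial
from typing import Any, Tuple

Box = Tuple[float, float, float, float]


def scale_box(box: Box, factor: float) -> Box:
    """Hypothetical transform: scale a box (x0, y0, x1, y1) around its center."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w, half_h = (x1 - x0) / 2 * factor, (y1 - y0) / 2 * factor
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def make_closure(factor: float):
    """Returns a local lambda, which pickle cannot serialize."""
    return lambda box: scale_box(box, factor)


def can_pickle(obj: Any) -> bool:
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


closure_transform = make_closure(2.0)
# The partial only references a module-level function plus its arguments.
partial_transform = partial(scale_box, factor=2.0)
```

Worker processes that receive their dataset and transforms via pickling need the second form.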

Differential Revision: D33665177

fbshipit-source-id: 4dfd41b92f3a6fea549b6e7a79bf0bf14a3cceaa

189d83d7

18 Jan, 2022 1 commit

Fix type signature of create_runner · c74e23b0

Miquel Jubert Hermoso authored Jan 18, 2022

Summary: The type signature of create_runner is not accurate. We expect lightning runners to follow DefaultTask. Also change setup.py to not import directly, which was causing circular dependencies together with the change.

Reviewed By: wat3rBro

Differential Revision: D32792069

fbshipit-source-id: 0fbb55eb269dd681dbc8df49d71c9635f56293b8

c74e23b0

14 Jan, 2022 1 commit

support multiple image visualization in dataloader visualization wrapper · 9c877fd4

Sam Tsai authored Jan 13, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/160

If the returned object of visualize_train_input is a dictionary, use the key as tag suffix and the values as separate output images.
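A small sketch of the dictionary handling described above. The helper name `to_tagged_images` and the `_` separator between tag and key are assumptions; the summary only states that the key becomes a tag suffix.

```python
from typing import Any, Dict


def to_tagged_images(tag: str, vis_output: Any) -> Dict[str, Any]:
    """Map a visualization result to {tag: image} pairs.

    A dict result yields one entry per key, with the key used as a tag
    suffix; any other result is treated as a single image.
    """
    if isinstance(vis_output, dict):
        return {f"{tag}_{key}": image for key, image in vis_output.items()}
    return {tag: vis_output}


single = to_tagged_images("train_input", "image")
multiple = to_tagged_images("train_input", {"rgb": "image_a", "mask": "image_b"})
```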

Reviewed By: zhanghang1989, wat3rBro

Differential Revision: D33468573

fbshipit-source-id: b0a47ba312ff59700534e917c62af1dfa83dd5be

9c877fd4

13 Jan, 2022 2 commits

Add support for custom training step via meta_arch · b6e244d2

Tsahi Glik authored Jan 13, 2022

Summary:
Add support in the default lightning task to run a custom training step from the Meta Arch if one exists.
The goal is to allow a custom training step without the need to inherit from the default lightning task class and override it. This will allow us to use a single lightning task and still allow users to customize the training step. In the long run this will be further encapsulated in a modeling hook, making it more modular and composable with other custom code.
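One way this delegation could look, as a sketch only. The class names and the attribute name `training_step` on the model are assumptions; the actual hook name used by the d2go lightning task is not given in this summary.

```python
from typing import Any


class DefaultTaskSketch:
    """Hypothetical task that defers to the model's own training step if present."""

    def __init__(self, model: Any) -> None:
        self.model = model

    def training_step(self, batch: Any, batch_idx: int) -> float:
        custom_step = getattr(self.model, "training_step", None)
        if callable(custom_step):
            # The meta-arch supplies its own step; no task subclass needed.
            return custom_step(batch, batch_idx)
        return self._default_training_step(batch, batch_idx)

    def _default_training_step(self, batch: Any, batch_idx: int) -> float:
        # Default behavior: sum the loss dict returned by the model.
        return sum(self.model(batch).values())


class PlainModel:
    def __call__(self, batch: Any) -> dict:
        return {"loss_cls": 1.0, "loss_box": 2.0}


class CustomModel(PlainModel):
    def training_step(self, batch: Any, batch_idx: int) -> float:
        return 42.0
```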

This change is a follow up from discussion in https://fburl.com/diff/yqlsypys

Reviewed By: wat3rBro

Differential Revision: D33534624

fbshipit-source-id: 560f06da03f218e77ad46832be9d741417882c56

b6e244d2

Person segmentation using torch lightning · c687fb83

Tsahi Glik authored Jan 12, 2022

Summary:
Add option to train Person Instance Segmentation using lightning instead of D2.
This is needed because we want to try PIS with SuperNet, and our SuperNet-based training is implemented in the d2go lightning task.

Reviewed By: zhanghang1989

Differential Revision: D33281437

fbshipit-source-id: e1b6567f3c77ce51240fb50d81350bc97735713a

c687fb83

12 Jan, 2022 1 commit

workaround the quantization for FPN · 9d649b1e

Yanghan Wang authored Jan 11, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/163

Make quantizing FPN work. Note that this is not a proper fix; a proper fix might be making PyTorch pick up D2's Conv2d, and we need to revert this diff once that is supported.

Differential Revision: D33523917

fbshipit-source-id: 3d00f540a9fcb75a34125c244d86263d517a359f

9d649b1e

10 Jan, 2022 1 commit

Updated scaling rules for base_lr_end and quantization. · 02ecf002

Peizhao Zhang authored Jan 10, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/161

Updated scaling rules for base_lr_end and quantization.

Reviewed By: zhanghang1989, wat3rBro

Differential Revision: D33292860

fbshipit-source-id: c7a8747c8fb1f894d3c5508bbd607b3d1ef3d400

02ecf002

08 Jan, 2022 2 commits

Add deprecation path for renamed training type plugins (#11227) · fcd51171

Binh Tang authored Jan 08, 2022

Summary:
### New commit log messages
  4eede7c30 Add deprecation path for renamed training type plugins (#11227)

Reviewed By: edward-io, daniellepintz

Differential Revision: D33409991

fbshipit-source-id: 373e48767e992d67db3c85e436648481ad16c9d0

fcd51171

add unit test for visualization module · 0c269744

Sam Tsai authored Jan 08, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/158

Add unit tests for visualization wrapper and dataloader visualization wrapper.

Reviewed By: zhanghang1989, wat3rBro

Differential Revision: D33457734

fbshipit-source-id: e5f946ae4ee711a0914d8ac65b96cac40e7ab13b

0c269744

07 Jan, 2022 1 commit

Fix EMA model training with lightning · 6cff7737

Tsahi Glik authored Jan 07, 2022

Summary:
Current implementation of d2go lightning default task fails when running a model training with EMA.
The error is :
```
RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss.
```
The error is due to the fact that the d2go lightning task creates a copy of the EMA model for evaluation that is not included in training, which raises the error that there are unused params.
This is solved by moving the copy creation to after training, when evaluation starts.

Reviewed By: kazhang

Differential Revision: D33442690

fbshipit-source-id: e9e469e33811de0b4171a64293cc16a8157af08c

6cff7737

06 Jan, 2022 1 commit

Rename `DDPPlugin` to `DDPStrategy` (#11142) · aeb15613

Binh Tang authored Jan 05, 2022

Summary:
### New commit log messages
  b64dea9dc Rename `DDPPlugin` to `DDPStrategy` (#11142)

Reviewed By: jjenniferdai

Differential Revision: D33259306

fbshipit-source-id: b4608c6b96b4a7977eaa4ed3f03c4b824882aef0

aeb15613

05 Jan, 2022 1 commit

Try LSJ on Faster RCNN with FBNet · 21ae9538

Hang Zhang authored Jan 05, 2022

Summary: Try LSJ with Faster RCNN with FBNet backbone

Reviewed By: newstzpz

Differential Revision: D32054932

fbshipit-source-id: 4fdb30e7b1258d6f167f2c2fd331209aad1b599a

21ae9538

30 Dec, 2021 2 commits

make model zoo usable internally · c12469c2

Yanghan Wang authored Dec 29, 2021

Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/152

Reviewed By: zhanghang1989

Differential Revision: D31591900

fbshipit-source-id: 6ee8124419d535caf03532eda4f729e707b6dda7

c12469c2

update d2go_beginner.ipynb for exporting model · 06f3f2e8

Yanghan Wang authored Dec 29, 2021

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/155

- remove tracing, which shouldn't affect anything.
- create the model in cpu mode, since there might be issues casting the model.

Reviewed By: zhanghang1989

Differential Revision: D33357269

fbshipit-source-id: 27a0330ebb12b993744dee47151c3056cd584ccf

06f3f2e8

29 Dec, 2021 2 commits

remove caffe2 from oss CI · 30cf78be

Yanghan Wang authored Dec 29, 2021

Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/154

Reviewed By: zhanghang1989

Differential Revision: D33352204

fbshipit-source-id: e1a9ac6eb2574dfe6931435275e27c9508f66352

30cf78be

fix import error for DDPPlugin in oss · 62a97445

Yanghan Wang authored Dec 29, 2021

Summary: DDPPlugin has been renamed to DDPStrategy (as part of https://github.com/PyTorchLightning/pytorch-lightning/issues/10549), causing oss CI to fail. Simply skipping the import to unblock CI since DDP feature is not used in test.

Reviewed By: kazhang

Differential Revision: D33351636

fbshipit-source-id: 7a1881c8cd48d9ff17edd41137d27a976103fdde

62a97445