Commits · 33ca49acb214c89b9dd7c9a5fb538553dc4736bc · OpenDAS / d2go

02 Jun, 2022 1 commit

better cleanup for disk_cache · 33ca49ac

Yanghan Wang authored Jun 02, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/267

The test https://www.internalfb.com/intern/test/281475036363610?ref_report_id=0 is flaky because multiple tests run at the same time and cleanup is not handled well in that case.

Reviewed By: tglik

Differential Revision: D36787035

fbshipit-source-id: 6a478318fe011af936dd10fa564519c8c0615ed3


29 Apr, 2022 1 commit

add __init__ to d2go.quantization · b117baf1

Yanghan Wang authored Apr 28, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/228

This diff solves https://github.com/facebookresearch/d2go/issues/226

Reviewed By: tglik

Differential Revision: D36026321

fbshipit-source-id: 216b0bf7bc48c45deb093c238d70de2b40bc37a3


19 Apr, 2022 1 commit

apply import merging for fbcode/mobile-vision/d2go (3 of 4) · ae2f2f64

Lisa Roach authored Apr 19, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/212

Applies new import merging and sorting from µsort v1.0.

When merging imports, µsort will make a best-effort to move associated
comments to match merged elements, but there are known limitations due to
the dynamic nature of Python and developer tooling. These changes should
not produce any dangerous runtime changes, but may require touch-ups to
satisfy linters and other tooling.

Note that µsort uses case-insensitive, lexicographical sorting, which
results in a different ordering compared to isort. This provides a more
consistent sorting order, matching the case-insensitive order used when
sorting import statements by module name, and ensures that "frog", "FROG",
and "Frog" always sort next to each other.
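
As a rough illustration of the ordering difference described above (this uses Python's built-in sorting, not µsort's actual implementation, and the names are made up), a case-insensitive key keeps the three spellings adjacent, while the default case-sensitive ordering separates them:

```python
# Illustrative only: built-in sorting, not µsort itself.
names = ["frog", "Zebra", "FROG", "apple", "Frog"]

# Default ordering compares code points, so uppercase letters sort first.
case_sensitive = sorted(names)

# Case-insensitive ordering; ties keep their original relative order.
case_insensitive = sorted(names, key=str.casefold)

print(case_sensitive)    # ['FROG', 'Frog', 'Zebra', 'apple', 'frog']
print(case_insensitive)  # ['apple', 'frog', 'FROG', 'Frog', 'Zebra']
```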

For details on µsort's sorting and merging semantics, see the user guide:
https://usort.readthedocs.io/en/stable/guide.html#sorting

Reviewed By: jreese, wat3rBro

Differential Revision: D35559673

fbshipit-source-id: feeae2465ac2b62c44a0e92dc566e9a386567c9d


05 Apr, 2022 1 commit

refactor create_fake_detection_data_loader · 312c6b62

Yanghan Wang authored Apr 04, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/199

- `create_fake_detection_data_loader` currently doesn't take `cfg` as input, but sometimes we need to test augmentations that require a different, more complicated `cfg`.
- The name is not great; rename it to `create_detection_data_loader_on_toy_dataset`.
- Previously, width/height were the resized size; change them to be the size of the data source (image files) and use `cfg` to control the resized size.

Update V3:
In V2 there were some test failures. The reason is that V2 builds the data loader (via the GeneralizedRCNN runner) using the actual test config instead of the default config used before this diff, plus the dataset name change. In V3 we use the test's runner instead of the default runner for consistency. This reveals some real bugs that we didn't test before.

Reviewed By: omkar-fb

Differential Revision: D35238890

fbshipit-source-id: 28a6037374e74f452f91b494bd455b38d3a48433


04 Mar, 2022 3 commits

delay import for diskcache · d3115faf

Yanghan Wang authored Mar 04, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/185

`DiskCachedDatasetFromList` was originally in `d2go/data/utils.py`, so the class was declared by default. As a result, the cleanup call (https://fburl.com/code/cu7hswhx) always ran, even when the feature was not enabled. This diff moves the class to a new place and delays the import, so the cleanup won't run unless the feature is used.
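
The effect of delaying the import can be sketched with a self-contained toy. Everything here is hypothetical (the module name `toy_disk_cache`, the class `ToyDiskCachedList`, and the `build_dataset` helper are stand-ins, not d2go code): a module that could carry import-time side effects is only imported when the feature is switched on.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a module whose import has side effects
# (for example, registering a cleanup hook at module level).
MODULE_SOURCE = """
class ToyDiskCachedList(list):
    pass
"""

module_dir = tempfile.mkdtemp()
Path(module_dir, "toy_disk_cache.py").write_text(MODULE_SOURCE)
sys.path.insert(0, module_dir)
importlib.invalidate_caches()


def build_dataset(items, disk_cache_enabled=False):
    if disk_cache_enabled:
        # Delayed import: the module is loaded only on this branch.
        toy_disk_cache = importlib.import_module("toy_disk_cache")
        return toy_disk_cache.ToyDiskCachedList(items)
    return list(items)


plain = build_dataset([1, 2, 3])
loaded_before = "toy_disk_cache" in sys.modules  # feature off: not imported

cached = build_dataset([1, 2, 3], disk_cache_enabled=True)
loaded_after = "toy_disk_cache" in sys.modules  # feature on: imported now
```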

Reviewed By: tglik

Differential Revision: D34601363

fbshipit-source-id: 734bb9b2c7957d7437ad40c4bfe60a441ec2f23a


add option to filter empty annotations · d369931a

Sam Tsai authored Mar 04, 2022

Summary: Add option for controlling empty annotation filtering.

Reviewed By: zhanghang1989

Differential Revision: D34365265

fbshipit-source-id: 261c6879636f19138de781098f47dee4909de9e7


refactored extended coco · cb41f780

Sam Tsai authored Mar 04, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/179

Refactored extended coco to fix lint errors and simplify error reporting.

Differential Revision: D34365252

fbshipit-source-id: 8bf221eba5b8c5e63ddcf5ca19d7486726aff797


25 Feb, 2022 1 commit

add option to use disk cache to store underlying dataset · 87374efb

Yanghan Wang authored Feb 24, 2022

Summary:
# TLDR: To use this feature, set `D2GO_DATA.DATASETS.DISK_CACHE.ENABLED` to `True`.

To support larger datasets, one idea is to offload the `DatasetFromList` from RAM to disk to avoid OOM. `DiskCachedDatasetFromList` is a drop-in replacement for `DatasetFromList`: during `__init__`, it puts the serialized list onto disk and stores only the mapping in RAM (the mapping could be a list of addresses or even a single number, e.g. every N items are grouped together and N is a fixed number); `__getitem__` then reads data from disk and deserializes the element. Some more details:
- Originally the RAM cost is `O(s*G*N)`, where `s` is the average data size, `G` is the number of GPUs, and `N` is the dataset size. When diskcache is enabled, depending on the type of mapping, the final RAM cost is constant or `O(N)` with a very small coefficient; the final disk cost is `O(s*N)`.
- RAM usage peaks at the preparation stage, where the cost is `O(s*N)`. If this becomes a bottleneck, we probably need to think about modifying the data loading function (registered in `DatasetCatalog`). We also change the data loading function to run only on the local master process; otherwise RAM would peak at `O(s*G*N)` if all processes loaded data at the same time.
- The time overhead of initialization is linear in dataset size and is bounded by disk I/O speed and the performance of the diskcache library. Benchmarks show it can handle at least 1GB per minute when writing in chunks (much worse if not), which should be fine in most use cases.
- There is also a small time overhead when reading the data, but this is usually negligible compared with reading files from external storage like manifold.
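
The core idea (serialize the list to disk, keep only a small mapping in RAM, deserialize on access) can be sketched as follows. This is a minimal illustration under stated assumptions, not the d2go implementation: it uses `pickle` and a temporary file instead of the `diskcache` library, the class name `ToyDiskBackedList` is hypothetical, and the mapping is a plain list of byte offsets.

```python
import os
import pickle
import tempfile


class ToyDiskBackedList:
    """Stores pickled items in one file; keeps only byte offsets in RAM."""

    def __init__(self, items):
        fd, self._path = tempfile.mkstemp(suffix=".bin")
        self._offsets = []  # the in-RAM mapping: one integer per item
        with os.fdopen(fd, "wb") as f:
            for item in items:
                self._offsets.append(f.tell())
                pickle.dump(item, f)

    def __len__(self):
        return len(self._offsets)

    def __getitem__(self, idx):
        # Read and deserialize a single element from disk on demand.
        with open(self._path, "rb") as f:
            f.seek(self._offsets[idx])
            return pickle.load(f)

    def close(self):
        # Explicit cleanup of the backing file.
        if os.path.exists(self._path):
            os.remove(self._path)


records = [{"file_name": f"img_{i}.jpg", "annotations": [i]} for i in range(5)]
dataset = ToyDiskBackedList(records)
item = dataset[3]
size = len(dataset)
dataset.close()
```

Here the RAM cost is one offset per item, which corresponds to the `O(N)` case with a small coefficient; grouping a fixed number of items per chunk would reduce the mapping further.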

It's not very easy to integrate this into D2/D2Go cleanly without patching the code. Several approaches:
- Integrate into D2 directly (modifying D2's `DatasetFromList` and `get_detection_dataset_dicts`): this might be the cleanest way, but D2 doesn't depend on `diskcache` and this is a bit experimental right now.
- D2Go uses its own version of [_train_loader_from_config](https://fburl.com/code/0gig5tj2) that wraps the returned `dataset`. This has two issues: 1) it's hard to make the underlying `get_detection_dataset_dicts` run only on the local master, partly because building the sampler uses `comm.shared_random_seed()`, so things can easily go out of sync; 2) it needs some duplicated code for the test loader.
- Pass new arguments along the way: this requires touching D2's code as well, and we need to carry the new arguments in a lot of places.

Lots of TODOs:
- Automatically enable this when the dataset is larger than a certain threshold (need to figure out how to do this with multiple GPUs; some communication is needed if only the local master is reading the dataset).
- better cleanups
- figure out the best way of integrating this (patching is a bit hacky) into D2/D2Go.
- run more benchmarks
- add unit test (maybe also enable integration tests using 2 nodes 2 GPUs for distributed settings)

Reviewed By: sstsai-adl

Differential Revision: D27451187

fbshipit-source-id: 7d329e1a3c3f9ec1fb9ada0298a52a33f2730e15


23 Feb, 2022 1 commit

support using specified registration function for adhoc datasets · 7778f667

Sam Tsai authored Feb 23, 2022

Summary:
Pull Request resolved: https://github.com/facebookresearch/mobile-vision/pull/61

Pull Request resolved: https://github.com/facebookresearch/d2go/pull/177

Adhoc datasets currently use the default register functions. Changed to check whether a function was registered in a lookup table for injected coco datasets and, if so, use that instead.
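
A lookup with a default fallback of this kind could look roughly like the sketch below. All names (`REGISTER_FUNCS`, `register_func`, `register_adhoc`, and the dataset names) are hypothetical and chosen for illustration; they are not the identifiers used in d2go.

```python
# Hypothetical lookup table from source dataset name to a registration function.
REGISTER_FUNCS = {}


def register_func(name):
    """Decorator that records a custom registration function under `name`."""
    def decorator(fn):
        REGISTER_FUNCS[name] = fn
        return fn
    return decorator


def default_register(dataset_name):
    return f"default:{dataset_name}"


@register_func("my_injected_coco")
def custom_register(dataset_name):
    return f"custom:{dataset_name}"


def register_adhoc(dataset_name, source_name):
    # Use the function registered for the source dataset if there is one,
    # otherwise fall back to the default.
    fn = REGISTER_FUNCS.get(source_name, default_register)
    return fn(dataset_name)


print(register_adhoc("adhoc_a", "my_injected_coco"))  # custom:adhoc_a
print(register_adhoc("adhoc_b", "plain_coco"))        # default:adhoc_b
```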

Differential Revision: D33489049

fbshipit-source-id: bcb12bba49749a875ea80ae61f4eecc4a5d1e31a


22 Dec, 2021 1 commit

registry and copy keys for extended coco load · bfd78461

Sam Tsai authored Dec 22, 2021

Summary:
1. Add registry for coco injection to allow for easier overriding of coco injections
2. Coco loading is currently limited to certain keys. Add an option to allow copying certain keys from the outputs.

Reviewed By: zhanghang1989

Differential Revision: D33132517

fbshipit-source-id: 57ac4994a66f9c75457cada7e85fb15da4818f3e


09 Sep, 2021 1 commit

enable black for mobile-vision · 82295dbf

Yanghan Wang authored Sep 08, 2021

Summary:
https://fb.workplace.com/groups/pythonfoundation/posts/2990917737888352

Remove `mobile-vision` from opt-out list; leaving `mobile-vision/SNPE` opted out because of 3rd-party code.

arc lint --take BLACK --apply-patches --paths-cmd 'hg files mobile-vision'

allow-large-files

Reviewed By: sstsai-adl

Differential Revision: D30721093

fbshipit-source-id: 9e5c16d988b315b93a28038443ecfb92efd18ef8


25 Jun, 2021 1 commit

use src dataset name instead of the derived class name · d4aedb83

Sam Tsai authored Jun 25, 2021

Summary: "@ [0-9]classes" is appended to dataset names to mark that a dataset is a derived class of the original one, and the resulting name is saved in the config. When reloading the config, the derived class name would be used as the source instead of the original source. Add a check to remove the derived suffix.
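
A suffix check of this kind might be sketched as follows. The exact suffix format is an assumption here (an "@", optional whitespace, digits, then "classes" at the end of the name), and `strip_derived_suffix` is a hypothetical helper name, not the function added by this commit.

```python
import re

# Assumed suffix format: "@", optional whitespace, digits, then "classes".
_DERIVED_SUFFIX = re.compile(r"@\s*\d+classes$")


def strip_derived_suffix(dataset_name):
    """Return the source dataset name with any derived-class suffix removed."""
    return _DERIVED_SUFFIX.sub("", dataset_name)


print(strip_derived_suffix("toy_train@3classes"))  # toy_train
print(strip_derived_suffix("toy_train"))           # toy_train
```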

Reviewed By: wat3rBro

Differential Revision: D29315132

fbshipit-source-id: 0cc204d305d2da6c9f1817aaf631270bd874f90d


16 Jun, 2021 1 commit

add check/filter for invalid bounding boxes · 692a4fb3

Sam Tsai authored Jun 15, 2021

Summary: Checks for invalid bounding boxes and removes them from being included.
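
The commit message does not say what counts as invalid, so the sketch below assumes one plausible rule: boxes are in `[x, y, width, height]` format and must have positive size and lie inside the image. The function names are hypothetical.

```python
def is_valid_xywh(box, image_width, image_height):
    """Assumed validity rule: positive size and fully inside the image."""
    x, y, w, h = box
    return (
        w > 0
        and h > 0
        and x >= 0
        and y >= 0
        and x + w <= image_width
        and y + h <= image_height
    )


def filter_annotations(annotations, image_width, image_height):
    """Keep only annotations whose "bbox" passes the validity check."""
    return [
        a for a in annotations
        if is_valid_xywh(a["bbox"], image_width, image_height)
    ]


annotations = [
    {"bbox": [10, 10, 20, 20]},  # valid
    {"bbox": [5, 5, 0, 10]},     # zero width
    {"bbox": [90, 90, 20, 20]},  # extends past a 100x100 image
]
kept = filter_annotations(annotations, image_width=100, image_height=100)
```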

Reviewed By: wat3rBro

Differential Revision: D28902711

fbshipit-source-id: 1f017d6ccf5c959059bcb94a09ddd81de868feed


21 May, 2021 1 commit

adding bounding box only options · 27bef8e3

Sam Tsai authored May 20, 2021

Summary: Option to change only bounding boxes; everything else remains the same.

Differential Revision: D28339388

fbshipit-source-id: 7a6d4c5153cf10c473992119f4c684e0b9159b44


07 May, 2021 1 commit

hide caffe2 related code from oss · 18dc1374

Hang Zhang authored May 07, 2021

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/59

* We have an internal dependency:
```
d2go/export/logfiledb.py", line 8, in <module>
    from mobile_cv.torch.utils_caffe2.ws_utils import ScopedWS
ModuleNotFoundError: No module named 'mobile_cv.torch'
```
This causes the unit tests to fail on GitHub:
https://github.com/facebookresearch/d2go/pull/58/checks?check_run_id=2471727763

* Use Python 3.8 because of another unit test failure on GitHub CI (`typing.final` is not available in Python 3.7):
```
from typing import final
ImportError: cannot import name 'final' from 'typing' (/usr/share/miniconda/lib/python3.7/typing.py)
```

Reviewed By: wat3rBro

Differential Revision: D28109444

fbshipit-source-id: 95e9774bdaa94f622267aeaac06d7448f37a103f


05 May, 2021 1 commit

add enlarge bounding box manipulation · e1961ad4

Sam Tsai authored May 05, 2021

Summary: Add a bounding box manipulation tool for padding bounding box data.
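
As a minimal sketch of what padding a bounding box can mean, assuming `[x, y, width, height]` boxes, a padding ratio applied on every side, and clipping to the image bounds (the function name and parameters are hypothetical, not the tool's actual interface):

```python
def enlarge_xywh(box, ratio, image_width, image_height):
    """Pad a box by `ratio` of its size on every side, clipped to the image."""
    x, y, w, h = box
    pad_w, pad_h = w * ratio, h * ratio
    x0 = max(0.0, x - pad_w)
    y0 = max(0.0, y - pad_h)
    x1 = min(float(image_width), x + w + pad_w)
    y1 = min(float(image_height), y + h + pad_h)
    return [x0, y0, x1 - x0, y1 - y0]


inner = enlarge_xywh([10, 10, 20, 20], ratio=0.5, image_width=100, image_height=100)
corner = enlarge_xywh([80, 80, 20, 20], ratio=0.5, image_width=100, image_height=100)
print(inner)   # [0.0, 0.0, 40.0, 40.0]
print(corner)  # [70.0, 70.0, 30.0, 30.0]
```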

Reviewed By: newstzpz

Differential Revision: D28082071

fbshipit-source-id: f168cae48672c4fa5c4ec98697c57ed7833787ab


30 Apr, 2021 1 commit

add keypoints metadata registry · 77ebe09f

Sam Tsai authored Apr 29, 2021

Summary:
1. Add a keypoint metadata registry for registering different keypoint metadata
2. Add option to inject_coco_dataset for adding keypoint metadata

Reviewed By: newstzpz

Differential Revision: D27730541

fbshipit-source-id: c6ba97f60664fce4dcbb0de80222df7490bc6d5d


15 Apr, 2021 1 commit

reduce memory usage and speed up TestToolsExporter · fb3ba095

Yanghan Wang authored Apr 14, 2021

Reviewed By: zhanghang1989

Differential Revision: D27783989

fbshipit-source-id: f05c11e396a2f62366721b365929b29f05d5bc02


30 Mar, 2021 1 commit

reorganize unit tests · a0658c4a

Sam Tsai authored Mar 30, 2021

Summary: Separate unit tests into individual folders based on functionality.

Reviewed By: wat3rBro

Differential Revision: D27132567

fbshipit-source-id: 9a8200be530ca14c7ef42191d59795b05b9800cc
