- 20 Oct, 2021 2 commits
-
-
Peizhao Zhang authored
Summary: Supported learnable QAT.
* Added a config key `QUANTIZATION.QAT.FAKE_QUANT_METHOD` to specify the QAT method (`default` or `learnable`).
* Added a config key `QUANTIZATION.QAT.ENABLE_LEARNABLE_OBSERVER_ITER` to specify the start iteration for learnable observers (before that, static observers are used).
* Custom quantization code needs to call `d2go.utils.qat_utils.get_qat_qconfig()` to get a proper qconfig for learnable QAT. An exception will be raised if the QAT method is learnable but no learnable observers are used in the model.
* Set the weight decay for scale/zero_point to 0 in the optimizer automatically.
* The sequence for using learnable QAT: enable static observers -> enable fake quant -> enable learnable observers -> freeze BN.
Differential Revision: D31370822 fbshipit-source-id: a5a5044a539d0d7fe1cc6b36e6821fc411ce752a
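A minimal sketch of how custom code might combine the two documented pieces above — the `get_qat_qconfig()` helper and the zero weight decay for scale/zero_point. Everything except the quoted helper and config keys (the parameter-name suffix matching in particular) is an illustrative assumption, not d2go's actual implementation:
```
# Hypothetical sketch; the suffix-based filter below is an assumption.
from d2go.utils.qat_utils import get_qat_qconfig

def qat_param_groups(cfg, model):
    # Raises for learnable QAT if the model uses no learnable observers.
    model.qconfig = get_qat_qconfig(cfg)
    scale_zp, others = [], []
    for name, p in model.named_parameters():
        # scale/zero_point become learnable parameters; exclude them from weight decay.
        (scale_zp if name.endswith(("scale", "zero_point")) else others).append(p)
    return [
        {"params": others},
        {"params": scale_zp, "weight_decay": 0.0},
    ]
```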
-
Peizhao Zhang authored
Summary: Refactored QAT-related code.
* Moved `_prepare_model_for_qat`-related code into a function.
* Moved `_setup_non_qat_to_qat_state_dict_map`-related code into a function.
* Moved QATHook-related code to the quantization file and implemented it as a class.
Differential Revision: D31370819 fbshipit-source-id: 836550b2c8d68cd93a84d5877ad9cef6f0f0eb39
-
- 15 Oct, 2021 2 commits
-
-
Peizhao Zhang authored
Summary: Supported specifying customized parameter groups from the model.
* Allow a model to specify customized parameter groups by implementing a function `model.get_optimizer_param_groups(cfg)`.
* Supported models wrapped in DDP.
Reviewed By: zhanghang1989 Differential Revision: D31289315 fbshipit-source-id: c91ba8014508e9fd5f172601b9c1c83c188338fd
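A minimal sketch of a model opting into this hook, assuming a detectron2-style `cfg` with `SOLVER.BASE_LR` and that the trainer consumes standard `torch.optim`-style param-group dicts (the exact expected return format is not spelled out in this log):
```
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(16, 16)
        self.head = nn.Linear(16, 4)

    def get_optimizer_param_groups(self, cfg):
        # Illustrative: give the head a 10x learning rate.
        return [
            {"params": list(self.backbone.parameters())},
            {"params": list(self.head.parameters()), "lr": cfg.SOLVER.BASE_LR * 10.0},
        ]
```
For DDP, the hook presumably has to be looked up on the wrapped `model.module`, which is what the "supported models wrapped in DDP" bullet suggests.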
-
Peizhao Zhang authored
Summary: Refactor for get_optimizer_param_groups.
* Split `get_default_optimizer_params()` into multiple functions:
  * `get_optimizer_param_groups_default()`
  * `get_optimizer_param_groups_lr()`
  * `get_optimizer_param_groups_weight_decay()`
* Regroup the parameters to create the minimal number of groups.
* Print all parameter groups when the optimizer is created, e.g.:
  Param group 0: {amsgrad: False, betas: (0.9, 0.999), eps: 1e-08, lr: 10.0, params: 1, weight_decay: 1.0}
  Param group 1: {amsgrad: False, betas: (0.9, 0.999), eps: 1e-08, lr: 1.0, params: 1, weight_decay: 1.0}
  Param group 2: {amsgrad: False, betas: (0.9, 0.999), eps: 1e-08, lr: 1.0, params: 2, weight_decay: 0.0}
* Add some unit tests.
Reviewed By: zhanghang1989 Differential Revision: D31287783 fbshipit-source-id: e87df0ae0e67343bb2130db945d8faced44d7411
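The regrouping idea — merging groups whose hyperparameters coincide so the optimizer sees the minimal number of groups — could look roughly like this sketch (not d2go's actual implementation):
```
from collections import defaultdict

def regroup_param_groups(param_groups):
    # Merge parameter groups that share identical hyperparameters.
    merged = defaultdict(list)
    for group in param_groups:
        # Key on every option except the parameter list itself.
        key = tuple(sorted((k, v) for k, v in group.items() if k != "params"))
        merged[key].extend(group["params"])
    return [{**dict(key), "params": params} for key, params in merged.items()]
```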
-
- 06 Oct, 2021 1 commit
-
-
Supriya Rao authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/124 Update callsites from torch.quantization to torch.ao.quantization Reviewed By: z-a-f, jerryzh168 Differential Revision: D31286125 fbshipit-source-id: ef24ca87d8db398c65bb5b89f035afe0423a5685
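The migration is mechanical: the same symbols move under the `ao` namespace. A representative import (the commit touches many callsites):
```
# Before
from torch.quantization import get_default_qat_qconfig
# After: same API under the torch.ao.quantization namespace
from torch.ao.quantization import get_default_qat_qconfig
```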
-
- 24 Sep, 2021 2 commits
-
-
Hang Zhang authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/117 Fix GitHub CI failure due to the lack of the COCO dataset. It was caused by D31134064 (https://github.com/facebookresearch/d2go/commit/f018d4a7ceef437d8fc3ca8b2bba4b7321917e06) Reviewed By: mattcyu1, wat3rBro Differential Revision: D31179666 fbshipit-source-id: fe25129d167afcdcb577e5c8d82f3432ba939ca9
-
Yanghan Wang authored
Reviewed By: zhanghang1989 Differential Revision: D31134064 fbshipit-source-id: 825ca14477243a53f84b8521f4430a2b080324bd
-
- 15 Sep, 2021 1 commit
-
-
Valentin Andrei authored
Reviewed By: stephenyan1231, zhanghang1989 Differential Revision: D30903817 fbshipit-source-id: 578e6b02a1bd59b1bd841399fc60111d320ae9aa
-
- 09 Sep, 2021 1 commit
-
-
Yanghan Wang authored
Summary: https://fb.workplace.com/groups/pythonfoundation/posts/2990917737888352 Remove `mobile-vision` from the opt-out list; leaving `mobile-vision/SNPE` opted out because of 3rd-party code. `arc lint --take BLACK --apply-patches --paths-cmd 'hg files mobile-vision'` allow-large-files Reviewed By: sstsai-adl Differential Revision: D30721093 fbshipit-source-id: 9e5c16d988b315b93a28038443ecfb92efd18ef8
-
- 31 Aug, 2021 1 commit
-
-
Yanghan Wang authored
Summary: Enable inference for boltnn (via running torchscript).
- Merge rcnn's boltnn test with the other export types.
- Misc fixes.
Differential Revision: D30610386 fbshipit-source-id: 7b78136f8ca640b5fc179cb47e3218e709418d71
-
- 18 Aug, 2021 2 commits
-
-
Siddharth Shah authored
Summary: A batched torch implementation lets us avoid the CPU <-> GPU copy, saving ~200ms per iteration. The new version of generating the boundary weight mask produces identical masks. Reviewed By: wat3rBro Differential Revision: D30176412 fbshipit-source-id: 877f4c6337e7870d3bafd8eb9157ac166ddd588a
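The actual mask code isn't shown in this log, but the batched idea is roughly to compute boundaries by comparing shifted label maps entirely on the labels' device instead of looping on CPU. A toy sketch under those assumptions:
```
import torch

def boundary_mask(labels: torch.Tensor) -> torch.Tensor:
    # labels: (N, H, W) integer label map; returns a (N, H, W) bool mask that
    # marks pixels whose right/bottom neighbor carries a different label.
    boundary = torch.zeros_like(labels, dtype=torch.bool)
    boundary[:, :, :-1] |= labels[:, :, 1:] != labels[:, :, :-1]  # horizontal edges
    boundary[:, :-1, :] |= labels[:, 1:, :] != labels[:, :-1, :]  # vertical edges
    return boundary
```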
-
Valentin Andrei authored
Summary: Added a multi-tensor optimizer implementation for SGD, from `torch.optim._multi_tensor`. It can potentially provide a ~5% QPS improvement by using the `foreach` API to speed up the optimizer step. Using it is optional: enable it from the configuration file by specifying `SGD_MT` in the `SOLVER.OPTIMIZER` setting. Reviewed By: zhanghang1989 Differential Revision: D30377761 fbshipit-source-id: 06107f1b91e9807c1db5d1b0ca6be09fcbb13e67
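A sketch of how the optional path might be selected from the config setting named above (`build_optimizer` and the fallback logic are illustrative; `torch.optim._multi_tensor` was a private module in PyTorch releases of that era):
```
import torch

def build_optimizer(cfg, params):
    # Sketch: SOLVER.OPTIMIZER == "SGD_MT" opts into the multi-tensor ("foreach")
    # implementation; anything else falls back to the standard SGD.
    if cfg.SOLVER.OPTIMIZER == "SGD_MT":
        from torch.optim._multi_tensor import SGD  # private API at the time
    else:
        from torch.optim import SGD
    return SGD(params, lr=cfg.SOLVER.BASE_LR, momentum=cfg.SOLVER.MOMENTUM)
```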
-
- 17 Aug, 2021 1 commit
-
-
Siddharth Shah authored
Summary: The uint8 cast means that the floating-point non_bd_weight is never assigned. Reviewed By: wat3rBro Differential Revision: D30176377 fbshipit-source-id: 013602bb4313393f220ee0f1510bf1ff83bd56fc
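A toy illustration of this class of bug (not the original code): copying float values into a uint8 tensor silently truncates them, so a fractional weight like 0.3 becomes 0 and is effectively never applied.
```
import torch

weights = torch.full((4,), 0.3)            # floating-point non-boundary weights
mask = torch.zeros(4, dtype=torch.uint8)   # the buggy uint8 cast
mask.copy_(weights)                        # values truncate to 0 on copy
print(mask)                                # tensor([0, 0, 0, 0], dtype=torch.uint8)
```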
-
- 16 Aug, 2021 1 commit
-
-
Hang Zhang authored
Summary: Add FBNAS toolkit for HPO in D2Go (https://github.com/facebookresearch/d2go/commit/adf223bdac5b534514a8ba80f6bd61fc9dd8b464) Reviewed By: newstzpz Differential Revision: D28672821 fbshipit-source-id: 6a378af2bb43ef6cb556d4158fd1b0d3e363e956
-
- 27 Jun, 2021 1 commit
-
-
Yuxin Wu authored
Reviewed By: zhanghang1989 Differential Revision: D29379832 fbshipit-source-id: 9283a8796a1dbee81b51611407c22f7d5a2069dc
-
- 25 Jun, 2021 1 commit
-
-
Sam Tsai authored
Summary: "@ [0-9]classes" is appended to datasets to mark whether it is a derived class of the original one and saved as a config. When reloading the config, the derived class name will be used as the source instead of the original source. Adding a check to remove the derived suffix. Reviewed By: wat3rBro Differential Revision: D29315132 fbshipit-source-id: 0cc204d305d2da6c9f1817aaf631270bd874f90d
-
- 21 Jun, 2021 1 commit
-
-
Yuxin Wu authored
Summary:
1. Save 3 versions of the flop count, using both mobile_cv's flop counter and fvcore's flop counter.
2. Print only a simple short table in the terminal, but save the others to files.
The `print_flops` function does not seem to be used anywhere, so this diff just replaces it. TODO: enable this feature automatically for train/eval workflows in the next diff. Reviewed By: zhanghang1989 Differential Revision: D29182412 fbshipit-source-id: bfa1dfad41b99fcda06b96c4732237b5e753f1bb
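For the fvcore side, the short-table-plus-file split might look like the following sketch (the stand-in model and file name are assumptions; the formats actually saved by the diff are not specified here):
```
import torch
from fvcore.nn import FlopCountAnalysis, flop_count_table

model = torch.nn.Linear(64, 64)               # stand-in model
flops = FlopCountAnalysis(model, torch.randn(1, 64))
print(flop_count_table(flops))                # short table for the terminal
with open("flops_by_module.txt", "w") as f:   # fuller breakdown saved to a file
    f.write(str(flops.by_module()))
```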
-
- 16 Jun, 2021 1 commit
-
-
Sam Tsai authored
Summary: Checks for invalid bounding boxes and removes them from being included. Reviewed By: wat3rBro Differential Revision: D28902711 fbshipit-source-id: 1f017d6ccf5c959059bcb94a09ddd81de868feed
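What counts as "invalid" isn't spelled out in this log; a plausible minimal check, assuming XYXY boxes, is non-positive width or height:
```
def filter_invalid_boxes(annotations):
    # Keep only annotations whose box has positive width and height (XYXY assumed).
    valid = []
    for ann in annotations:
        x0, y0, x1, y1 = ann["bbox"]
        if x1 > x0 and y1 > y0:
            valid.append(ann)
    return valid
```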
-
- 14 Jun, 2021 1 commit
-
-
Yanghan Wang authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/83
- Implement `prepare_for_export` for `SemanticSegmentor`
- Add a unit test checking numerical matching
Reviewed By: zhanghang1989 Differential Revision: D29088421 fbshipit-source-id: ccb86ac4b4b90a63eeebdbf76b2bf31c1da65a8b
-
- 01 Jun, 2021 1 commit
-
-
Yanghan Wang authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/77
- Reimplement `get_cfg_diff_table` by reusing other utils
- Add a `reorder` option for `flatten_config_dict`
- Remove the legacy BC support for `ARCH_DEF`, including `str_wrap_fbnet_arch_def` and the customized `merge_from_other_cfg`.
- Move `temp_defrost` from `utils.py` to `config.py`; this way there's no more namespace forwarding for `utils.py`
- Merge `test_config_utils.py` and `test_configs.py`
Reviewed By: zhanghang1989 Differential Revision: D28734493 fbshipit-source-id: 925f5944cf0e9019e4c54462e851ea16a5c94b8c
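For reference, a generic sketch of what a `flatten_config_dict` with a `reorder` flag typically does — turn a nested config into dotted keys, optionally sorted (d2go's actual signature may differ):
```
def flatten_config_dict(d, prefix="", reorder=True):
    # {"MODEL": {"WEIGHTS": "x"}} -> {"MODEL.WEIGHTS": "x"}
    out = {}
    for k, v in d.items():
        key = f"{prefix}.{k}" if prefix else str(k)
        if isinstance(v, dict):
            out.update(flatten_config_dict(v, key, reorder=False))
        else:
            out[key] = v
    return dict(sorted(out.items())) if reorder else out
```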
-
- 25 May, 2021 2 commits
-
-
Yanghan Wang authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/75 Refactor the base test case:
- make test_dir valid throughout the test (rather than under a local context), so individual tests can load back the exported model
- refactor the `custom_setup_test` for easier override
- move parameterized into the base class to avoid copying the naming function
Reviewed By: zhanghang1989 Differential Revision: D28651067 fbshipit-source-id: c59a311564f6114039e20ed3a23e5dd9c84f4ae4
-
Kai Zhang authored
Summary: Currently when launching a training flow, we read the number of processes from resources.num_gpus. To be backward compatible with existing D2Go (https://github.com/facebookresearch/d2go/commit/f82d44d3c33e6c781a3c6f2b27b376fdfbaeda53) training configs, this diff changes it to dist_config.num_processes_per_machine instead. Reviewed By: wat3rBro Differential Revision: D28630334 fbshipit-source-id: 3c684cd56e5d2e247c7b82e1d1eeff0f39e59ee4
-
- 22 May, 2021 1 commit
-
-
Yanghan Wang authored
Differential Revision: D27881742 (https://github.com/facebookresearch/d2go/commit/90aff5daf608473dd312b300db8615326fa40a37) Original commit changeset: 34a3ab7a88f4 fbshipit-source-id: 42c03b4f2b69c656b26774a4665b84b832262650
-
- 21 May, 2021 2 commits
-
-
Sanjeev Kumar authored
Summary:
- Enable SDK inference config specification in the export step. This enables adding the SDK configuration as part of the model file in the export step. The SDK config can be specified as inference_config.yaml and is zipped together with the torchscript model. The main goal of the SDK configuration is to control the model's inference behavior alongside the model.
- SDK inference config design doc: https://docs.google.com/document/d/1j5qx8IrnFg1DJFzTnu4W8WmXFYJ-AgCDfSQHb2ACJsk/edit
- A one-click fblearner pipeline is in the next diff on the stack
Differential Revision: D27881742 fbshipit-source-id: 34a3ab7a88f456b74841cf671ea1b3f678cdb733
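Bundling the config with the model could be as simple as the following sketch (the config file name follows the description above; the archive layout and the model file name inside it are assumptions about the real export step):
```
import zipfile

def package_model(torchscript_path, inference_config_path, out_path):
    # Zip the exported torchscript model together with inference_config.yaml.
    with zipfile.ZipFile(out_path, "w") as zf:
        zf.write(torchscript_path, arcname="model.jit")          # assumed name
        zf.write(inference_config_path, arcname="inference_config.yaml")
```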
-
Sam Tsai authored
Summary: Add an option to change only the bounding boxes, leaving everything else unchanged. Differential Revision: D28339388 fbshipit-source-id: 7a6d4c5153cf10c473992119f4c684e0b9159b44
-
- 17 May, 2021 1 commit
-
-
Kai Zhang authored
Summary: Add dataset visualization so that we can visualize test results in Tensorboard. Reviewed By: zhanghang1989 Differential Revision: D28457363 fbshipit-source-id: 4c2fd9dce349c6fb9e1cec51c9138cf0abb45d7b
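The mechanics behind such visualization usually come down to writing image summaries — a sketch with a stand-in image tensor (the tag name and log directory are illustrative, not d2go's actual choices):
```
import torch
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="tb_logs")
image = torch.rand(3, 224, 224)  # stand-in for a visualized test result (CHW)
writer.add_image("eval/sample_0", image, global_step=0)
writer.close()
```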
-
- 12 May, 2021 1 commit
-
-
Luis Perez authored
Synchronize PyTorchLightning/pytorch-lightning (revision 7b283e3c@master) to github/third-party/PyTorchLightning/pytorch-lightning Summary:
# Manual
- remove fixme's in `model_checkpoint.py`, `parameter_monitor.py`, `test_quantization.py`, and `speed_monitor.py` now that `Trainer` is properly annotated.
- update `test_quantization.py` to use `trainer.train_loop.global_step` instead of `trainer.global_step`, which is read-only.
- update `loop_callback.py` to read `batch_idx` from `train_loop` (it is no longer available elsewhere).
# Automatic
### New commit log messages
7b283e3c Bugfix/Multiple dataloaders (#7433)
d7c44cc6 Docs: sync chlog 1.3.1 (#7478)
fdf50a5e Mark certain Trainer APIs as protected (#7420)
ad9118f0 remove trainer hidden state | sanity refactor [1 / n] (#7437)
4a1134db Log epoch metrics before firing the `on_evaluation_end` hook (#7272)
b65ae794 Automatically check `DataModule.has_{setup,teardown,prepare_data}` [2/2] (#7238)
8660d8cf [pre-commit.ci] pre-commit autoupdate (#7475)
f6fe715e Fix Sphinx argument deprecation (#7464)
Reviewed By: shuyingsunshine21 Differential Revision: D28353491 fbshipit-source-id: 98b87d99e2f09b47b07270858fcbdb5d5299730b
-
- 07 May, 2021 1 commit
-
-
Hang Zhang authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/59
* We have an internal dependency:
```
d2go/export/logfiledb.py", line 8, in <module>
    from mobile_cv.torch.utils_caffe2.ws_utils import ScopedWS
ModuleNotFoundError: No module named 'mobile_cv.torch'
```
This causes the unittest failure on GitHub: https://github.com/facebookresearch/d2go/pull/58/checks?check_run_id=2471727763
* Use Python 3.8 because of another unittest failure on GitHub CI:
```
from typing import final
ImportError: cannot import name 'final' from 'typing' (/usr/share/miniconda/lib/python3.7/typing.py)
```
Reviewed By: wat3rBro Differential Revision: D28109444 fbshipit-source-id: 95e9774bdaa94f622267aeaac06d7448f37a103f
-
- 05 May, 2021 1 commit
-
-
Sam Tsai authored
Summary: Add a bounding box manipulation tool for padding bounding box data. Reviewed By: newstzpz Differential Revision: D28082071 fbshipit-source-id: f168cae48672c4fa5c4ec98697c57ed7833787ab
-
- 04 May, 2021 1 commit
-
-
Yanghan Wang authored
Reviewed By: newstzpz Differential Revision: D27747996 fbshipit-source-id: 6ae3b89c3944098828e246e5a4a89209b8e171a1
-
- 30 Apr, 2021 1 commit
-
-
Sam Tsai authored
Summary:
1. Add a keypoint metadata registry for registering different keypoint metadata
2. Add an option to `inject_coco_dataset` for adding keypoint metadata
Reviewed By: newstzpz Differential Revision: D27730541 fbshipit-source-id: c6ba97f60664fce4dcbb0de80222df7490bc6d5d
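A registry here is essentially a name-to-metadata mapping populated by a decorator — a sketch in the style common across d2go/mobile_cv (the registry name, entry name, and metadata fields below are assumptions):
```
KEYPOINT_METADATA_REGISTRY = {}  # hypothetical plain-dict stand-in

def register_keypoint_metadata(name):
    def deco(fn):
        KEYPOINT_METADATA_REGISTRY[name] = fn
        return fn
    return deco

@register_keypoint_metadata("person_keypoints")  # hypothetical entry
def person_keypoint_metadata():
    return {"keypoint_names": ("nose", "left_eye", "right_eye")}  # truncated example
```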
-
- 28 Apr, 2021 1 commit
-
-
Ananth Subramaniam authored
Synchronize PyTorchLightning/pytorch-lightning (revision 7fe8d184@master) to github/third-party/PyTorchLightning/pytorch-lightning Summary:
### New commit log messages
7fe8d184 Do not `shuffle` in `LightningDataModule.from_datasets` for `IterableDataset` (#7053)
bab72255 [fix] Add barriers before and after setup hook is run (#7202)
f920ba29 [bugfix] Metric not logged properly in manual optimization (#7228)
e147127c [feat] Add better support for predict + ddp 2/3 (#7215)
ca6c87ff Add back `clip_gradients(model)` (#7231)
3b36d81c Fixed `num_sanity_val_steps` affecting reproducibility of training data shuffling (#7014)
5cf9afa1 Add fairscale install msg for Sharded Plugins (#7213)
52a5cee0 Set smarter default for DDP sharded for performance optimization (#6937)
dd5ec75e Deprecate save_function from model checkpoint callback (#7201)
ac7d6a35 Fix `NeptuneLogger.log_text(step=None)` (#7194)
6be0a859 Update teardown for TPU acc (#7211)
bc3f08b0 [fix] Add barrier to accelerator's teardown (#6814)
68eac4d9 Enforce Lightning module as source of truth for automatic optimization (#7130)
44d775fc Update Error message for ProfileConnector (#7204)
31fcd7d0 Deprecate write_predictions on the LightningModule (#7066)
591b9cee make bug_report_model minimal (#7191)
b3fe8366 Move metrics_to_scalars to a dedicated utilities file (#7180)
f58865aa Properly set `LightningModule.device` after model replacement (#7188)
8439aead Update FairScale on CI (#7017)
92af3632 Fix `lr_finder` suggesting too high learning rates (#7076)
d534e53e add missing predict docs (#7150)
Reviewed By: kazhang Differential Revision: D28032962 fbshipit-source-id: 18cd01e8ecc13fe25f0890ac0f4b20c3c3e1fed3
-
- 21 Apr, 2021 1 commit
-
-
Kai Zhang authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/46 As titled. The test is flaky because the tensorboard logger might still be writing to the temporary folder when we tear it down. Reviewed By: ananthsub Differential Revision: D27844504 fbshipit-source-id: 3987f9ec3cd05b2f193e75cd4d85109a46f4ee71
-
- 20 Apr, 2021 1 commit
-
-
Kai Zhang authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/49 Reviewed By: wat3rBro Differential Revision: D27875007 fbshipit-source-id: 2f61a4a3de29f3583a54adc914ee5a7eb605a823
-
- 19 Apr, 2021 1 commit
-
-
Peizhao Zhang authored
Summary: Added a registry to register functions that add hooks for training.
* TRAINER_HOOKS_REGISTRY: a list of functions to add hooks for the trainer; all functions in the registry will be called to add hooks.
* Each function has the signature `func(hooks: List[HookBase]) -> None`.
Reviewed By: zhanghang1989 Differential Revision: D27560806 fbshipit-source-id: fcfa02623bfd08508b6083db2d318d08f7e3c0b8
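A minimal sketch of what a registered function could look like, following the `func(hooks: List[HookBase]) -> None` signature above (the hook body and the registration call are illustrative assumptions):
```
from typing import List
from detectron2.engine import HookBase

def add_my_hooks(hooks: List[HookBase]) -> None:
    class LogIterHook(HookBase):
        def after_step(self):
            # `self.trainer` is set by the trainer when the hook is registered.
            if self.trainer.iter % 100 == 0:
                print(f"iter {self.trainer.iter}")

    hooks.append(LogIterHook())

# Presumably registered via something like:
# TRAINER_HOOKS_REGISTRY.register(add_my_hooks)
```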
-
- 17 Apr, 2021 1 commit
-
-
Kai Zhang authored
Summary: Delegate FX quantization callback's customization to model. Reviewed By: wat3rBro Differential Revision: D27669212 fbshipit-source-id: 2715546cf03134896da6f95ecddaf8503ff95d0b
-
- 15 Apr, 2021 1 commit
-
-
Yanghan Wang authored
Reviewed By: zhanghang1989 Differential Revision: D27783989 fbshipit-source-id: f05c11e396a2f62366721b365929b29f05d5bc02
-
- 14 Apr, 2021 1 commit
-
-
Ananth Subramaniam authored
Synchronize PyTorchLightning/pytorch-lightning (revision 0b843848@master) to github/third-party/PyTorchLightning/pytorch-lightning Summary:
### New commit log messages
## [UnReleased] - 2021-MM-DD
### Added
- Added more explicit exception message when trying to execute `trainer.test()` or `trainer.validate()` with `fast_dev_run=True` ([#6667](https://github.com/PyTorchLightning/pytorch-lightning/pull/6667))
- Added `LightningCLI` class to provide simple reproducibility with minimum boilerplate training cli. ([#4492](https://github.com/PyTorchLightning/pytorch-lightning/pull/4492))
- Trigger warning when non-metric logged value with multi processes hasn't been reduced ([#6417](https://github.com/PyTorchLightning/pytorch-lightning/pull/6417))
- Added `gradient_clip_algorithm` argument to Trainer for gradient clipping by value ([#6123](https://github.com/PyTorchLightning/pytorch-lightning/pull/6123)).
- Added a way to print to terminal without breaking up the progress bar ([#5470](https://github.com/PyTorchLightning/pytorch-lightning/pull/5470))
- Added support to checkpoint after training steps in `ModelCheckpoint` callback ([#6146](https://github.com/PyTorchLightning/pytorch-lightning/pull/6146))
- Added `checkpoint` parameter to callback's `on_save_checkpoint` hook ([#6072](https://github.com/PyTorchLightning/pytorch-lightning/pull/6072))
- Added `RunningStage.SANITY_CHECKING` ([#4945](https://github.com/PyTorchLightning/pytorch-lightning/pull/4945))
- Added `TrainerState.{FITTING,VALIDATING,TESTING,PREDICTING,TUNING}` ([#4945](https://github.com/PyTorchLightning/pytorch-lightning/pull/4945))
- Added `Trainer.validate()` method to perform one evaluation epoch over the validation set ([#4948](https://github.com/PyTorchLightning/pytorch-lightning/pull/4948))
- Added `LightningEnvironment` for Lightning-specific DDP ([#5915](https://github.com/PyTorchLightning/pytorch-lightning/pull/5915))
- Added `teardown()` hook to LightningDataModule ([#4673](https://github.com/PyTorchLightning/pytorch-lightning/pull/4673))
- Added `auto_insert_metric_name` parameter to `ModelCheckpoint` ([#6277](https://github.com/PyTorchLightning/pytorch-lightning/pull/6277))
- Added arg to `self.log` that enables users to give custom names when dealing with multiple dataloaders ([#6274](https://github.com/PyTorchLightning/pytorch-lightning/pull/6274))
- Added `teardown` method to `BaseProfiler` to enable subclasses defining post-profiling steps outside of `__del__` ([#6370](https://github.com/PyTorchLightning/pytorch-lightning/pull/6370))
- Added `setup` method to `BaseProfiler` to enable subclasses defining pre-profiling steps for every process ([#6633](https://github.com/PyTorchLightning/pytorch-lightning/pull/6633))
- Added no return warning to predict ([#6139](https://github.com/PyTorchLightning/pytorch-lightning/pull/6139))
- Added `Trainer.predict` config validation ([#6543](https://github.com/PyTorchLightning/pytorch-lightning/pull/6543))
- Added `AbstractProfiler` interface ([#6621](https://github.com/PyTorchLightning/pytorch-lightning/pull/6621))
- Added support for including module names for forward in the autograd trace of `PyTorchProfiler` ([#6349](https://github.com/PyTorchLightning/pytorch-lightning/pull/6349))
- Added support for the PyTorch 1.8.1 autograd profiler ([#6618](https://github.com/PyTorchLightning/pytorch-lightning/pull/6618))
- Added `outputs` parameter to callback's `on_validation_epoch_end` & `on_test_epoch_end` hooks ([#6120](https://github.com/PyTorchLightning/pytorch-lightning/pull/6120))
- Added `configure_sharded_model` hook ([#6679](https://github.com/PyTorchLightning/pytorch-lightning/pull/6679))
- Added support for `precision=64`, enabling training with double precision ([#6595](https://github.com/PyTorchLightning/pytorch-lightning/pull/6595))
- Added support for DDP communication hooks ([#6736](https://github.com/PyTorchLightning/pytorch-lightning/issues/6736))
- Added `artifact_location` argument to `MLFlowLogger` which will be passed to the `MlflowClient.create_experiment` call ([#6677](https://github.com/PyTorchLightning/pytorch-lightning/pull/6677))
- Added `model` parameter to precision plugins' `clip_gradients` signature ([#6764](https://github.com/PyTorchLightning/pytorch-lightning/pull/6764))
### Changed
- Renamed `pytorch_lightning.callbacks.swa` to `pytorch_lightning.callbacks.stochastic_weight_avg` ([#6259](https://github.com/PyTorchLightning/pytorch-lightning/pull/6259))
- Refactor `RunningStage` and `TrainerState` usage ([#4945](https://github.com/PyTorchLightning/pytorch-lightning/pull/4945))
- Changed `trainer.evaluating` to return `True` if validating or testing ([#4945](https://github.com/PyTorchLightning/pytorch-lightning/pull/4945))
- Changed `setup()` and `teardown()` stage argument to take any of `{fit,validate,test,predict}` ([#6386](https://github.com/PyTorchLightning/pytorch-lightning/pull/6386))
- Changed profilers to save separate report files per state and rank ([#6621](https://github.com/PyTorchLightning/pytorch-lightning/pull/6621))
- Changed `PyTorchProfiler` to use `torch.autograd.profiler.record_function` to record functions ([#6349](https://github.com/PyTorchLightning/pytorch-lightning/pull/6349))
### Deprecated
- `period` has been deprecated in favor of `every_n_val_epochs` in the `ModelCheckpoint` callback ([#6146](https://github.com/PyTorchLightning/pytorch-lightning/pull/6146))
- Deprecated `trainer.running_sanity_check` in favor of `trainer.sanity_checking` ([#4945](https://github.com/PyTorchLightning/pytorch-lightning/pull/4945))
- Deprecated `Profiler(output_filename)` in favor of `dirpath` and `filename` ([#6621](https://github.com/PyTorchLightning/pytorch-lightning/pull/6621))
- Deprecated `PytorchProfiler(profiled_functions)` in favor of `record_functions` ([#6349](https://github.com/PyTorchLightning/pytorch-lightning/pull/6349))
- Deprecated metrics in favor of `torchmetrics` ([#6505](https://github.com/PyTorchLightning/pytorch-lightning/pull/6505), [#6530](https://github.com/PyTorchLightning/pytorch-lightning/pull/6530), [#6540](https://github.com/PyTorchLightning/pytorch-lightning/pull/6540), [#6547](https://github.com/PyTorchLightning/pytorch-lightning/pull/6547), [#6515](https://github.com/PyTorchLightning/pytorch-lightning/pull/6515), [#6572](https://github.com/PyTorchLightning/pytorch-lightning/pull/6572), [#6573](https://github.com/PyTorchLightning/pytorch-lightning/pull/6573), [#6584](https://github.com/PyTorchLightning/pytorch-lightning/pull/6584), [#6636](https://github.com/PyTorchLightning/pytorch-lightning/pull/6636), [#6637](https://github.com/PyTorchLightning/pytorch-lightning/pull/6637), [#6649](https://github.com/PyTorchLightning/pytorch-lightning/pull/6649), [#6659](https://github.com/PyTorchLightning/pytorch-lightning/pull/6659))
### Removed
- Removed support for passing a bool value to `profiler` argument of Trainer ([#6164](https://github.com/PyTorchLightning/pytorch-lightning/pull/6164))
- Removed no return warning from val/test step ([#6139](https://github.com/PyTorchLightning/pytorch-lightning/pull/6139))
- Removed passing a `ModelCheckpoint` instance to `Trainer(checkpoint_callback)` ([#6166](https://github.com/PyTorchLightning/pytorch-lightning/pull/6166))
- Removed deprecated Trainer argument `enable_pl_optimizer` and `automatic_optimization` ([#6163](https://github.com/PyTorchLightning/pytorch-lightning/pull/6163))
- Removed deprecated metrics ([#6161](https://github.com/PyTorchLightning/pytorch-lightning/pull/6161))
  * from `pytorch_lightning.metrics.functional.classification` removed `to_onehot`, `to_categorical`, `get_num_classes`, `roc`, `multiclass_roc`, `average_precision`, `precision_recall_curve`, `multiclass_precision_recall_curve`
  * from `pytorch_lightning.metrics.functional.reduction` removed `reduce`, `class_reduce`
- Removed deprecated `ModelCheckpoint` arguments `prefix`, `mode="auto"` ([#6162](https://github.com/PyTorchLightning/pytorch-lightning/pull/6162))
- Removed `mode='auto'` from `EarlyStopping` ([#6167](https://github.com/PyTorchLightning/pytorch-lightning/pull/6167))
- Removed legacy references for magic keys in the `Result` object ([#6016](https://github.com/PyTorchLightning/pytorch-lightning/pull/6016))
- Removed deprecated `LightningModule` `hparams` setter ([#6207](https://github.com/PyTorchLightning/pytorch-lightning/pull/6207))
- Removed legacy code to log or include metrics in the progress bar by returning them in a dict with the `"log"/"progress_bar"` magic keys. Use `self.log` instead ([#6734](https://github.com/PyTorchLightning/pytorch-lightning/pull/6734))
- Removed `optimizer_idx` argument from `training_step` in manual optimization ([#6093](https://github.com/PyTorchLightning/pytorch-lightning/pull/6093))
### Fixed
- Set better defaults for `rank_zero_only.rank` when training is launched with SLURM and torchelastic ([#6802](https://github.com/PyTorchLightning/pytorch-lightning/pull/6802/))
- Made the `Plugin.reduce` method more consistent across all Plugins to reflect a mean-reduction by default ([#6011](https://github.com/PyTorchLightning/pytorch-lightning/pull/6011))
- Move lightning module to correct device type when using LightningDistributedWrapper ([#6070](https://github.com/PyTorchLightning/pytorch-lightning/pull/6070))
- Do not print top-k verbose log with `ModelCheckpoint(monitor=None)` ([#6109](https://github.com/PyTorchLightning/pytorch-lightning/pull/6109))
- Fixed csv extension check ([#6436](https://github.com/PyTorchLightning/pytorch-lightning/pull/6436))
- Fixed `ModelCheckpoint(monitor=None, save_last=True)` not saving checkpoints ([#6136](https://github.com/PyTorchLightning/pytorch-lightning/pull/6136))
- Fixed `ModelCheckpoint(save_top_k=0, save_last=True)` not saving the `last` checkpoint ([#6136](https://github.com/PyTorchLightning/pytorch-lightning/pull/6136))
- Fixed `.teardown(stage='fit')` getting called during `trainer.test` ([#6386](https://github.com/PyTorchLightning/pytorch-lightning/pull/6386))
- Fixed `.on_fit_{start,end}()` getting called during `trainer.test` ([#6386](https://github.com/PyTorchLightning/pytorch-lightning/pull/6386))
- Fixed LightningModule `all_gather` on cpu tensors ([#6416](https://github.com/PyTorchLightning/pytorch-lightning/pull/6416))
- Fixed torch distributed not available in setup hook for DDP ([#6506](https://github.com/PyTorchLightning/pytorch-lightning/pull/6506))
- Fixed `EarlyStopping` logic when `min_epochs` or `min_steps` requirement is not met ([#6705](https://github.com/PyTorchLightning/pytorch-lightning/pull/6705))
## [1.2.7] - 2021-04-06
### Fixed
- Fixed resolve a bug with omegaconf and xm.save ([#6741](https://github.com/PyTorchLightning/pytorch-lightning/pull/6741))
- Fixed an issue with IterableDataset when __len__ is not defined ([#6828](https://github.com/PyTorchLightning/pytorch-lightning/pull/6828))
- Sanitize None params during pruning ([#6836](https://github.com/PyTorchLightning/pytorch-lightning/pull/6836))
- Enforce an epoch scheduler interval when using SWA ([#6588](https://github.com/PyTorchLightning/pytorch-lightning/pull/6588))
- Fixed TPU Colab hang issue, post training ([#6816](https://github.com/PyTorchLightning/pytorch-lightning/pull/6816))
- Fixed a bug where `TensorBoardLogger` would give a warning and not log correctly to a symbolic link `save_dir` ([#6730](https://github.com/PyTorchLightning/pytorch-lightning/pull/6730))
## [1.2.6] - 2021-03-30
### Changed
- Changed the behavior of `on_epoch_start` to run at the beginning of validation & test epoch ([#6498](https://github.com/PyTorchLightning/pytorch-lightning/pull/6498))
### Removed
- Removed legacy code to include `step` dictionary returns in `callback_metrics`. Use `self.log_dict` instead. ([#6682](https://github.com/PyTorchLightning/pytorch-lightning/pull/6682))
### Fixed
- Fixed `DummyLogger.log_hyperparams` raising a `TypeError` when running with `fast_dev_run=True` ([#6398](https://github.com/PyTorchLightning/pytorch-lightning/pull/6398))
- Fixed error on TPUs when there was no `ModelCheckpoint` ([#6654](https://github.com/PyTorchLightning/pytorch-lightning/pull/6654))
- Fixed `trainer.test` freeze on TPUs ([#6654](https://github.com/PyTorchLightning/pytorch-lightning/pull/6654))
- Fixed a bug where gradients were disabled after calling `Trainer.predict` ([#6657](https://github.com/PyTorchLightning/pytorch-lightning/pull/6657))
- Fixed bug where no TPUs were detected in a TPU pod env ([#6719](https://github.com/PyTorchLightning/pytorch-lightning/pull/6719))
## [1.2.5] - 2021-03-23
### Changed
- Update Gradient Clipping for the TPU Accelerator ([#6576](https://github.com/PyTorchLightning/pytorch-lightning/pull/6576))
- Refactored setup for typing friendly ([#6590](https://github.com/PyTorchLightning/pytorch-lightning/pull/6590))
### Fixed
- Fixed a bug where `all_gather` would not work correctly with `tpu_cores=8` ([#6587](https://github.com/PyTorchLightning/pytorch-lightning/pull/6587))
- Fixed comparing required versions ([#6434](https://github.com/PyTorchLightning/pytorch-lightning/pull/6434))
- Fixed duplicate logs appearing in console when using the python logging module ([#6275](https://github.com/PyTorchLightning/pytorch-lightning/pull/6275))
- Added Autocast in validation, test and predict modes for Native AMP ([#6565](https://github.com/PyTorchLightning/pytorch-lightning/pull/6565))
Reviewed By: shuyingsunshine21 Differential Revision: D27528929 fbshipit-source-id: 311c88f71461c2c79bbf185e28d7a6d683ccc26f
-
- 09 Apr, 2021 2 commits
-
-
Ananth Subramaniam authored
Summary: `checkpoint_callback` now only accepts boolean values: https://github.com/PyTorchLightning/pytorch-lightning/blob/19e67d18c472c3a03dec4dd9bfcef031e9ca8719/pytorch_lightning/trainer/connectors/callback_connector.py#L65-L73 Reviewed By: shuyingsunshine21 Differential Revision: D27682178 fbshipit-source-id: 9e863aad7a23a76dee8ae5df9f5a78e7a94bfe8a
-
Ananth Subramaniam authored
Summary:
Before: this test would assume only 2 checkpoints were stored: `last.ckpt` and `FINAL_MODEL_CKPT`.
Now: this test asserts that at least these 2 checkpoints are stored. In case the config specifies `save_top_k=-1`, for instance, we'd save more checkpoints, causing this test to fail.
Since this test only loads the last and the final outputs, I'm changing the behavior to assert that these checkpoints must be saved, ignoring other checkpoint files that could be generated. Reviewed By: kazhang Differential Revision: D27671284 fbshipit-source-id: 0419fb46856d048e7b6eba3ff1dc65b7280a9a90
-