- 17 May, 2021 1 commit
Kai Zhang authored
Summary: Add dataset visualization so that we can visualize test results in TensorBoard.

Reviewed By: zhanghang1989

Differential Revision: D28457363

fbshipit-source-id: 4c2fd9dce349c6fb9e1cec51c9138cf0abb45d7b
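As a rough illustration of the kind of logging this adds (a minimal sketch using `torch.utils.tensorboard`; the tag names and log directory are illustrative, not the ones in this diff):

```python
import torch
from torch.utils.tensorboard import SummaryWriter

# Illustrative only: write a few per-sample visualizations so they show up
# under the "Images" tab in TensorBoard.
writer = SummaryWriter(log_dir="./tb_logs/test_vis")
for step in range(4):
    image = torch.rand(3, 64, 64)  # stand-in for a rendered test result (CHW, values in [0, 1])
    writer.add_image("test_results/sample", image, global_step=step)
writer.close()
```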

- 12 May, 2021 1 commit
Luis Perez authored
Synchronize PyTorchLightning/pytorch-lightning (revision 7b283e3c@master) to github/third-party/PyTorchLightning/pytorch-lightning

Summary:

# Manual
- Remove FIXMEs in `model_checkpoint.py`, `parameter_monitor.py`, `test_quantization.py`, and `speed_monitor.py` now that `Trainer` is properly annotated.
- Update `test_quantization.py` to use `trainer.train_loop.global_step` instead of `trainer.global_step`, which is read-only.
- Update `loop_callback.py` to read `batch_idx` from `train_loop`, since it is no longer available on the trainer.

# Automatic

### New commit log messages
7b283e3c Bugfix/Multiple dataloaders (#7433)
d7c44cc6 Docs: sync chlog 1.3.1 (#7478)
fdf50a5e Mark certain Trainer APIs as protected (#7420)
ad9118f0 remove trainer hidden state | sanity refactor [1 / n] (#7437)
4a1134db Log epoch metrics before firing the `on_evaluation_end` hook (#7272)
b65ae794 Automatically check `DataModule.has_{setup,teardown,prepare_data}` [2/2] (#7238)
8660d8cf [pre-commit.ci] pre-commit autoupdate (#7475)
f6fe715e Fix Sphinx argument deprecation (#7464)

Reviewed By: shuyingsunshine21

Differential Revision: D28353491

fbshipit-source-id: 98b87d99e2f09b47b07270858fcbdb5d5299730b
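The `global_step` change in the manual section looks roughly like this at the revision this sync pins (a sketch; the surrounding test code is not shown in the diff):

```python
from pytorch_lightning import Trainer

trainer = Trainer(max_epochs=1)

# `Trainer.global_step` is now a read-only property that delegates to the
# train loop, so writes must go through `trainer.train_loop` instead:
# trainer.global_step = 100   # AttributeError: can't set attribute
trainer.train_loop.global_step = 100
assert trainer.global_step == 100
```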

- 28 Apr, 2021 1 commit
Ananth Subramaniam authored
Synchronize PyTorchLightning/pytorch-lightning (revision 7fe8d184@master) to github/third-party/PyTorchLightning/pytorch-lightning

Summary:

### New commit log messages
7fe8d184 Do not `shuffle` in `LightningDataModule.from_datasets` for `IterableDataset` (#7053)
bab72255 [fix] Add barriers before and after setup hook is run (#7202)
f920ba29 [bugfix] Metric not logged properly in manual optimization (#7228)
e147127c [feat] Add better support for predict + ddp 2/3 (#7215)
ca6c87ff Add back `clip_gradients(model)` (#7231)
3b36d81c Fixed `num_sanity_val_steps` affecting reproducibility of training data shuffling (#7014)
5cf9afa1 Add fairscale install msg for Sharded Plugins (#7213)
52a5cee0 Set smarter default for DDP sharded for performance optimization (#6937)
dd5ec75e Deprecate save_function from model checkpoint callback (#7201)
ac7d6a35 Fix `NeptuneLogger.log_text(step=None)` (#7194)
6be0a859 Update teardown for TPU acc (#7211)
bc3f08b0 [fix] Add barrier to accelerator's teardown (#6814)
68eac4d9 Enforce Lightning module as source of truth for automatic optimization (#7130)
44d775fc Update Error message for ProfileConnector (#7204)
31fcd7d0 Deprecate write_predictions on the LightningModule (#7066)
591b9cee make bug_report_model minimal (#7191)
b3fe8366 Move metrics_to_scalars to a dedicated utilities file (#7180)
f58865aa Properly set `LightningModule.device` after model replacement (#7188)
8439aead Update FairScale on CI (#7017)
92af3632 Fix `lr_finder` suggesting too high learning rates (#7076)
d534e53e add missing predict docs (#7150)

Reviewed By: kazhang

Differential Revision: D28032962

fbshipit-source-id: 18cd01e8ecc13fe25f0890ac0f4b20c3c3e1fed3

- 21 Apr, 2021 1 commit
Kai Zhang authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/46

As titled. The test is flaky because the TensorBoard logger might still be writing to the temporary folder when we tear the folder down.

Reviewed By: ananthsub

Differential Revision: D27844504

fbshipit-source-id: 3987f9ec3cd05b2f193e75cd4d85109a46f4ee71
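The race being fixed can be avoided with a pattern like the following (a sketch, assuming the test owns the writer; not the actual test code):

```python
import shutil
import tempfile

from torch.utils.tensorboard import SummaryWriter

log_dir = tempfile.mkdtemp()
writer = SummaryWriter(log_dir=log_dir)
writer.add_scalar("loss", 0.5, global_step=0)

# Close (and thereby flush) the writer before deleting the directory.
# Without this, the background event-file writer may still be writing when
# teardown removes the folder, which is exactly the flakiness described above.
writer.close()
shutil.rmtree(log_dir)
```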

- 20 Apr, 2021 1 commit
Kai Zhang authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/49

Reviewed By: wat3rBro

Differential Revision: D27875007

fbshipit-source-id: 2f61a4a3de29f3583a54adc914ee5a7eb605a823

- 19 Apr, 2021 1 commit
Peizhao Zhang authored
Summary: Added a registry of functions that add hooks for training:

* TRAINER_HOOKS_REGISTRY: holds the hook-adding functions for the trainer; every function in the registry is called to add hooks.
* Each registered function has the signature `func(hooks: List[HookBase]) -> None`.

Reviewed By: zhanghang1989

Differential Revision: D27560806

fbshipit-source-id: fcfa02623bfd08508b6083db2d318d08f7e3c0b8
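A minimal sketch of the contract described above. The `func(hooks: List[HookBase]) -> None` signature is from the summary; the registry and `HookBase` below are simplified stand-ins for whatever d2go actually uses:

```python
from typing import Callable, List


class HookBase:
    """Stand-in for the trainer hook base class (e.g. detectron2's HookBase)."""


# Stand-in registry: a list of functions, each of which appends hooks in place.
TRAINER_HOOKS_REGISTRY: List[Callable[[List[HookBase]], None]] = []


def register_trainer_hooks(func: Callable[[List[HookBase]], None]):
    """Decorator that adds a hook-adding function to the registry."""
    TRAINER_HOOKS_REGISTRY.append(func)
    return func


@register_trainer_hooks
def add_my_hooks(hooks: List[HookBase]) -> None:
    hooks.append(HookBase())  # append project-specific hooks here


# At trainer-build time, every registered function is called to add its hooks.
hooks: List[HookBase] = []
for fn in TRAINER_HOOKS_REGISTRY:
    fn(hooks)
```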

- 17 Apr, 2021 1 commit
Kai Zhang authored
Summary: Delegate the FX quantization callback's customization to the model.

Reviewed By: wat3rBro

Differential Revision: D27669212

fbshipit-source-id: 2715546cf03134896da6f95ecddaf8503ff95d0b
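One way to read "delegate customization to the model" (a sketch only; the hook name `custom_prepare_fx` is hypothetical, and the fallback uses the stock FX prepare API from the PyTorch of that era):

```python
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx


def prepare_model_for_fx_quant(model: torch.nn.Module) -> torch.nn.Module:
    # If the model defines its own FX preparation hook, defer to it.
    # `custom_prepare_fx` is a hypothetical name, not the one in this diff.
    if hasattr(model, "custom_prepare_fx"):
        return model.custom_prepare_fx()
    # Otherwise fall back to the default FX graph-mode preparation.
    qconfig_dict = {"": get_default_qconfig("fbgemm")}
    return prepare_fx(model.eval(), qconfig_dict)
```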

- 14 Apr, 2021 1 commit
Ananth Subramaniam authored
Synchronize PyTorchLightning/pytorch-lightning (revision 0b843848@master) to github/third-party/PyTorchLightning/pytorch-lightning

Summary:

### New commit log messages

## [UnReleased] - 2021-MM-DD

### Added

- Added a more explicit exception message when trying to execute `trainer.test()` or `trainer.validate()` with `fast_dev_run=True` ([#6667](https://github.com/PyTorchLightning/pytorch-lightning/pull/6667))
- Added `LightningCLI` class to provide simple reproducibility with minimal boilerplate for training CLIs ([#4492](https://github.com/PyTorchLightning/pytorch-lightning/pull/4492))
- Trigger a warning when a non-metric value logged across multiple processes hasn't been reduced ([#6417](https://github.com/PyTorchLightning/pytorch-lightning/pull/6417))
- Added `gradient_clip_algorithm` argument to Trainer for gradient clipping by value ([#6123](https://github.com/PyTorchLightning/pytorch-lightning/pull/6123))
- Added a way to print to the terminal without breaking up the progress bar ([#5470](https://github.com/PyTorchLightning/pytorch-lightning/pull/5470))
- Added support for checkpointing after training steps in the `ModelCheckpoint` callback ([#6146](https://github.com/PyTorchLightning/pytorch-lightning/pull/6146))
- Added `checkpoint` parameter to the callback's `on_save_checkpoint` hook ([#6072](https://github.com/PyTorchLightning/pytorch-lightning/pull/6072))
- Added `RunningStage.SANITY_CHECKING` ([#4945](https://github.com/PyTorchLightning/pytorch-lightning/pull/4945))
- Added `TrainerState.{FITTING,VALIDATING,TESTING,PREDICTING,TUNING}` ([#4945](https://github.com/PyTorchLightning/pytorch-lightning/pull/4945))
- Added `Trainer.validate()` method to perform one evaluation epoch over the validation set ([#4948](https://github.com/PyTorchLightning/pytorch-lightning/pull/4948))
- Added `LightningEnvironment` for Lightning-specific DDP ([#5915](https://github.com/PyTorchLightning/pytorch-lightning/pull/5915))
- Added `teardown()` hook to LightningDataModule ([#4673](https://github.com/PyTorchLightning/pytorch-lightning/pull/4673))
- Added `auto_insert_metric_name` parameter to `ModelCheckpoint` ([#6277](https://github.com/PyTorchLightning/pytorch-lightning/pull/6277))
- Added an argument to `self.log` that lets users give custom names when dealing with multiple dataloaders ([#6274](https://github.com/PyTorchLightning/pytorch-lightning/pull/6274))
- Added `teardown` method to `BaseProfiler` so subclasses can define post-profiling steps outside of `__del__` ([#6370](https://github.com/PyTorchLightning/pytorch-lightning/pull/6370))
- Added `setup` method to `BaseProfiler` so subclasses can define pre-profiling steps for every process ([#6633](https://github.com/PyTorchLightning/pytorch-lightning/pull/6633))
- Added a no-return warning to predict ([#6139](https://github.com/PyTorchLightning/pytorch-lightning/pull/6139))
- Added `Trainer.predict` config validation ([#6543](https://github.com/PyTorchLightning/pytorch-lightning/pull/6543))
- Added `AbstractProfiler` interface ([#6621](https://github.com/PyTorchLightning/pytorch-lightning/pull/6621))
- Added support for including module names for forward in the autograd trace of `PyTorchProfiler` ([#6349](https://github.com/PyTorchLightning/pytorch-lightning/pull/6349))
- Added support for the PyTorch 1.8.1 autograd profiler ([#6618](https://github.com/PyTorchLightning/pytorch-lightning/pull/6618))
- Added `outputs` parameter to the callback's `on_validation_epoch_end` & `on_test_epoch_end` hooks ([#6120](https://github.com/PyTorchLightning/pytorch-lightning/pull/6120))
- Added `configure_sharded_model` hook ([#6679](https://github.com/PyTorchLightning/pytorch-lightning/pull/6679))
- Added support for `precision=64`, enabling training with double precision ([#6595](https://github.com/PyTorchLightning/pytorch-lightning/pull/6595))
- Added support for DDP communication hooks ([#6736](https://github.com/PyTorchLightning/pytorch-lightning/issues/6736))
- Added `artifact_location` argument to `MLFlowLogger`, which is passed to the `MlflowClient.create_experiment` call ([#6677](https://github.com/PyTorchLightning/pytorch-lightning/pull/6677))
- Added `model` parameter to precision plugins' `clip_gradients` signature ([#6764](https://github.com/PyTorchLightning/pytorch-lightning/pull/6764))

### Changed

- Renamed `pytorch_lightning.callbacks.swa` to `pytorch_lightning.callbacks.stochastic_weight_avg` ([#6259](https://github.com/PyTorchLightning/pytorch-lightning/pull/6259))
- Refactored `RunningStage` and `TrainerState` usage ([#4945](https://github.com/PyTorchLightning/pytorch-lightning/pull/4945))
- Changed `trainer.evaluating` to return `True` if validating or testing ([#4945](https://github.com/PyTorchLightning/pytorch-lightning/pull/4945))
- Changed the `setup()` and `teardown()` stage argument to take any of `{fit,validate,test,predict}` ([#6386](https://github.com/PyTorchLightning/pytorch-lightning/pull/6386))
- Changed profilers to save separate report files per state and rank ([#6621](https://github.com/PyTorchLightning/pytorch-lightning/pull/6621))
- Changed `PyTorchProfiler` to use `torch.autograd.profiler.record_function` to record functions ([#6349](https://github.com/PyTorchLightning/pytorch-lightning/pull/6349))

### Deprecated

- Deprecated `period` in favor of `every_n_val_epochs` in the `ModelCheckpoint` callback ([#6146](https://github.com/PyTorchLightning/pytorch-lightning/pull/6146))
- Deprecated `trainer.running_sanity_check` in favor of `trainer.sanity_checking` ([#4945](https://github.com/PyTorchLightning/pytorch-lightning/pull/4945))
- Deprecated `Profiler(output_filename)` in favor of `dirpath` and `filename` ([#6621](https://github.com/PyTorchLightning/pytorch-lightning/pull/6621))
- Deprecated `PytorchProfiler(profiled_functions)` in favor of `record_functions` ([#6349](https://github.com/PyTorchLightning/pytorch-lightning/pull/6349))
- Deprecated metrics in favor of `torchmetrics` ([#6505](https://github.com/PyTorchLightning/pytorch-lightning/pull/6505), [#6530](https://github.com/PyTorchLightning/pytorch-lightning/pull/6530), [#6540](https://github.com/PyTorchLightning/pytorch-lightning/pull/6540), [#6547](https://github.com/PyTorchLightning/pytorch-lightning/pull/6547), [#6515](https://github.com/PyTorchLightning/pytorch-lightning/pull/6515), [#6572](https://github.com/PyTorchLightning/pytorch-lightning/pull/6572), [#6573](https://github.com/PyTorchLightning/pytorch-lightning/pull/6573), [#6584](https://github.com/PyTorchLightning/pytorch-lightning/pull/6584), [#6636](https://github.com/PyTorchLightning/pytorch-lightning/pull/6636), [#6637](https://github.com/PyTorchLightning/pytorch-lightning/pull/6637), [#6649](https://github.com/PyTorchLightning/pytorch-lightning/pull/6649), [#6659](https://github.com/PyTorchLightning/pytorch-lightning/pull/6659))

### Removed

- Removed support for passing a bool value to the `profiler` argument of Trainer ([#6164](https://github.com/PyTorchLightning/pytorch-lightning/pull/6164))
- Removed the no-return warning from val/test step ([#6139](https://github.com/PyTorchLightning/pytorch-lightning/pull/6139))
- Removed passing a `ModelCheckpoint` instance to `Trainer(checkpoint_callback)` ([#6166](https://github.com/PyTorchLightning/pytorch-lightning/pull/6166))
- Removed deprecated Trainer arguments `enable_pl_optimizer` and `automatic_optimization` ([#6163](https://github.com/PyTorchLightning/pytorch-lightning/pull/6163))
- Removed deprecated metrics ([#6161](https://github.com/PyTorchLightning/pytorch-lightning/pull/6161))
  * from `pytorch_lightning.metrics.functional.classification` removed `to_onehot`, `to_categorical`, `get_num_classes`, `roc`, `multiclass_roc`, `average_precision`, `precision_recall_curve`, `multiclass_precision_recall_curve`
  * from `pytorch_lightning.metrics.functional.reduction` removed `reduce`, `class_reduce`
- Removed deprecated `ModelCheckpoint` arguments `prefix`, `mode="auto"` ([#6162](https://github.com/PyTorchLightning/pytorch-lightning/pull/6162))
- Removed `mode='auto'` from `EarlyStopping` ([#6167](https://github.com/PyTorchLightning/pytorch-lightning/pull/6167))
- Removed legacy references for magic keys in the `Result` object ([#6016](https://github.com/PyTorchLightning/pytorch-lightning/pull/6016))
- Removed deprecated `LightningModule` `hparams` setter ([#6207](https://github.com/PyTorchLightning/pytorch-lightning/pull/6207))
- Removed legacy code to log or include metrics in the progress bar by returning them in a dict with the `"log"/"progress_bar"` magic keys; use `self.log` instead ([#6734](https://github.com/PyTorchLightning/pytorch-lightning/pull/6734))
- Removed `optimizer_idx` argument from `training_step` in manual optimization ([#6093](https://github.com/PyTorchLightning/pytorch-lightning/pull/6093))

### Fixed

- Set better defaults for `rank_zero_only.rank` when training is launched with SLURM and torchelastic ([#6802](https://github.com/PyTorchLightning/pytorch-lightning/pull/6802/))
- Made the `Plugin.reduce` method more consistent across all plugins to reflect a mean-reduction by default ([#6011](https://github.com/PyTorchLightning/pytorch-lightning/pull/6011))
- Moved the lightning module to the correct device type when using `LightningDistributedWrapper` ([#6070](https://github.com/PyTorchLightning/pytorch-lightning/pull/6070))
- Do not print the top-k verbose log with `ModelCheckpoint(monitor=None)` ([#6109](https://github.com/PyTorchLightning/pytorch-lightning/pull/6109))
- Fixed the csv extension check ([#6436](https://github.com/PyTorchLightning/pytorch-lightning/pull/6436))
- Fixed `ModelCheckpoint(monitor=None, save_last=True)` not saving checkpoints ([#6136](https://github.com/PyTorchLightning/pytorch-lightning/pull/6136))
- Fixed `ModelCheckpoint(save_top_k=0, save_last=True)` not saving the `last` checkpoint ([#6136](https://github.com/PyTorchLightning/pytorch-lightning/pull/6136))
- Fixed `.teardown(stage='fit')` getting called during `trainer.test` ([#6386](https://github.com/PyTorchLightning/pytorch-lightning/pull/6386))
- Fixed `.on_fit_{start,end}()` getting called during `trainer.test` ([#6386](https://github.com/PyTorchLightning/pytorch-lightning/pull/6386))
- Fixed LightningModule `all_gather` on CPU tensors ([#6416](https://github.com/PyTorchLightning/pytorch-lightning/pull/6416))
- Fixed torch distributed not being available in the setup hook for DDP ([#6506](https://github.com/PyTorchLightning/pytorch-lightning/pull/6506))
- Fixed `EarlyStopping` logic when the `min_epochs` or `min_steps` requirement is not met ([#6705](https://github.com/PyTorchLightning/pytorch-lightning/pull/6705))

## [1.2.7] - 2021-04-06

### Fixed

- Fixed a bug with omegaconf and `xm.save` ([#6741](https://github.com/PyTorchLightning/pytorch-lightning/pull/6741))
- Fixed an issue with IterableDataset when `__len__` is not defined ([#6828](https://github.com/PyTorchLightning/pytorch-lightning/pull/6828))
- Sanitize None params during pruning ([#6836](https://github.com/PyTorchLightning/pytorch-lightning/pull/6836))
- Enforce an epoch scheduler interval when using SWA ([#6588](https://github.com/PyTorchLightning/pytorch-lightning/pull/6588))
- Fixed TPU Colab hang issue post training ([#6816](https://github.com/PyTorchLightning/pytorch-lightning/pull/6816))
- Fixed a bug where `TensorBoardLogger` would give a warning and not log correctly to a symbolic link `save_dir` ([#6730](https://github.com/PyTorchLightning/pytorch-lightning/pull/6730))

## [1.2.6] - 2021-03-30

### Changed

- Changed the behavior of `on_epoch_start` to run at the beginning of validation & test epochs ([#6498](https://github.com/PyTorchLightning/pytorch-lightning/pull/6498))

### Removed

- Removed legacy code to include `step` dictionary returns in `callback_metrics`; use `self.log_dict` instead ([#6682](https://github.com/PyTorchLightning/pytorch-lightning/pull/6682))

### Fixed

- Fixed `DummyLogger.log_hyperparams` raising a `TypeError` when running with `fast_dev_run=True` ([#6398](https://github.com/PyTorchLightning/pytorch-lightning/pull/6398))
- Fixed an error on TPUs when there was no `ModelCheckpoint` ([#6654](https://github.com/PyTorchLightning/pytorch-lightning/pull/6654))
- Fixed `trainer.test` freezing on TPUs ([#6654](https://github.com/PyTorchLightning/pytorch-lightning/pull/6654))
- Fixed a bug where gradients were disabled after calling `Trainer.predict` ([#6657](https://github.com/PyTorchLightning/pytorch-lightning/pull/6657))
- Fixed a bug where no TPUs were detected in a TPU pod environment ([#6719](https://github.com/PyTorchLightning/pytorch-lightning/pull/6719))

## [1.2.5] - 2021-03-23

### Changed

- Updated gradient clipping for the TPU accelerator ([#6576](https://github.com/PyTorchLightning/pytorch-lightning/pull/6576))
- Refactored `setup` to be typing-friendly ([#6590](https://github.com/PyTorchLightning/pytorch-lightning/pull/6590))

### Fixed

- Fixed a bug where `all_gather` would not work correctly with `tpu_cores=8` ([#6587](https://github.com/PyTorchLightning/pytorch-lightning/pull/6587))
- Fixed comparing required versions ([#6434](https://github.com/PyTorchLightning/pytorch-lightning/pull/6434))
- Fixed duplicate logs appearing in the console when using the python logging module ([#6275](https://github.com/PyTorchLightning/pytorch-lightning/pull/6275))
- Added autocast in validation, test and predict modes for Native AMP ([#6565](https://github.com/PyTorchLightning/pytorch-lightning/pull/6565))

Reviewed By: shuyingsunshine21

Differential Revision: D27528929

fbshipit-source-id: 311c88f71461c2c79bbf185e28d7a6d683ccc26f
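Two of the user-facing changes above, as before/after sketches (argument names per the changelog; the model and data are omitted):

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelCheckpoint

# Deprecated by #6146: ModelCheckpoint(period=2)
checkpoint = ModelCheckpoint(every_n_val_epochs=2)  # new argument name

trainer = Trainer(callbacks=[checkpoint], max_epochs=4)

# New in this release (#4948): one evaluation epoch over the validation set.
# trainer.validate(model)
```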

- 09 Apr, 2021 1 commit
Ananth Subramaniam authored
Summary: `checkpoint_callback` now only accepts boolean values: https://github.com/PyTorchLightning/pytorch-lightning/blob/19e67d18c472c3a03dec4dd9bfcef031e9ca8719/pytorch_lightning/trainer/connectors/callback_connector.py#L65-L73

Reviewed By: shuyingsunshine21

Differential Revision: D27682178

fbshipit-source-id: 9e863aad7a23a76dee8ae5df9f5a78e7a94bfe8a
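Concretely, under this change a `ModelCheckpoint` instance belongs in `callbacks`, while `checkpoint_callback` stays boolean (a sketch):

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelCheckpoint

# No longer accepted: Trainer(checkpoint_callback=ModelCheckpoint(...))
trainer = Trainer(
    checkpoint_callback=True,  # booleans only: enable/disable checkpointing
    callbacks=[ModelCheckpoint(monitor="val_loss")],  # instances go here
)
```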

- 31 Mar, 2021 1 commit
Kai Zhang authored
Reviewed By: newstzpz

Differential Revision: D27255960

fbshipit-source-id: 1699ff23d2bc610dffc0215a90a7c1c17e3783c3

- 30 Mar, 2021 1 commit
Sam Tsai authored
Summary: Separate unit tests into individual folders based on functionality.

Reviewed By: wat3rBro

Differential Revision: D27132567

fbshipit-source-id: 9a8200be530ca14c7ef42191d59795b05b9800cc