1. 28 Oct, 2021 1 commit
    • Fix unused param in QAT training · 8b03f9aa
      Kai Zhang authored
      Summary:
      In the quantization callback, we prepare the model with the FX quantization API and only use the prepared model during training.
      However, when training with DDP, the parameters in the original model still require grad, causing an unused-parameters RuntimeError.
      Previously, the Lightning trainer trained the model with the find_unused_param flag enabled, but if a user manually disabled it, they would hit the runtime error.
      
      In this diff, the parameters in the original model are frozen. We could consider deleting the original model after preparation to save memory, but that would require making assumptions about the Lightning module structure, for example that `.model` is the original model, so that we could `delattr(pl_module, "model")`.
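      A minimal sketch of the idea, assuming the Lightning module keeps the original model under `.model` and using the pre-1.13 FX QAT API (the actual d2go callback may differ):

      ```python
      import copy

      import torch.nn as nn
      from torch.quantization import get_default_qat_qconfig
      from torch.quantization.quantize_fx import prepare_qat_fx


      def prepare_for_qat(pl_module: nn.Module) -> nn.Module:
          """Return a QAT-prepared copy of `pl_module.model` and freeze the original."""
          original = pl_module.model  # assumed attribute name, see note above
          qconfig_dict = {"": get_default_qat_qconfig("fbgemm")}
          prepared = prepare_qat_fx(copy.deepcopy(original).train(), qconfig_dict)

          # The original model's parameters never receive gradients, so leaving
          # requires_grad=True makes DDP raise the unused-parameters RuntimeError
          # unless find_unused_parameters is enabled; freeze them instead.
          for p in original.parameters():
              p.requires_grad_(False)
          return prepared
      ```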
      
      Reviewed By: wat3rBro
      
      Differential Revision: D31902368
      
      fbshipit-source-id: 56eabb6b2296278529dd2b94d6aa4c9ec9e9ca6b
  2. 06 Oct, 2021 1 commit
  3. 12 May, 2021 1 commit
    • Synchronize PyTorchLightning/pytorch-lightning (revision 7b283e3c@master) to... · 0848c589
      Luis Perez authored
      Synchronize PyTorchLightning/pytorch-lightning (revision 7b283e3c@master) to github/third-party/PyTorchLightning/pytorch-lightning
      
      Summary:
      # Manual
      - remove FIXMEs in `model_checkpoint.py`, `parameter_monitor.py`, `test_quantization.py`, and `speed_monitor.py` now that `Trainer` is properly annotated.
      - update `test_quantization.py` to use `trainer.train_loop.global_step` instead of `trainer.global_step`, which is now read-only.
      - update `loop_callback.py` to read `batch_idx` from `train_loop`, since it is no longer available directly on the trainer (see the sketch below).
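      A rough sketch of the `train_loop` reads, using a hypothetical callback (names are illustrative, not the actual `loop_callback.py` code):

      ```python
      import pytorch_lightning as pl


      class StepLoggingCallback(pl.Callback):
          """Hypothetical callback reading step state from the train loop."""

          def on_train_batch_end(self, trainer, pl_module, *args, **kwargs):
              # `batch_idx` is no longer available directly on the trainer;
              # read it from the train loop instead.
              batch_idx = trainer.train_loop.batch_idx
              # `trainer.global_step` is read-only at this revision; code that
              # needs to advance the step (e.g. test_quantization.py) goes
              # through `trainer.train_loop.global_step` instead.
              print(f"batch {batch_idx} done at global step {trainer.global_step}")
      ```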
      
      # Automatic
      ### New commit log messages
        7b283e3c Bugfix/Multiple dataloaders (#7433)
        d7c44cc6 Docs: sync chlog 1.3.1 (#7478)
        fdf50a5e Mark certain Trainer APIs as protected (#7420)
        ad9118f0 remove trainer hidden state | sanity refactor [1 / n] (#7437)
        4a1134db Log epoch metrics before firing the `on_evaluation_end` hook (#7272)
        b65ae794 Automatically check `DataModule.has_{setup,teardown,prepare_data}` [2/2] (#7238)
        8660d8cf [pre-commit.ci] pre-commit autoupdate (#7475)
        f6fe715e Fix Sphinx argument deprecation (#7464)
      
      Reviewed By: shuyingsunshine21
      
      Differential Revision: D28353491
      
      fbshipit-source-id: 98b87d99e2f09b47b07270858fcbdb5d5299730b
  4. 28 Apr, 2021 1 commit
    • Synchronize PyTorchLightning/pytorch-lightning (revision 7fe8d184@master) to... · a95c7983
      Ananth Subramaniam authored
      Synchronize PyTorchLightning/pytorch-lightning (revision 7fe8d184@master) to github/third-party/PyTorchLightning/pytorch-lightning
      
      Summary:
      ### New commit log messages
        7fe8d184 Do not `shuffle` in `LightningDataModule.from_datasets` for `IterableDataset` (#7053)
        bab72255 [fix] Add barriers before and after setup hook is run (#7202)
        f920ba29 [bugfix] Metric not logged properly in manual optimization (#7228)
        e147127c [feat] Add better support for predict + ddp 2/3 (#7215)
        ca6c87ff Add back `clip_gradients(model)` (#7231)
        3b36d81c Fixed `num_sanity_val_steps` affecting reproducibility of training data shuffling (#7014)
        5cf9afa1 Add fairscale install msg for Sharded Plugins (#7213)
        52a5cee0 Set smarter default for DDP sharded for performance optimization (#6937)
        dd5ec75e Deprecate save_function from model checkpoint callback (#7201)
        ac7d6a35 Fix `NeptuneLogger.log_text(step=None)` (#7194)
        6be0a859 Update teardown for TPU acc (#7211)
        bc3f08b0 [fix] Add barrier to accelerator's teardown (#6814)
        68eac4d9 Enforce Lightning module as source of truth for automatic optimization (#7130)
        44d775fc Update Error message for ProfileConnector (#7204)
        31fcd7d0 Deprecate write_predictions on the LightningModule (#7066)
        591b9cee make bug_report_model minimal (#7191)
        b3fe8366 Move metrics_to_scalars to a dedicated utilities file (#7180)
        f58865aa Properly set `LightningModule.device` after model replacement (#7188)
        8439aead Update FairScale on CI (#7017)
        92af3632 Fix `lr_finder` suggesting too high learning rates (#7076)
        d534e53e add missing predict docs (#7150)
      
      Reviewed By: kazhang
      
      Differential Revision: D28032962
      
      fbshipit-source-id: 18cd01e8ecc13fe25f0890ac0f4b20c3c3e1fed3
  5. 09 Apr, 2021 1 commit
  6. 31 Mar, 2021 1 commit
  7. 30 Mar, 2021 1 commit
    • reorganize unit tests · a0658c4a
      Sam Tsai authored
      Summary: Separate unit tests into individual folders based on functionality.
      
      Reviewed By: wat3rBro
      
      Differential Revision: D27132567
      
      fbshipit-source-id: 9a8200be530ca14c7ef42191d59795b05b9800cc
  8. 20 Mar, 2021 1 commit
    • move test utils to core library · 9d238344
      Yanghan Wang authored
      Summary: d2go.tests is not a library for OSS, so move the test utils code to d2go.utils.testing.
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D26706933
      
      fbshipit-source-id: 85767b66bbb6c67db05e11823beb4840220b2aa3
  9. 03 Mar, 2021 1 commit
    • Copy quantization callback to D2go · 5d8068d8
      Kai Zhang authored
      Summary: As titled. Make a copy of the quantization callback to unblock D2go OSS.
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D26735525
      
      fbshipit-source-id: 12b77f04cfa1361e856b26ea218a262da1fadd88