1. 26 Apr, 2022 1 commit
  2. 19 Apr, 2022 1 commit
  3. 16 Mar, 2022 1 commit
  4. 08 Mar, 2022 2 commits
  5. 05 Mar, 2022 1 commit
  6. 04 Mar, 2022 1 commit
  7. 23 Feb, 2022 1 commit
  8. 13 Jan, 2022 1 commit
    • Add support for custom training step via meta_arch · b6e244d2
      Tsahi Glik authored
      Summary:
      Add support in the default lightning task for running a custom training step from the meta arch, if one exists.
      The goal is to allow a custom training step without needing to inherit from the default lightning task class and override it. This lets us keep a single lightning task while still letting users customize the training step. In the long run this will be further encapsulated in a modeling hook, making it more modular and composable with other custom code.
      
      This change is a follow-up to the discussion in https://fburl.com/diff/yqlsypys
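      A minimal sketch of the delegation described above (assuming a `custom_training_step` attribute on the meta arch; the names here are illustrative, not the exact D2Go API):

      ```python
      import pytorch_lightning as pl


      class DefaultTask(pl.LightningModule):
          """Default lightning task wrapping a meta arch (illustrative)."""

          def __init__(self, model):
              super().__init__()
              self.model = model

          def training_step(self, batch, batch_idx):
              # Delegate to the meta arch's own training step when it defines
              # one, so users need not subclass the default task to customize it.
              custom_step = getattr(self.model, "custom_training_step", None)
              if custom_step is not None:
                  return custom_step(batch, batch_idx)
              # Default behavior: the meta arch returns a dict of losses.
              loss_dict = self.model(batch)
              return sum(loss_dict.values())
      ```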
      
      Reviewed By: wat3rBro
      
      Differential Revision: D33534624
      
      fbshipit-source-id: 560f06da03f218e77ad46832be9d741417882c56
  9. 29 Dec, 2021 1 commit
  10. 18 Nov, 2021 1 commit
    • remove deprecated train_loop (#10482) · bb49d171
      Ananth Subramaniam authored
      Summary:
      ### New commit log messages
        fa0ed17f8 remove deprecated train_loop (#10482)
      
      Reviewed By: kandluis
      
      Differential Revision: D32454980
      
      fbshipit-source-id: a35237dde06cc9ddac5373b75992ce88a6771c76
  11. 28 Oct, 2021 1 commit
    • Fix unused param in QAT training · 8b03f9aa
      Kai Zhang authored
      Summary:
      In the quantization callback, we prepare the model with the FX quantization API and use only the prepared model in training.
      However, when training with DDP, the parameters in the original model still require grad, causing an unused-parameters RuntimeError.
      Previously, the Lightning trainer trained the model with the find_unused_param flag, but if a user manually disabled it, they would hit this runtime error.
      
      In this diff, the parameters in the original model are frozen. We could consider deleting the original model after preparation to save memory, but that would require assumptions about the Lightning module structure, for example that `.model` is the original model, so that we could `delattr(pl_module, "model")`.
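      A minimal sketch of the freezing step, assuming the original model is stored as `pl_module.model` as discussed above:

      ```python
      def freeze_original_model(pl_module) -> None:
          # Only the FX-prepared copy is trained; freeze the original model so
          # DDP does not flag its parameters as unused when
          # find_unused_parameters=False.
          for param in pl_module.model.parameters():
              param.requires_grad = False
      ```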
      
      Reviewed By: wat3rBro
      
      Differential Revision: D31902368
      
      fbshipit-source-id: 56eabb6b2296278529dd2b94d6aa4c9ec9e9ca6b
  12. 20 Oct, 2021 2 commits
    • Supported learnable qat. · f6ce583e
      Peizhao Zhang authored
      Summary:
      Supported learnable QAT.
      * Added a config key `QUANTIZATION.QAT.FAKE_QUANT_METHOD` to specify the QAT method (`default` or `learnable`).
      * Added a config key `QUANTIZATION.QAT.ENABLE_LEARNABLE_OBSERVER_ITER` to specify the start iteration for learnable observers (static observers are used before that).
      * Custom quantization code needs to call `d2go.utils.qat_utils.get_qat_qconfig()` to get the proper qconfig for learnable QAT. An exception is raised if the QAT method is learnable but no learnable observers are used in the model.
      * The weight decay for scale/zero_point is automatically set to 0 in the optimizer.
      * The sequence for learnable QAT: enable static observers -> enable fake quant -> enable learnable observers -> freeze BN. (A config sketch follows below.)
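      A hedged example of turning this on (the runner entry point and the iteration value are assumptions for illustration):

      ```python
      from d2go.runner import GeneralizedRCNNRunner  # assumed entry point

      cfg = GeneralizedRCNNRunner().get_default_cfg()
      cfg.QUANTIZATION.QAT.FAKE_QUANT_METHOD = "learnable"
      # Static observers are used up to iteration 1000, learnable ones after.
      cfg.QUANTIZATION.QAT.ENABLE_LEARNABLE_OBSERVER_ITER = 1000
      ```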
      
      Differential Revision: D31370822
      
      fbshipit-source-id: a5a5044a539d0d7fe1cc6b36e6821fc411ce752a
    • Refactored qat related code. · ef9c20cc
      Peizhao Zhang authored
      Summary:
      Refactored QAT-related code.
      * Moved the `_prepare_model_for_qat` logic into a function.
      * Moved the `_setup_non_qat_to_qat_state_dict_map` logic into a function.
      * Moved the QATHook code to the quantization file and implemented it as a class (see the sketch below).
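      A rough sketch of the hook-as-a-class shape (method names follow detectron2's `HookBase`; the specific config keys and toggles here are assumptions):

      ```python
      import torch
      import torch.nn.intrinsic.qat
      from detectron2.engine import HookBase


      class QATHook(HookBase):
          """Toggle QAT phases at configured iterations (illustrative)."""

          def __init__(self, cfg):
              self.cfg = cfg

          def before_step(self):
              qat_cfg = self.cfg.QUANTIZATION.QAT
              model = self.trainer.model
              if self.trainer.iter == qat_cfg.ENABLE_OBSERVER_ITER:
                  model.apply(torch.quantization.enable_observer)
              if self.trainer.iter == qat_cfg.FREEZE_BN_ITER:
                  model.apply(torch.nn.intrinsic.qat.freeze_bn_stats)
      ```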
      
      Differential Revision: D31370819
      
      fbshipit-source-id: 836550b2c8d68cd93a84d5877ad9cef6f0f0eb39
  13. 06 Oct, 2021 1 commit
  14. 17 May, 2021 1 commit
    • add dataset visualization · 536e9d25
      Kai Zhang authored
      Summary: Add dataset visualization so that we can visualize test results in TensorBoard.
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D28457363
      
      fbshipit-source-id: 4c2fd9dce349c6fb9e1cec51c9138cf0abb45d7b
  15. 12 May, 2021 1 commit
    • Synchronize PyTorchLightning/pytorch-lightning (revision 7b283e3c@master) to github/third-party/PyTorchLightning/pytorch-lightning · 0848c589
      Luis Perez authored
      
      Summary:
      # Manual
      - remove the FIXMEs in `model_checkpoint.py`, `parameter_monitor.py`, `test_quantization.py`, and `speed_monitor.py` now that `Trainer` is properly annotated.
      - update `test_quantization.py` to set `trainer.train_loop.global_step` instead of `trainer.global_step`, which is read-only (see the sketch below).
      - update `loop_callback.py` to read `batch_idx` from `train_loop`, since it is no longer available on the trainer.
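      For illustration, the kind of change the second bullet above describes (a minimal sketch against this Lightning revision):

      ```python
      from pytorch_lightning import Trainer


      def bump_global_step(trainer: Trainer) -> None:
          # Trainer.global_step is read-only at this revision, so tests that
          # advance the counter must write through the train loop instead.
          trainer.train_loop.global_step += 1
      ```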
      
      # Automatic
      ### New commit log messages
        7b283e3c Bugfix/Multiple dataloaders (#7433)
        d7c44cc6 Docs: sync chlog 1.3.1 (#7478)
        fdf50a5e Mark certain Trainer APIs as protected (#7420)
        ad9118f0 remove trainer hidden state | sanity refactor [1 / n] (#7437)
        4a1134db Log epoch metrics before firing the `on_evaluation_end` hook (#7272)
        b65ae794 Automatically check `DataModule.has_{setup,teardown,prepare_data}` [2/2] (#7238)
        8660d8cf [pre-commit.ci] pre-commit autoupdate (#7475)
        f6fe715e Fix Sphinx argument deprecation (#7464)
      
      Reviewed By: shuyingsunshine21
      
      Differential Revision: D28353491
      
      fbshipit-source-id: 98b87d99e2f09b47b07270858fcbdb5d5299730b
  16. 28 Apr, 2021 1 commit
    • Synchronize PyTorchLightning/pytorch-lightning (revision 7fe8d184@master) to github/third-party/PyTorchLightning/pytorch-lightning · a95c7983
      Ananth Subramaniam authored
      
      Summary:
      ### New commit log messages
        7fe8d184 Do not `shuffle` in `LightningDataModule.from_datasets` for `IterableDataset` (#7053)
        bab72255 [fix] Add barriers before and after setup hook is run (#7202)
        f920ba29 [bugfix] Metric not logged properly in manual optimization (#7228)
        e147127c [feat] Add better support for predict + ddp 2/3 (#7215)
        ca6c87ff Add back `clip_gradients(model)` (#7231)
        3b36d81c Fixed `num_sanity_val_steps` affecting reproducibility of training data shuffling (#7014)
        5cf9afa1 Add fairscale install msg for Sharded Plugins (#7213)
        52a5cee0 Set smarter default for DDP sharded for performance optimization (#6937)
        dd5ec75e Deprecate save_function from model checkpoint callback (#7201)
        ac7d6a35 Fix `NeptuneLogger.log_text(step=None)` (#7194)
        6be0a859 Update teardown for TPU acc (#7211)
        bc3f08b0 [fix] Add barrier to accelerator's teardown (#6814)
        68eac4d9 Enforce Lightning module as source of truth for automatic optimization (#7130)
        44d775fc Update Error message for ProfileConnector (#7204)
        31fcd7d0 Deprecate write_predictions on the LightningModule (#7066)
        591b9cee make bug_report_model minimal (#7191)
        b3fe8366 Move metrics_to_scalars to a dedicated utilities file (#7180)
        f58865aa Properly set `LightningModule.device` after model replacement (#7188)
        8439aead Update FairScale on CI (#7017)
        92af3632 Fix `lr_finder` suggesting too high learning rates (#7076)
        d534e53e add missing predict docs (#7150)
      
      Reviewed By: kazhang
      
      Differential Revision: D28032962
      
      fbshipit-source-id: 18cd01e8ecc13fe25f0890ac0f4b20c3c3e1fed3
  17. 21 Apr, 2021 1 commit
  18. 20 Apr, 2021 1 commit
  19. 19 Apr, 2021 1 commit
    • Added hooks to report training progress to fblearner and keep alive. · bd6043ee
      Peizhao Zhang authored
      Summary:
      * Added a registry of functions that add hooks for training.
        * TRAINER_HOOKS_REGISTRY: a list of functions that add hooks to the trainer; every function in the registry is called to add its hooks.
        * Each function has the signature `func(hooks: List[HookBase]) -> None` (a usage sketch follows below).
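      A hedged usage sketch (registration follows detectron2's `Registry` convention; the import path and the hook class are hypothetical):

      ```python
      from typing import List

      from detectron2.engine import HookBase
      from d2go.runner.default_runner import TRAINER_HOOKS_REGISTRY  # assumed path


      class KeepAliveHook(HookBase):  # hypothetical hook
          def after_step(self):
              pass  # e.g. report progress / send a heartbeat to the scheduler


      @TRAINER_HOOKS_REGISTRY.register()
      def add_keep_alive_hooks(hooks: List[HookBase]) -> None:
          # Called by the trainer with its current hook list; mutate in place.
          hooks.append(KeepAliveHook())
      ```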
      
      Reviewed By: zhanghang1989
      
      Differential Revision: D27560806
      
      fbshipit-source-id: fcfa02623bfd08508b6083db2d318d08f7e3c0b8
  20. 17 Apr, 2021 1 commit
    • Delegate to model's customization · aeb24a92
      Kai Zhang authored
      Summary: Delegate the FX quantization callback's customization to the model.
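      One plausible shape for this delegation (the `custom_prepare_fx` method name is an assumption, not a confirmed D2Go API):

      ```python
      from torch.quantization.quantize_fx import prepare_qat_fx


      def prepare_for_qat(pl_module, qconfig_dict):
          # Let the model override FX preparation when it defines the hook;
          # otherwise fall back to the callback's default prepare path.
          if hasattr(pl_module.model, "custom_prepare_fx"):
              pl_module.model = pl_module.model.custom_prepare_fx(qconfig_dict)
          else:
              pl_module.model = prepare_qat_fx(pl_module.model, qconfig_dict)
          return pl_module
      ```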
      
      Reviewed By: wat3rBro
      
      Differential Revision: D27669212
      
      fbshipit-source-id: 2715546cf03134896da6f95ecddaf8503ff95d0b
  21. 14 Apr, 2021 1 commit
  22. 09 Apr, 2021 1 commit
  23. 31 Mar, 2021 1 commit
  24. 30 Mar, 2021 1 commit
    • reorganize unit tests · a0658c4a
      Sam Tsai authored
      Summary: Separate unit tests into individual folders based on functionality.
      
      Reviewed By: wat3rBro
      
      Differential Revision: D27132567
      
      fbshipit-source-id: 9a8200be530ca14c7ef42191d59795b05b9800cc