1. 02 Aug, 2022 1 commit
    • remove get_task · f8bf5280
      Jeremy Reizenstein authored
      Summary: Remove the dataset's need to provide the task type.
      
      Reviewed By: davnov134, kjchalup
      
      Differential Revision: D38314000
      
      fbshipit-source-id: 3805d885b5d4528abdc78c0da03247edb9abf3f7
  2. 01 Aug, 2022 1 commit
    • Better seeding of random engines · 80fc0ee0
      David Novotny authored
      Summary: Currently, seeds are set only inside the train loop. This does not ensure that the model weights are initialized the same way in every run, which makes experiments irreproducible. This diff sets the seeds before the model is created.
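
The ordering issue described in this summary can be illustrated with a minimal sketch. It uses only the standard-library generator; `seed_everything`, `init_weights` and `run_experiment` are hypothetical names and are not taken from the Implicitron code.

```python
import random
from typing import List


def seed_everything(seed: int) -> None:
    """Seed the random engines in use (hypothetical helper).

    Only the standard-library generator is seeded here; a real training
    script would seed its other engines (for example NumPy and PyTorch)
    in the same place.
    """
    random.seed(seed)


def init_weights(n: int) -> List[float]:
    """Stand-in for model construction: draws n random initial weights."""
    return [random.gauss(0.0, 1.0) for _ in range(n)]


def run_experiment(seed: int) -> List[float]:
    # Seed before the model is built, not just before the training loop,
    # so that weight initialization is reproducible as well.
    seed_everything(seed)
    return init_weights(4)


first = run_experiment(seed=0)
second = run_experiment(seed=0)
print(first == second)
```

If the seed were set only after `init_weights` had run, two runs with the same seed could still start from different weights.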
      
      Reviewed By: bottler
      
      Differential Revision: D38315840
      
      fbshipit-source-id: 3d2ecebbc36072c2b68dd3cd8c5e30708e7dd808
  3. 30 Jul, 2022 1 commit
    • Replace pluggable components to create a proper Configurable hierarchy. · 1b0584f7
      Krzysztof Chalupka authored
      Summary:
      This large diff rewrites a significant portion of Implicitron's config hierarchy. The new hierarchy, and some of the default implementation classes, are as follows:
      ```
      Experiment
          data_source: ImplicitronDataSource
              dataset_map_provider
              data_loader_map_provider
          model_factory: ImplicitronModelFactory
              model: GenericModel
          optimizer_factory: ImplicitronOptimizerFactory
          training_loop: ImplicitronTrainingLoop
              evaluator: ImplicitronEvaluator
      ```
      
      1) Experiment (used to be ExperimentConfig) is now a top-level Configurable and contains as members mainly (mostly new) high-level factory Configurables.
      2) Experiment's job is to run factories, do some accelerate setup and then pass the results to the main training loop.
      3) ImplicitronOptimizerFactory and ImplicitronModelFactory are new high-level factories that create the optimizer, scheduler, model, and stats objects.
      4) TrainingLoop is a new configurable that runs the main training loop and the inner train-validate step.
      5) Evaluator is a new configurable that TrainingLoop uses to run validation/test steps.
      6) GenericModel is not the only model choice anymore. Instead, ImplicitronModelBase (by default instantiated with GenericModel) is a member of Experiment and can be easily replaced by a custom implementation by the user.
      
      All the new Configurables are children of ReplaceableBase, and can be easily replaced with custom implementations.
      
      In addition, I added support for the exponential LR schedule, updated the config files and the test, and added a config file that reproduces NeRF results together with a test that runs the repro experiment.
      
      Reviewed By: bottler
      
      Differential Revision: D37723227
      
      fbshipit-source-id: b36bee880d6aa53efdd2abfaae4489d8ab1e8a27
  4. 21 Jul, 2022 1 commit
    • lazy all_train_cameras · 3783437d
      Jeremy Reizenstein authored
      Summary: Avoid calculating all_train_cameras before it is needed, because it is slow in some datasets.
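
One common way to defer an expensive value until first use is a cached property. The sketch below shows that pattern in isolation; `DatasetSketch` and its string "cameras" are hypothetical stand-ins, and this is not necessarily how the dataset code implements it.

```python
from functools import cached_property
from typing import List


class DatasetSketch:
    """Hypothetical dataset whose camera list is expensive to build."""

    def __init__(self, n_frames: int) -> None:
        self.n_frames = n_frames
        self.compute_calls = 0  # counts how often the slow path runs

    @cached_property
    def all_train_cameras(self) -> List[str]:
        # Runs on first access only; the result is then stored on the instance.
        self.compute_calls += 1
        return [f"camera_{i}" for i in range(self.n_frames)]


dataset = DatasetSketch(n_frames=3)
print(dataset.compute_calls)      # nothing has been computed yet
print(dataset.all_train_cameras)  # first access triggers the computation
print(dataset.compute_calls)
```

Code paths that never touch `all_train_cameras` pay nothing for it.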
      
      Reviewed By: shapovalov
      
      Differential Revision: D38037157
      
      fbshipit-source-id: 95461226655cde2626b680661951ab17ebb0ec75
  5. 17 Jul, 2022 1 commit
    • option to avoid accelerate · 9b2e5705
      Jeremy Reizenstein authored
      Summary: For debugging, introduce PYTORCH3D_NO_ACCELERATE env var.
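
A switch of this kind is typically read once at start-up. The sketch below assumes that the mere presence of the variable disables accelerate; the commit message does not say how the value is interpreted, and `accelerate_disabled` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional


def accelerate_disabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when PYTORCH3D_NO_ACCELERATE is present in the environment.

    Assumption: presence alone is enough, whatever the value is.
    """
    env = os.environ if environ is None else environ
    return env.get("PYTORCH3D_NO_ACCELERATE") is not None


print(accelerate_disabled({}))
print(accelerate_disabled({"PYTORCH3D_NO_ACCELERATE": "1"}))
```

Passing the mapping explicitly keeps the check testable without modifying the real process environment.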
      
      Reviewed By: shapovalov
      
      Differential Revision: D37885393
      
      fbshipit-source-id: de080080c0aa4b6d874028937083a0113bb97c23
  6. 13 Jul, 2022 1 commit
    • Fix: making visualisation work again · 4261e59f
      Roman Shapovalov authored
      Summary:
      1. Respecting the `visdom_show_preds` parameter when it is False.
      2. Clipping the images pre-visualisation, which is important for methods like SRN that are not aware of the pixel value range.
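
The clipping step in item 2 amounts to clamping each predicted pixel value into the displayable range. A minimal sketch on nested lists follows; the range [0, 1] and the name `clip_image` are assumptions for illustration, since the commit message does not state them.

```python
from typing import List


def clip_image(pixels: List[List[float]], lo: float = 0.0, hi: float = 1.0) -> List[List[float]]:
    """Clamp every pixel value into [lo, hi] before it is visualised."""
    return [[min(max(value, lo), hi) for value in row] for row in pixels]


# A prediction from a model that is unaware of the valid range may overshoot.
prediction = [[-0.2, 0.5], [1.3, 1.0]]
print(clip_image(prediction))
```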
      
      Reviewed By: bottler
      
      Differential Revision: D37786439
      
      fbshipit-source-id: 8dbb5104290bcc5c2829716b663cae17edc911bd
  7. 12 Jul, 2022 1 commit
    • Updates to support Accelerate and multigpu training (#37) · aa8b03f3
      Nikhila Ravi authored
      Summary:
      ## Changes:
      - Added Accelerate Library and refactored experiment.py to use it
      - Needed to move `init_optimizer` and `ExperimentConfig` to a separate file to be compatible with submitit/hydra
      - Needed to make some modifications to data loaders etc. to work well with the accelerate DDP wrappers
      - Loading/saving checkpoints now incorporates an unwrapping step to remove the DDP wrapper from the model
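
The unwrapping step in the last item can be sketched without any distributed setup. PyTorch's DistributedDataParallel keeps the wrapped model in a `.module` attribute, and Accelerate offers `Accelerator.unwrap_model` for the same purpose; the classes and functions below are hypothetical stand-ins that only mirror that idea.

```python
from typing import Any, Dict


class TinyModel:
    """Stand-in for a model that can report its own state."""

    def __init__(self) -> None:
        self.weights = [0.1, 0.2]

    def state_dict(self) -> Dict[str, Any]:
        return {"weights": list(self.weights)}


class WrapperSketch:
    """Hypothetical stand-in for a distributed wrapper holding the model in `.module`."""

    def __init__(self, module: Any) -> None:
        self.module = module


def unwrap(model: Any) -> Any:
    """Peel off wrapper layers until the underlying model is reached."""
    while isinstance(model, WrapperSketch):
        model = model.module
    return model


def checkpoint_state(model: Any) -> Dict[str, Any]:
    # Unwrap first so the saved state belongs to the bare model and can be
    # loaded later without the wrapper being present.
    return unwrap(model).state_dict()


model = TinyModel()
print(checkpoint_state(WrapperSketch(model)) == checkpoint_state(model))
```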
      
      ## Tests
      
      Tested with both `torchrun` and `submitit/hydra` on two gpus locally. Here are the commands:
      
      **Torchrun**
      
      Modules loaded:
      ```sh
      1) anaconda3/2021.05   2) cuda/11.3   3) NCCL/2.9.8-3-cuda.11.3   4) gcc/5.2.0 (but unload gcc when using submitit)
      ```
      
      ```sh
      torchrun --nnodes=1 --nproc_per_node=2 experiment.py --config-path ./configs --config-name repro_singleseq_nerf_test
      ```
      
      **Submitit/Hydra Local test**
      
      ```sh
      ~/pytorch3d/projects/implicitron_trainer$ HYDRA_FULL_ERROR=1 python3.9 experiment.py --config-name repro_singleseq_nerf_test --multirun --config-path ./configs  hydra/launcher=submitit_local hydra.launcher.gpus_per_node=2 hydra.launcher.tasks_per_node=2 hydra.launcher.nodes=1
      ```
      
      **Submitit/Hydra distributed test**
      
      ```sh
      ~/implicitron/pytorch3d$ python3.9 experiment.py --config-name repro_singleseq_nerf_test --multirun --config-path ./configs  hydra/launcher=submitit_slurm hydra.launcher.gpus_per_node=8 hydra.launcher.tasks_per_node=8 hydra.launcher.nodes=1 hydra.launcher.partition=learnlab hydra.launcher.timeout_min=4320
      ```
      
      ## TODOS:
      - Fix distributed evaluation: this currently doesn't work because the input format of the evaluation function is not suitable for gathering across GPUs (it needs to be a nested list/tuple/dict of objects that satisfy `is_torch_tensor`), whereas `frame_data` currently contains objects of type `Cameras`.
      - Refactor the `accelerator` object to be accessible by all functions instead of needing to pass it around everywhere? Maybe have a `Trainer` class and add it as a method?
      - Update readme with installation instructions for accelerate and also commands for running jobs with torchrun and submitit/hydra
      
      X-link: https://github.com/fairinternal/pytorch3d/pull/37
      
      Reviewed By: davnov134, kjchalup
      
      Differential Revision: D37543870
      
      Pulled By: bottler
      
      fbshipit-source-id: be9eb4e91244d4fe3740d87dafec622ae1e0cf76
  8. 06 Jul, 2022 3 commits
    • extract camera_difficulty_bin_breaks · efb72132
      Jeremy Reizenstein authored
      Summary: As part of removing Task, move camera difficulty bin breaks from hard code to the top level.
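
Passing the bin breaks in as a parameter, rather than hard-coding them, might look like the sketch below. The break values, the meaning of the score and the name `difficulty_bin` are all hypothetical; only the idea of configurable breaks comes from the commit message.

```python
from bisect import bisect_right
from typing import Sequence


def difficulty_bin(score: float, bin_breaks: Sequence[float]) -> int:
    """Return the index of the bin that `score` falls into.

    `bin_breaks` must be sorted in ascending order; n breaks define n + 1 bins.
    A score equal to a break is placed in the higher bin.
    """
    return bisect_right(bin_breaks, score)


breaks = (0.5, 0.8)  # illustrative values only
for score in (0.3, 0.5, 0.9):
    print(score, difficulty_bin(score, breaks))
```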
      
      Reviewed By: davnov134
      
      Differential Revision: D37491040
      
      fbshipit-source-id: f2d6775ebc490f6f75020d13f37f6b588cc07a0b
    • typing for trainer · 40fb189c
      Jeremy Reizenstein authored
      Summary: Enable pyre checking of the trainer code.
      
      Reviewed By: shapovalov
      
      Differential Revision: D36545438
      
      fbshipit-source-id: db1ea8d1ade2da79a2956964eb0c7ba302fa40d1
    • get_all_train_cameras · 4e87c2b7
      Jeremy Reizenstein authored
      Summary: As part of removing Task, make the dataset code generate the source cameras for itself. There's a small optimization available here, in that the JsonIndexDataset could avoid loading images.
      
      Reviewed By: shapovalov
      
      Differential Revision: D37313423
      
      fbshipit-source-id: 3e5e0b2aabbf9cc51f10547a3523e98c72ad8755
  9. 16 Jun, 2022 1 commit
    • loading llff and blender datasets · 65f667fd
      Jeremy Reizenstein authored
      Summary: Copy code from NeRF for loading LLFF data and Blender synthetic data, and create dataset objects for them.
      
      Reviewed By: shapovalov
      
      Differential Revision: D35581039
      
      fbshipit-source-id: af7a6f3e9a42499700693381b5b147c991f57e5d
  10. 10 Jun, 2022 2 commits
    • test configs are loadable · 023a2369
      Jeremy Reizenstein authored
      Summary: Add test that the yaml files deserialize.
      
      Reviewed By: davnov134
      
      Differential Revision: D36830673
      
      fbshipit-source-id: b785d8db97b676686036760bfa2dd3fa638bda57
    • make ExperimentConfig Configurable · c0f88e04
      Jeremy Reizenstein authored
      Summary: Preparing for pluggables in experiment.py
      
      Reviewed By: davnov134
      
      Differential Revision: D36830674
      
      fbshipit-source-id: eab499d1bc19c690798fbf7da547544df7e88fa5
  11. 26 May, 2022 1 commit
    • test runner for experiment.py · c31bf85a
      Jeremy Reizenstein authored
      Summary: Add simple interactive testrunner for experiment.py
      
      Reviewed By: shapovalov
      
      Differential Revision: D35316221
      
      fbshipit-source-id: d424bcba632eef89eefb56e18e536edb58ec6f85
  12. 25 May, 2022 1 commit
  13. 20 May, 2022 4 commits
    • data_loader_map_provider · 0f12c516
      Jeremy Reizenstein authored
      Summary: replace dataloader_zoo with a pluggable DataLoaderMapProvider.
      
      Reviewed By: shapovalov
      
      Differential Revision: D36475441
      
      fbshipit-source-id: d16abb190d876940434329928f2e3f2794a25416
    • dataset_map_provider · 79c61a2d
      Jeremy Reizenstein authored
      Summary: replace dataset_zoo with a pluggable DatasetMapProvider. The logic is now in annotated_file_dataset_map_provider.
      
      Reviewed By: shapovalov
      
      Differential Revision: D36443965
      
      fbshipit-source-id: 9087649802810055e150b2fbfcc3c197a761f28a
    • New file for ImplicitronDatasetBase · 69c6d06e
      Jeremy Reizenstein authored
      Summary: Separate ImplicitronDatasetBase and FrameData (to be used by all data sources) from ImplicitronDataset (which is specific).
      
      Reviewed By: shapovalov
      
      Differential Revision: D36413111
      
      fbshipit-source-id: 3725744cde2e08baa11aff4048237ba10c7efbc6
    • data_source · 73dc109d
      Jeremy Reizenstein authored
      Summary:
      Move dataset_args and dataloader_args from ExperimentConfig into a new member called data_source so that it can contain replaceables.
      
      Also add enum Task for task type.
      
      Reviewed By: shapovalov
      
      Differential Revision: D36201719
      
      fbshipit-source-id: 47d6967bfea3b7b146b6bbd1572e0457c9365871
  14. 13 May, 2022 1 commit
  15. 09 May, 2022 1 commit
  16. 06 Apr, 2022 1 commit
  17. 31 Mar, 2022 1 commit
  18. 25 Mar, 2022 1 commit
    • Return a typed structured config from default_args for callables · 645a47d0
      Roman Shapovalov authored
      Summary:
      Before this fix, running get_default_args(C: Callable) returned an unstructured DictConfig, which caused Enums to be handled incorrectly. This diff returns a typed structured config instead.

      WIP update: Currently, tests still fail whenever a function signature contains an untyped argument. This still needs to be fixed.
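
Why types matter for a callable's defaults can be shown with the standard library alone. The sketch below builds a typed dataclass from a function signature, so an Enum default stays an Enum, and it rejects untyped arguments, which mirrors the limitation noted in the WIP update. It is not the `get_default_args` implementation; `typed_defaults`, `Mode` and `render` are hypothetical, and the sketch assumes every parameter has a default.

```python
import dataclasses
import enum
import inspect
from typing import Any, Callable


class Mode(enum.Enum):
    FAST = "fast"
    SLOW = "slow"


def render(mode: Mode = Mode.FAST, steps: int = 10) -> str:
    return f"{mode.name}:{steps}"


def typed_defaults(fn: Callable[..., Any]) -> type:
    """Build a dataclass whose fields mirror the typed, defaulted arguments of fn."""
    fields = []
    for name, param in inspect.signature(fn).parameters.items():
        if param.annotation is inspect.Parameter.empty:
            # Without an annotation there is no type to attach to the field.
            raise TypeError(f"argument {name!r} of {fn.__name__} is untyped")
        if param.default is inspect.Parameter.empty:
            raise ValueError(f"argument {name!r} of {fn.__name__} has no default")
        fields.append((name, param.annotation, dataclasses.field(default=param.default)))
    return dataclasses.make_dataclass(f"{fn.__name__}_args", fields)


args = typed_defaults(render)()
print(type(args.mode).__name__, args.steps)
print(render(**dataclasses.asdict(args)))
```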
      
      Reviewed By: bottler
      
      Differential Revision: D34932124
      
      fbshipit-source-id: ecdc45c738633cfea5caa7480ba4f790ece931e8
  19. 21 Mar, 2022 1 commit