1. 18 Aug, 2022 1 commit
    • remove stray "generic_model_args" references · fdaaa299
      Jeremy Reizenstein authored
      Summary:
      generic_model_args no longer exists. Update some references to it, mostly in doc.
      
      This makes the testing of all the yaml files in test_forward pass.
      
      Reviewed By: shapovalov
      
      Differential Revision: D38789202
      
      fbshipit-source-id: f11417efe772d7f86368b3598aa66c52b1309dbf
  2. 02 Aug, 2022 1 commit
    • Move load_stats to TrainingLoop · c3f8dad5
      David Novotny authored
      Summary:
      Stats are logically connected to the training loop, not to the model. Hence, moving to the training loop.
      
      Also removing resume_epoch from OptimizerFactory in favor of a single place, ModelFactory. This removes the need for config consistency checks, etc.
      
      Reviewed By: kjchalup
      
      Differential Revision: D38313475
      
      fbshipit-source-id: a1d188a63e28459df381ff98ad8acdcdb14887b7
  3. 30 Jul, 2022 1 commit
    • Replace pluggable components to create a proper Configurable hierarchy. · 1b0584f7
      Krzysztof Chalupka authored
      Summary:
      This large diff rewrites a significant portion of Implicitron's config hierarchy. The new hierarchy, and some of the default implementation classes, are as follows:
      ```
      Experiment
          data_source: ImplicitronDataSource
              dataset_map_provider
              data_loader_map_provider
          model_factory: ImplicitronModelFactory
              model: GenericModel
          optimizer_factory: ImplicitronOptimizerFactory
          training_loop: ImplicitronTrainingLoop
              evaluator: ImplicitronEvaluator
      ```
      
      1) Experiment (used to be ExperimentConfig) is now a top-level Configurable and contains as members mainly (mostly new) high-level factory Configurables.
      2) Experiment's job is to run factories, do some accelerate setup and then pass the results to the main training loop.
      3) ImplicitronOptimizerFactory and ImplicitronModelFactory are new high-level factories that create the optimizer, scheduler, model, and stats objects.
      4) TrainingLoop is a new configurable that runs the main training loop and the inner train-validate step.
      5) Evaluator is a new configurable that TrainingLoop uses to run validation/test steps.
      6) GenericModel is not the only model choice anymore. Instead, ImplicitronModelBase (by default instantiated with GenericModel) is a member of Experiment and can easily be replaced by a user's custom implementation.
      
      All the new Configurables are children of ReplaceableBase, and can be easily replaced with custom implementations.
      
      In addition, I added support for the exponential LR schedule, updated the config files and the test, and added a config file that reproduces NeRF results together with a test that runs the repro experiment.
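
      To make the ReplaceableBase mechanism concrete, here is a minimal sketch of plugging in a custom model. The import paths, class names, and the override key mentioned afterwards are assumptions based on Implicitron's registry conventions, not part of this diff.

      ```python
      # Sketch only: check pytorch3d.implicitron for the exact base-class
      # locations and the methods a model implementation must provide.
      from pytorch3d.implicitron.models.base_model import ImplicitronModelBase
      from pytorch3d.implicitron.tools.config import registry


      @registry.register
      class MyCustomModel(ImplicitronModelBase):
          # Annotated fields automatically become config options.
          my_option: int = 1

          def forward(self, **kwargs):
              # Placeholder; a real model would compute and return predictions.
              raise NotImplementedError
      ```

      Once registered, an override along the lines of `model_factory_ImplicitronModelFactory_args.model_class_type=MyCustomModel` (key name assumed) should select the custom class instead of GenericModel.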
      
      Reviewed By: bottler
      
      Differential Revision: D37723227
      
      fbshipit-source-id: b36bee880d6aa53efdd2abfaae4489d8ab1e8a27
  4. 17 Jul, 2022 1 commit
    • option to avoid accelerate · 9b2e5705
      Jeremy Reizenstein authored
      Summary: For debugging, introduce PYTORCH3D_NO_ACCELERATE env var.
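
      A rough sketch, not the actual experiment.py code, of how such an environment-variable switch is typically consumed:

      ```python
      import os

      from accelerate import Accelerator

      # If PYTORCH3D_NO_ACCELERATE is set to anything non-empty, skip the
      # Accelerate wrapper and run in a plain single process for debugging.
      if os.environ.get("PYTORCH3D_NO_ACCELERATE"):
          accelerator = None
      else:
          accelerator = Accelerator()
      ```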
      
      Reviewed By: shapovalov
      
      Differential Revision: D37885393
      
      fbshipit-source-id: de080080c0aa4b6d874028937083a0113bb97c23
  5. 13 Jul, 2022 1 commit
    • Fix: making visualisation work again · 4261e59f
      Roman Shapovalov authored
      Summary:
      1. Respecting the `visdom_show_preds` parameter when it is False.
      2. Clipping the images before visualisation (see the sketch below), which is important for methods like SRN that are not aware of the pixel value range.
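
      A minimal sketch of the clipping step, assuming image tensors whose values are meant to lie in [0, 1]; the function name is illustrative:

      ```python
      import torch


      def clip_for_visualisation(image: torch.Tensor) -> torch.Tensor:
          # Predictions from methods such as SRN can stray outside [0, 1];
          # clamp them so the visualiser renders them sensibly.
          return image.clamp(0.0, 1.0)
      ```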
      
      Reviewed By: bottler
      
      Differential Revision: D37786439
      
      fbshipit-source-id: 8dbb5104290bcc5c2829716b663cae17edc911bd
  6. 12 Jul, 2022 1 commit
    • Updates to support Accelerate and multigpu training (#37) · aa8b03f3
      Nikhila Ravi authored
      Summary:
      ## Changes:
      - Added the Accelerate library and refactored experiment.py to use it
      - Needed to move `init_optimizer` and `ExperimentConfig` to a separate file to be compatible with submitit/hydra
      - Needed to make some modifications to data loaders etc. to work well with the Accelerate DDP wrappers
      - Loading/saving checkpoints incorporates an unwrapping step to remove the DDP-wrapped model (sketched below)
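
      A sketch of that unwrapping step using Accelerate's public API; the function and argument names are illustrative, not the ones used in experiment.py:

      ```python
      import torch
      from accelerate import Accelerator


      def save_unwrapped_checkpoint(accelerator: Accelerator, model, path: str) -> None:
          # Strip the DDP wrapper so the saved state_dict has plain parameter
          # names and can be loaded again in a single-process run.
          unwrapped = accelerator.unwrap_model(model)
          if accelerator.is_main_process:
              torch.save(unwrapped.state_dict(), path)
      ```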
      
      ## Tests
      
      Tested with both `torchrun` and `submitit/hydra` on two GPUs locally. Here are the commands:
      
      **Torchrun**
      
      Modules loaded:
      ```sh
      1) anaconda3/2021.05   2) cuda/11.3   3) NCCL/2.9.8-3-cuda.11.3   4) gcc/5.2.0 (but unload gcc when using submitit)
      ```
      
      ```sh
      torchrun --nnodes=1 --nproc_per_node=2 experiment.py --config-path ./configs --config-name repro_singleseq_nerf_test
      ```
      
      **Submitit/Hydra Local test**
      
      ```sh
      ~/pytorch3d/projects/implicitron_trainer$ HYDRA_FULL_ERROR=1 python3.9 experiment.py --config-name repro_singleseq_nerf_test --multirun --config-path ./configs  hydra/launcher=submitit_local hydra.launcher.gpus_per_node=2 hydra.launcher.tasks_per_node=2 hydra.launcher.nodes=1
      ```
      
      **Submitit/Hydra distributed test**
      
      ```sh
      ~/implicitron/pytorch3d$ python3.9 experiment.py --config-name repro_singleseq_nerf_test --multirun --config-path ./configs  hydra/launcher=submitit_slurm hydra.launcher.gpus_per_node=8 hydra.launcher.tasks_per_node=8 hydra.launcher.nodes=1 hydra.launcher.partition=learnlab hydra.launcher.timeout_min=4320
      ```
      
      ## TODOS:
      - Fix distributed evaluation: this currently doesn't work because the input to the evaluation function is not suitable for gathering across GPUs (it needs to be nested lists/tuples/dicts of objects that satisfy `is_torch_tensor`), and `frame_data` currently contains the `Cameras` type.
      - Refactor the `accelerator` object to be accessible by all functions instead of needing to pass it around everywhere? Maybe have a `Trainer` class and add it as a method?
      - Update readme with installation instructions for accelerate and also commands for running jobs with torchrun and submitit/hydra
      
      X-link: https://github.com/fairinternal/pytorch3d/pull/37
      
      Reviewed By: davnov134, kjchalup
      
      Differential Revision: D37543870
      
      Pulled By: bottler
      
      fbshipit-source-id: be9eb4e91244d4fe3740d87dafec622ae1e0cf76
  7. 06 Jul, 2022 1 commit
    • typing for trainer · 40fb189c
      Jeremy Reizenstein authored
      Summary: Enable pyre checking of the trainer code.
      
      Reviewed By: shapovalov
      
      Differential Revision: D36545438
      
      fbshipit-source-id: db1ea8d1ade2da79a2956964eb0c7ba302fa40d1
  8. 25 May, 2022 1 commit
  9. 20 May, 2022 3 commits
    • dataset_map_provider · 79c61a2d
      Jeremy Reizenstein authored
      Summary: replace dataset_zoo with a pluggable DatasetMapProvider. The logic is now in annotated_file_dataset_map_provider.
      
      Reviewed By: shapovalov
      
      Differential Revision: D36443965
      
      fbshipit-source-id: 9087649802810055e150b2fbfcc3c197a761f28a
    • New file for ImplicitronDatasetBase · 69c6d06e
      Jeremy Reizenstein authored
      Summary: Separate ImplicitronDatasetBase and FrameData (to be used by all data sources) from ImplicitronDataset (which is a specific implementation).
      
      Reviewed By: shapovalov
      
      Differential Revision: D36413111
      
      fbshipit-source-id: 3725744cde2e08baa11aff4048237ba10c7efbc6
    • data_source · 73dc109d
      Jeremy Reizenstein authored
      Summary:
      Move dataset_args and dataloader_args from ExperimentConfig into a new member called data_source so that it can contain replaceables.
      
      Also add enum Task for task type.
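
      A sketch of what such a task enum amounts to; the member names and string values below are assumptions, not copied from the diff:

      ```python
      from enum import Enum


      class Task(Enum):
          # Reconstruct a single scene vs. learn across many sequences of a category.
          SINGLE_SEQUENCE = "singlesequence"
          MULTI_SEQUENCE = "multisequence"
      ```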
      
      Reviewed By: shapovalov
      
      Differential Revision: D36201719
      
      fbshipit-source-id: 47d6967bfea3b7b146b6bbd1572e0457c9365871
  10. 13 May, 2022 1 commit
  11. 09 May, 2022 1 commit
  12. 24 Mar, 2022 1 commit
  13. 21 Mar, 2022 1 commit