- 02 Aug, 2022 1 commit
-
-
Jeremy Reizenstein authored
Summary: Remove the dataset's need to provide the task type. Reviewed By: davnov134, kjchalup Differential Revision: D38314000 fbshipit-source-id: 3805d885b5d4528abdc78c0da03247edb9abf3f7
-
- 01 Aug, 2022 1 commit
-
-
David Novotny authored
Summary: Currently, seeds are set only inside the train loop. But this does not ensure that the model weights are initialized the same way everywhere which makes all experiments irreproducible. This diff fixes it. Reviewed By: bottler Differential Revision: D38315840 fbshipit-source-id: 3d2ecebbc36072c2b68dd3cd8c5e30708e7dd808
-
- 30 Jul, 2022 1 commit
-
-
Krzysztof Chalupka authored
Summary: This large diff rewrites a significant portion of Implicitron's config hierarchy. The new hierarchy, and some of the default implementation classes, are as follows: ``` Experiment data_source: ImplicitronDataSource dataset_map_provider data_loader_map_provider model_factory: ImplicitronModelFactory model: GenericModel optimizer_factory: ImplicitronOptimizerFactory training_loop: ImplicitronTrainingLoop evaluator: ImplicitronEvaluator ``` 1) Experiment (used to be ExperimentConfig) is now a top-level Configurable and contains as members mainly (mostly new) high-level factory Configurables. 2) Experiment's job is to run factories, do some accelerate setup and then pass the results to the main training loop. 3) ImplicitronOptimizerFactory and ImplicitronModelFactory are new high-level factories that create the optimizer, scheduler, model, and stats objects. 4) TrainingLoop is a new configurable that runs the main training loop and the inner train-validate step. 5) Evaluator is a new configurable that TrainingLoop uses to run validation/test steps. 6) GenericModel is not the only model choice anymore. Instead, ImplicitronModelBase (by default instantiated with GenericModel) is a member of Experiment and can be easily replaced by a custom implementation by the user. All the new Configurables are children of ReplaceableBase, and can be easily replaced with custom implementations. In addition, I added support for the exponential LR schedule, updated the config files and the test, as well as added a config file that reproduces NERF results and a test to run the repro experiment. Reviewed By: bottler Differential Revision: D37723227 fbshipit-source-id: b36bee880d6aa53efdd2abfaae4489d8ab1e8a27
-
- 21 Jul, 2022 1 commit
-
-
Jeremy Reizenstein authored
Summary: Avoid calculating all_train_cameras before it is needed, because it is slow in some datasets. Reviewed By: shapovalov Differential Revision: D38037157 fbshipit-source-id: 95461226655cde2626b680661951ab17ebb0ec75
-
- 17 Jul, 2022 1 commit
-
-
Jeremy Reizenstein authored
Summary: For debugging, introduce PYTORCH3D_NO_ACCELERATE env var. Reviewed By: shapovalov Differential Revision: D37885393 fbshipit-source-id: de080080c0aa4b6d874028937083a0113bb97c23
-
- 13 Jul, 2022 1 commit
-
-
Roman Shapovalov authored
Summary: 1. Respecting `visdom_show_preds` parameter when it is False. 2. Clipping the images pre-visualisation, which is important for methods like SRN that are not arare of pixel value range. Reviewed By: bottler Differential Revision: D37786439 fbshipit-source-id: 8dbb5104290bcc5c2829716b663cae17edc911bd
-
- 12 Jul, 2022 1 commit
-
-
Nikhila Ravi authored
Summary: ## Changes: - Added Accelerate Library and refactored experiment.py to use it - Needed to move `init_optimizer` and `ExperimentConfig` to a separate file to be compatible with submitit/hydra - Needed to make some modifications to data loaders etc to work well with the accelerate ddp wrappers - Loading/saving checkpoints incorporates an unwrapping step so remove the ddp wrapped model ## Tests Tested with both `torchrun` and `submitit/hydra` on two gpus locally. Here are the commands: **Torchrun** Modules loaded: ```sh 1) anaconda3/2021.05 2) cuda/11.3 3) NCCL/2.9.8-3-cuda.11.3 4) gcc/5.2.0. (but unload gcc when using submit) ``` ```sh torchrun --nnodes=1 --nproc_per_node=2 experiment.py --config-path ./configs --config-name repro_singleseq_nerf_test ``` **Submitit/Hydra Local test** ```sh ~/pytorch3d/projects/implicitron_trainer$ HYDRA_FULL_ERROR=1 python3.9 experiment.py --config-name repro_singleseq_nerf_test --multirun --config-path ./configs hydra/launcher=submitit_local hydra.launcher.gpus_per_node=2 hydra.launcher.tasks_per_node=2 hydra.launcher.nodes=1 ``` **Submitit/Hydra distributed test** ```sh ~/implicitron/pytorch3d$ python3.9 experiment.py --config-name repro_singleseq_nerf_test --multirun --config-path ./configs hydra/launcher=submitit_slurm hydra.launcher.gpus_per_node=8 hydra.launcher.tasks_per_node=8 hydra.launcher.nodes=1 hydra.launcher.partition=learnlab hydra.launcher.timeout_min=4320 ``` ## TODOS: - Fix distributed evaluation: currently this doesn't work as the input format to the evaluation function is not suitable for gathering across gpus (needs to be nested list/tuple/dicts of objects that satisfy `is_torch_tensor`) and currently `frame_data` contains `Cameras` type. - Refactor the `accelerator` object to be accessible by all functions instead of needing to pass it around everywhere? Maybe have a `Trainer` class and add it as a method? - Update readme with installation instructions for accelerate and also commands for running jobs with torchrun and submitit/hydra X-link: https://github.com/fairinternal/pytorch3d/pull/37 Reviewed By: davnov134, kjchalup Differential Revision: D37543870 Pulled By: bottler fbshipit-source-id: be9eb4e91244d4fe3740d87dafec622ae1e0cf76
-
- 06 Jul, 2022 3 commits
-
-
Jeremy Reizenstein authored
Summary: As part of removing Task, move camera difficulty bin breaks from hard code to the top level. Reviewed By: davnov134 Differential Revision: D37491040 fbshipit-source-id: f2d6775ebc490f6f75020d13f37f6b588cc07a0b
-
Jeremy Reizenstein authored
Summary: Enable pyre checking of the trainer code. Reviewed By: shapovalov Differential Revision: D36545438 fbshipit-source-id: db1ea8d1ade2da79a2956964eb0c7ba302fa40d1
-
Jeremy Reizenstein authored
Summary: As part of removing Task, make the dataset code generate the source cameras for itself. There's a small optimization available here, in that the JsonIndexDataset could avoid loading images. Reviewed By: shapovalov Differential Revision: D37313423 fbshipit-source-id: 3e5e0b2aabbf9cc51f10547a3523e98c72ad8755
-
- 16 Jun, 2022 1 commit
-
-
Jeremy Reizenstein authored
Summary: Copy code from NeRF for loading LLFF data and blender synthetic data, and create dataset objects for them Reviewed By: shapovalov Differential Revision: D35581039 fbshipit-source-id: af7a6f3e9a42499700693381b5b147c991f57e5d
-
- 10 Jun, 2022 2 commits
-
-
Jeremy Reizenstein authored
Summary: Add test that the yaml files deserialize. Reviewed By: davnov134 Differential Revision: D36830673 fbshipit-source-id: b785d8db97b676686036760bfa2dd3fa638bda57
-
Jeremy Reizenstein authored
Summary: Preparing for pluggables in experiment.py Reviewed By: davnov134 Differential Revision: D36830674 fbshipit-source-id: eab499d1bc19c690798fbf7da547544df7e88fa5
-
- 26 May, 2022 1 commit
-
-
Jeremy Reizenstein authored
Summary: Add simple interactive testrunner for experiment.py Reviewed By: shapovalov Differential Revision: D35316221 fbshipit-source-id: d424bcba632eef89eefb56e18e536edb58ec6f85
-
- 25 May, 2022 1 commit
-
-
Jeremy Reizenstein authored
Summary: The ImplicitronDataset class corresponds to JsonIndexDatasetMapProvider Reviewed By: shapovalov Differential Revision: D36661396 fbshipit-source-id: 80ca2ff81ef9ecc2e3d1f4e1cd14b6f66a7ec34d
-
- 20 May, 2022 4 commits
-
-
Jeremy Reizenstein authored
Summary: replace dataloader_zoo with a pluggable DataLoaderMapProvider. Reviewed By: shapovalov Differential Revision: D36475441 fbshipit-source-id: d16abb190d876940434329928f2e3f2794a25416
-
Jeremy Reizenstein authored
Summary: replace dataset_zoo with a pluggable DatasetMapProvider. The logic is now in annotated_file_dataset_map_provider. Reviewed By: shapovalov Differential Revision: D36443965 fbshipit-source-id: 9087649802810055e150b2fbfcc3c197a761f28a
-
Jeremy Reizenstein authored
Summary: Separate ImplicitronDatasetBase and FrameData (to be used by all data sources) from ImplicitronDataset (which is specific). Reviewed By: shapovalov Differential Revision: D36413111 fbshipit-source-id: 3725744cde2e08baa11aff4048237ba10c7efbc6
-
Jeremy Reizenstein authored
Summary: Move dataset_args and dataloader_args from ExperimentConfig into a new member called datasource so that it can contain replaceables. Also add enum Task for task type. Reviewed By: shapovalov Differential Revision: D36201719 fbshipit-source-id: 47d6967bfea3b7b146b6bbd1572e0457c9365871
-
- 13 May, 2022 1 commit
-
-
Jeremy Reizenstein authored
Summary: Stronger typing for these functions Reviewed By: shapovalov Differential Revision: D36170489 fbshipit-source-id: a2104b29dbbbcfcf91ae1d076cd6b0e3d2030c0b
-
- 09 May, 2022 1 commit
-
-
Roman Shapovalov authored
Summary: To avoid model_zoo, we need to make GenericModel pluggable. I also align creation APIs for convenience. Reviewed By: bottler, davnov134 Differential Revision: D35933093 fbshipit-source-id: 8228926528eb41a795fbfbe32304b8019197e2b1
-
- 06 Apr, 2022 1 commit
-
-
Jeremy Reizenstein authored
Summary: Try again to solve https://github.com/facebookresearch/pytorch3d/issues/1144 pickling problem. D35258561 (https://github.com/facebookresearch/pytorch3d/commit/24260130ce382640a89e978cd671298488ec6451) didn't work. When writing a function or vanilla class C which you want people to be able to call get_default_args on, you must add the line enable_get_default_args(C) to it. This causes autogeneration of a hidden dataclass in the module. Reviewed By: davnov134 Differential Revision: D35364410 fbshipit-source-id: 53f6e6fff43e7142ae18ca3b06de7d0c849ef965
-
- 31 Mar, 2022 1 commit
-
-
Jeremy Reizenstein authored
Summary: ListConfig and DictConfig members of get_default_args(X) when X is a callable will contain references to a temporary dataclass and therefore be unpicklable. Avoid this in a few cases. Fixes https://github.com/facebookresearch/pytorch3d/issues/1144 Reviewed By: shapovalov Differential Revision: D35258561 fbshipit-source-id: e52186825f52accee9a899e466967a4ff71b3d25
-
- 25 Mar, 2022 1 commit
-
-
Roman Shapovalov authored
Summary: Before the fix, running get_default_args(C: Callable) returns an unstructured DictConfig which causes Enums to be handled incorrectly. This is a fix. WIP update: Currently tests still fail whenever a function signature contains an untyped argument: This needs to be somehow fixed. Reviewed By: bottler Differential Revision: D34932124 fbshipit-source-id: ecdc45c738633cfea5caa7480ba4f790ece931e8
-
- 21 Mar, 2022 1 commit
-
-
Jeremy Reizenstein authored
Co-authored-by:Jeremy Francis Reizenstein <bottler@users.noreply.github.com>
-