- 26 May, 2023 1 commit
-
Ajinkya Deogade authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/549 `iterate_module_named_parameters` is used by both the `optimizer` and `quantization` modules. Let's move it to a shared location, `utils`, to break the circular dependencies for the following diffs in the stack. Reviewed By: tglik Differential Revision: D45912066 fbshipit-source-id: bce5c5db3bbc1866f4da8662f7bd5908bfe30aad
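For context, a hedged sketch of what a helper like `iterate_module_named_parameters` typically looks like — the actual d2go signature and yielded tuple may differ:

```python
from typing import Iterator, Tuple

import torch.nn as nn

def iterate_module_named_parameters(
    model: nn.Module, check_requires_grad: bool = True
) -> Iterator[Tuple[nn.Module, str, str, nn.Parameter]]:
    """Yield (module, module_name, param_name, param), visiting each
    parameter exactly once by iterating with recurse=False per module."""
    seen_ids = set()
    for module_name, module in model.named_modules():
        for param_name, param in module.named_parameters(recurse=False):
            if id(param) in seen_ids:
                continue
            seen_ids.add(id(param))
            if check_requires_grad and not param.requires_grad:
                continue
            yield module, module_name, param_name, param
```

Keeping such a helper in `utils` lets both the optimizer and quantization modules import it without importing each other.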
-
- 25 May, 2023 4 commits
-
Jiaxu Zhu authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/548 As the title says, by setting

```
SOLVER.DETERMINISTIC = True
SEED = 42  # or other values
```

training results become reproducible. Reviewed By: wat3rBro, rkaarimi Differential Revision: D46174626 fbshipit-source-id: d6665b777376a176bd46a1286c3199ed0da26ae6
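For reference, a hedged sketch of what a deterministic-mode switch like this typically does under the hood in PyTorch (the exact d2go wiring may differ):

```python
import os
import random

import numpy as np
import torch

def set_deterministic(seed: int = 42) -> None:
    # Seed every RNG that can influence training.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Force deterministic kernels; ops without a deterministic
    # implementation raise instead of silently diverging.
    torch.use_deterministic_algorithms(True)
    torch.backends.cudnn.benchmark = False
    # Required by cuBLAS for deterministic matmuls on CUDA >= 10.2.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
```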
-
Ajinkya Deogade authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/546 Here we start modularizing the targets. I had to introduce some temporary hacks to break the circular dependency while keeping the diff atomic. There are some TODOs left at the end of the stack that are still WIP. Reviewed By: tglik Differential Revision: D45912076 fbshipit-source-id: 375f579fe749dd4a588908cdca7b76ba68f1048f
-
Ajinkya Deogade authored
Summary: There is an issue with the relative import in the `__init__` file of modeldef that causes tests on GitHub CI to fail. Specifically, the `FBNetV2ModelArch` is not correctly populated. The internal CI does not detect such failures because we use the buck build system. This diff fixes it. Pull Request resolved: https://github.com/facebookresearch/d2go/pull/547 Reviewed By: patricksnape Differential Revision: D46177424 fbshipit-source-id: 06b23b9b221c990cd15a2debff6def8cfb99743b
-
Anthony Chen authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/544 The previous memory profiler diff, D45673764, didn't pick up a config key name change, causing an attribute-not-found error. This diff fixes it and adds two unit tests (one with GPU, one without) for using the memory profiler in the runner. Reviewed By: wat3rBro Differential Revision: D46114730 fbshipit-source-id: d066d435021983d90f4a75e0c88798a3aedcaf92
-
- 24 May, 2023 1 commit
-
Ajinkya Deogade authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/545 Expanding the relative imports to absolute ones helps the autodeps down the stack. Reviewed By: tglik Differential Revision: D45912074 fbshipit-source-id: d42c9756dde731504ee6fd0f93cf549d71157489
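An illustrative before/after (the module path here is hypothetical):

```python
# Before: relative import, which tooling like autodeps resolves less easily:
#   from .modeling import build_model
# After: absolute import spelling out the full package path:
from d2go.modeling import build_model
```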
-
- 22 May, 2023 1 commit
-
Anthony Chen authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/542

## Overview

Add an option to enable the GPU memory snapshot profiler in d2go. The profiler is natively supported by PyTorch and records stack traces for all CUDA memory allocation/free events, allowing users to understand which parts of the code contribute to the memory bottleneck. It also provides a powerful interactive web tool to visualize memory utilization over time: {F978609840} Each colored block represents an allocated CUDA memory block. Users can click on a block to see the Python stack trace that allocated it.

## d2go integration

This diff integrates the profiler as a hook controlled by the config key `USE_MEMORY_PROFILER`. The profiler logs snapshots and the web tool to the output directory. Logging can happen in three places: at the start of training, during training, and on OOM. Please read the docstring of `D2GoGpuMemorySnapshot` for more information. Reviewed By: tglik, jaconey Differential Revision: D45673764 fbshipit-source-id: 8900484a2266d94421fe3ee7a85a4dea3a9f6b72
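A hedged sketch of driving PyTorch's underlying snapshot API directly — the d2go hook wraps something along these lines; `_record_memory_history`/`_dump_snapshot` are semi-private and their signatures vary across PyTorch versions:

```python
import torch

# Start recording stack traces for all CUDA allocation/free events.
torch.cuda.memory._record_memory_history(max_entries=100_000)

model = torch.nn.Linear(1024, 1024).cuda()
out = model(torch.randn(64, 1024, device="cuda"))
out.sum().backward()

# Dump a snapshot; load it in PyTorch's interactive memory_viz web tool
# to inspect the per-block stack traces described above.
torch.cuda.memory._dump_snapshot("memory_snapshot.pickle")
torch.cuda.memory._record_memory_history(enabled=None)  # stop recording
```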
-
- 19 May, 2023 1 commit
-
Yanghan Wang authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/543 The previous implementation:

> the problem is the ContextDecorator somehow swallows the exception in the wrapped function and just returns None.

This diff adds a test that the previous implementation would fail:

```
======================================================================
FAIL: test_log_interval_error_prop (d2go.tests.fb.test_utils_logging.TestUtilsLogging)
Make sure the log_interval can handle error propagation.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/sandcastle/boxes/fbsource/buck-out/v2/gen/fbcode/ef4169ac7f95fb74/mobile-vision/d2go/tests/__init_tests__/init_tests#link-tree/d2go/tests/fb/test_utils_logging.py", line 152, in test_log_interval_error_prop
    foo(-1)
AssertionError: ValueError not raised
----------------------------------------------------------------------
Ran 1 test in 0.098s
```

The new version is easier to understand and doesn't swallow errors. Reviewed By: jaconey Differential Revision: D46009938 fbshipit-source-id: 6b632deb513ab47c4d760f796bf49fc45eae3005
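For context, a minimal illustration of the failure mode (class and function names hypothetical): any context manager whose `__exit__` returns a truthy value suppresses exceptions raised in the body, so a decorated function silently returns None:

```python
import time
from contextlib import ContextDecorator

class swallowing_log_interval(ContextDecorator):
    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        print(f"took {time.perf_counter() - self.start:.3f}s")
        return True  # bug: truthy return suppresses any in-flight exception

@swallowing_log_interval()
def foo(x):
    if x < 0:
        raise ValueError("negative input")
    return x

print(foo(-1))  # prints None instead of raising ValueError
```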
-
- 18 May, 2023 1 commit
-
Jiaxu Zhu authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/541 Issue post: https://fb.workplace.com/groups/277527419809135/permalink/1303604910534709/ The fix was suggested by the MV folks. Reviewed By: dilinwang820, wat3rBro Differential Revision: D45881863 fbshipit-source-id: b33345c4230067b78f27e7deb038c095d55f1360
-
- 16 May, 2023 1 commit
-
Jiaxu Zhu authored
Summary: X-link: https://github.com/facebookresearch/detectron2/pull/4955 Pull Request resolved: https://github.com/facebookresearch/d2go/pull/540 Allow users to launch deterministic training jobs. That is, using the same training config, users can get identical training results. Reviewed By: dilinwang820 Differential Revision: D45370627 fbshipit-source-id: 88db388c992500b0d789b8341952502cd1f8f995
-
- 12 May, 2023 1 commit
-
Jack Zhang authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/538 We want a `log_interval` helper to measure the execution time of a function. Reviewed By: wat3rBro Differential Revision: D45751279 fbshipit-source-id: fe25d3fedd32f61b64e978881b6547d3bc1acb22
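A minimal sketch of such a helper, assuming a context-manager/decorator design (the actual d2go implementation may differ); the `try/finally` with no exception suppression is what the later fix in D46009938 guarantees:

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger(__name__)

@contextmanager
def log_interval(name: str):
    """Log how long the wrapped block (or decorated function) takes."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # No exception handling here, so errors propagate to the caller.
        logger.info("%s took %.3fs", name, time.perf_counter() - start)

@log_interval("foo")  # @contextmanager objects also work as decorators
def foo(x):
    return x * 2
```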
-
- 10 May, 2023 1 commit
-
Mik Vyatskov authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/537 For some reason numba cannot handle `print` being overwritten by a local variable; however, when the override is a module attribute, it seems to work. Reviewed By: navsud Differential Revision: D45730776 fbshipit-source-id: fee1288b1adb43f69fe7c4e43f4a8a750f0b98b4
-
- 08 May, 2023 1 commit
-
Jiaxu Zhu authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/531 As the title says, enable mixed-precision FX quantization for the FBS model. This diff:

1. Adds `custom_prepare_fx` to the FBS d2go model to enable FX quantization.
2. Adds two new d2go config params, `QUANTIZATION.ACT_BITS`/`QUANTIZATION.WEIGHTS`.
3. Adds `backend_config`/`qconfig_mapping` to d2go convert function calls.
4. Adds an example FBS FX QAT config.

Reviewed By: ayushidalmia Differential Revision: D45252545 fbshipit-source-id: 813b192fcdd66c17629490b8908ce8cd8534506a
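A hedged sketch of FX-graph-mode QAT prepare/convert in PyTorch, which hooks like `custom_prepare_fx` typically wrap (the toy model and backend choice are illustrative):

```python
import torch
from torch.ao.quantization import get_default_qat_qconfig_mapping
from torch.ao.quantization.quantize_fx import convert_fx, prepare_qat_fx

model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).train()
qconfig_mapping = get_default_qat_qconfig_mapping("qnnpack")
example_inputs = (torch.randn(1, 3, 32, 32),)

# Insert fake-quant observers into the traced graph for QAT.
prepared = prepare_qat_fx(model, qconfig_mapping, example_inputs)
# ... run the QAT training loop on `prepared` ...
# A custom backend_config/qconfig_mapping can be passed to control
# per-op quantization (e.g. activation/weight bit widths).
quantized = convert_fx(prepared)
```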
-
- 07 May, 2023 1 commit
-
John Lee authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/536 This diff instruments checkpointing with signposts for FSDPCheckpointer, using D44278485 as a reference. Reviewed By: miqueljubert Differential Revision: D45524792 fbshipit-source-id: 9b7e004e6853141ee26d65ae11f79b1f5f5db0e6
-
- 02 May, 2023 1 commit
-
Anthony Chen authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/535 Use `FSDP.STATE_DICT_TYPE = SHARDED_STATE_DICT` for FSDP checkpointing by default. `FSDP.USE_LOCAL_STATE_DICT` will be deprecated in the future.

# Note

After this change, config usage of `FSDP.USE_LOCAL_STATE_DICT` is no longer picked up by the code: it is superseded by the default value of `FSDP.STATE_DICT_TYPE`.

Reviewed By: tglik Differential Revision: D45413143 fbshipit-source-id: e7bc2d5dc04ac09004cb89353333be020a9c80b5
-
- 01 May, 2023 3 commits
-
Richard Barnes authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/533 The pattern

```
X.Y if hasattr(X, "Y") else Z
```

can be replaced with

```
getattr(X, "Y", Z)
```

The [getattr](https://www.w3schools.com/python/ref_func_getattr.asp) function gives more succinct code than the [hasattr](https://www.w3schools.com/python/ref_func_hasattr.asp) function. Please use it when appropriate. **This diff is very low risk. Green tests indicate that you can safely Accept & Ship.** Differential Revision: D44886687 fbshipit-source-id: f3f0265251bf8008ae927b767da5749bf6828c2c
-
Zhicheng Yan authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/532 Enable the visualization of panoptic segmentation. Reviewed By: tglik Differential Revision: D45334039 fbshipit-source-id: eebd9316d56d8132a5d3c166058ae18a0e88e928
-
Anthony Chen authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/534 Currently, d2go supports 2 checkpointers, 2 distributed modes, and 3 checkpointing modes. The many options make it hard to maintain and manage all use cases. For example, after the recent migration to FSDP sharded_state_dict, it's hard to understand and trace down usage of the deprecated version. Per crassirostris and wat3rBro's advice, this diff adds API logging to better keep track of checkpointer usage in d2go.

## Appendix

- 2 checkpointers: FSDPCheckpointer, AIInfraCheckpointer
- 2 distributed modes: ddp, fsdp
- 3 checkpointing modes (fsdp only): local_state_dict, sharded_state_dict, full_state_dict

Reviewed By: tglik Differential Revision: D45385021 fbshipit-source-id: 5d2cb115ed0fdada254b819793e376e410ecd97d
-
- 21 Apr, 2023 1 commit
-
Tao Xu authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/527

- Add `model.reset_generation_counter()` to enable the diffusion visualization evaluators to run on multiple test datasets.
- Before this fix, the visualization evaluators would only run on the 1st test dataset: `self.generation_counter` drops below 0 after running on the 1st test dataset, so the visualization evaluators skip all the other test sets.
- Use DDIM for the upsampler by default, for better results.

Reviewed By: zechenghe Differential Revision: D45058672 fbshipit-source-id: 2f7919bf6ecd2e5f6f242ce3e7891cb3dc8d6af4
-
- 20 Apr, 2023 2 commits
-
Anthony Chen authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/530 Add options to include/exclude model buffers and frozen parameters in EMA state via two new config keys `MODEL_EMA.INCLUDE_FROZEN` and `MODEL_EMA.INCLUDE_BUFFER` Reviewed By: tglik Differential Revision: D45129625 fbshipit-source-id: 895ebe7e4f8e15566c3c3bddd852dd98c40a27b1
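A hedged sketch of what such include/exclude options might control when building the EMA state (helper name and exact semantics are illustrative, not d2go's actual implementation):

```python
import torch.nn as nn

def build_ema_state(
    model: nn.Module,
    include_frozen: bool = True,   # cf. MODEL_EMA.INCLUDE_FROZEN
    include_buffers: bool = True,  # cf. MODEL_EMA.INCLUDE_BUFFER
) -> dict:
    state = {}
    for name, param in model.named_parameters():
        if include_frozen or param.requires_grad:
            state[name] = param.detach().clone()
    if include_buffers:
        # Buffers such as BatchNorm running stats.
        for name, buf in model.named_buffers():
            state[name] = buf.detach().clone()
    return state
```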
-
Tsahi Glik authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/529 Set a config param for enabling the async metrics writing added in D44305165. Use it in the LDM Pokemon config as the first use case. Reviewed By: sf-wind Differential Revision: D44335491 fbshipit-source-id: b000502e6ed0e19a10d6fe3a7470bcd3045e7717
-
- 18 Apr, 2023 1 commit
-
Chien-Chin Huang authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/528 Not passing an optimizer object to shard_full_optim_state_dict() is being deprecated. This diff passes the optimizer to shard_full_optim_state_dict(). Reviewed By: YanjunChen329 Differential Revision: D45065185 fbshipit-source-id: 0abec3eeff6e7c626eefc432c73e38779a6f02d9
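A hedged sketch of the API change (wrapped in a function since the model, optimizer, and full optimizer state dict come from a distributed setup not shown here):

```python
from typing import Any, Dict

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def reshard_optim_state(
    full_osd: Dict[str, Any],
    fsdp_model: FSDP,
    optimizer: torch.optim.Optimizer,
) -> Dict[str, Any]:
    # Deprecated: FSDP.shard_full_optim_state_dict(full_osd, fsdp_model)
    # Preferred: pass the optimizer object explicitly.
    return FSDP.shard_full_optim_state_dict(full_osd, fsdp_model, optim=optimizer)
```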
-
- 11 Apr, 2023 2 commits
-
Fei Sun authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/526 Add a config variable, DDP_GRADIENT_AS_BUCKET_VIEW, and pass it to DDP. Enabling it reduces the memory consumption of the model. Reviewed By: tglik Differential Revision: D44273339 fbshipit-source-id: 272e2ffbea89532a55df0ebdb3bd49f0df7d78a5
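For reference, the corresponding PyTorch DDP flag (the wrapping helper is illustrative):

```python
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def wrap_ddp(module: nn.Module, local_rank: int) -> DDP:
    return DDP(
        module,
        device_ids=[local_rank],
        # Make .grad tensors views into the communication buckets,
        # avoiding a second full copy of all gradients.
        gradient_as_bucket_view=True,
    )
```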
-
Fei Sun authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/525 In d2go, pass the argument ZERO_GRAD_BEFORE_FORWARD to the detectron runtime. Reviewed By: tglik Differential Revision: D44267319 fbshipit-source-id: 3bd5874bea96ac381fb49972a2dfe9bb52005a7d
-
- 05 Apr, 2023 2 commits
-
Mik Vyatskov authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/523 To avoid setting it up multiple times, add a run_once() decorator. Additionally, make sure logging is configured for dataloading workers, which have a different entry point, by moving the logging setup to import time. Currently, when a dataloader worker is created with the multiprocessing module's spawn method, a new Python interpreter starts, all modules are imported anew, and the entry point is set to the specified method. This means the training framework's entry point is skipped, together with the logging setup. With this change, logging is configured at import time: even though the training main (train_net) is not invoked as the entry point in a dataloading process, it is still imported in the child process, so logging still gets configured. Reviewed By: miqueljubert Differential Revision: D44641142 fbshipit-source-id: 06ea85363d965b31d7f9ade3c2615ed9db67470b
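A minimal sketch of a `run_once()` decorator and the import-time call pattern described above (the actual d2go helper may differ):

```python
import functools
import logging

def run_once():
    """Make the decorated function a no-op after its first call
    within a given process."""
    def deco(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if not wrapper._has_run:
                wrapper._has_run = True
                return fn(*args, **kwargs)
        wrapper._has_run = False
        return wrapper
    return deco

@run_once()
def initial_logging_setup():
    logging.basicConfig(level=logging.INFO)

# Called at import time: spawned dataloader workers re-import this module
# even though they never run the trainer entry point, so logging is still
# configured in every child process.
initial_logging_setup()
```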
-
Anthony Chen authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/522 Change d2go's default FSDP sharding strategy to grad_optim, which corresponds to ShardingStrategy.SHARD_GRAD_OP in the FSDP API, or ZeRO-2 in the literature. grad_optim has been shown to offer the best tradeoff between memory utilization and training speed for mid-sized models. `FSDP.ALGORITHM = ""` was part of the previous design, indicating that no FSDP is used; it no longer works. Reviewed By: tglik Differential Revision: D44657184 fbshipit-source-id: 3888eea5f2b5042269e69453f3cdd8db7cf1581c
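For reference, a hedged sketch of selecting this strategy with the PyTorch FSDP API directly (distributed initialization omitted):

```python
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, ShardingStrategy

def wrap_fsdp_grad_optim(module: nn.Module) -> FSDP:
    # SHARD_GRAD_OP (ZeRO-2): shard gradients and optimizer state across
    # ranks, but keep full parameters on each rank between forward/backward.
    return FSDP(module, sharding_strategy=ShardingStrategy.SHARD_GRAD_OP)
```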
-
- 03 Apr, 2023 1 commit
-
Grisha Temchenko authored
Summary: Correction in docs. Related issue: https://github.com/facebookresearch/d2go/issues/514 Pull Request resolved: https://github.com/facebookresearch/d2go/pull/515 Reviewed By: crassirostris Differential Revision: D44546569 fbshipit-source-id: fec3797bad15b55833d9278c19978ff9c312d963
-
- 31 Mar, 2023 2 commits
-
Mik Vyatskov authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/521 Further along in the setup, D2Go loggers will have their logging level set to DEBUG anyway; setting the level to DEBUG for every process introduces unnecessary logs. Reviewed By: miqueljubert Differential Revision: D44561105 fbshipit-source-id: 536f75bb886aec644207933e9baeb91a862a7ca7
-
Mik Vyatskov authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/510 This change allows the initial logging setup to be configured more granularly, as part of a separate module. Reviewed By: tglik Differential Revision: D44278485 fbshipit-source-id: 2f421ee4e7f9017ef8ebccb9ff51f4177b8628b9
-
- 30 Mar, 2023 4 commits
-
David Yan authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/520

- Move the gather/scatter functions into their own util module
- Onboard AIInfraCheckpointer to the gather/scatter functions for optimizer and EMA state
- Add a test for the FSDP checkpointer and AI Infra checkpointer

Reviewed By: YanjunChen329 Differential Revision: D44400633 fbshipit-source-id: bcfe3e0a4fbf53f91a83e88f74c4538699a50293
-
David Yan authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/519 Prior to this, the FSDP checkpointer did not save an EMA state that matched the model state when the model used a sharded state dict. This diff adds that functionality. Reviewed By: YanjunChen329 Differential Revision: D44270790 fbshipit-source-id: f522765ad56e8279f355c43a19f26c3b6bcf01e3
-
Mircea Cimpoi authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/518 Enable profiling for eval step only, not on every eval (which can be called during training) Reviewed By: frabu6 Differential Revision: D44535915 fbshipit-source-id: 4497a3f74f5d751277df9ed41bc9bf21056341c4
-
Anton Rigner authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/516

# Context

D2go allows training with more than one dataset, and as long as the categories are consistent, the IDs do not necessarily have to correspond to each other between the annotations of two different datasets. The data is still loaded correctly into the data loader, and training works as expected.

# Problem

However, I observed strange mis-labeling issues in the Visualizer for Tensorboard. Originally I thought this was a data/conversion issue, but upon inspecting the logs I saw that the data is loaded correctly. See the example below. {F924075931} "Plant" labelled as "Refrigerator", "Floor" labelled as "Lamp" {F924078113} ... but the loaded annotations don't actually contain any samples of "Refrigerator". The reason is that the Visualizer always loads the metadata (and thus the labels) from the first train dataset, but the order of the categories between the datasets may not be consistent while still being a valid training run.

# Fix

If there is a dataset name associated with the data to visualize, use it to fetch the metadata, and thus the correct labels; otherwise, default to the first dataset (the current behavior).

Reviewed By: wat3rBro Differential Revision: D44495363 Privacy Context Container: L1127277 fbshipit-source-id: 37b940d393aa794cd2f39aabdc66c6d23abd8000
-
- 26 Mar, 2023 1 commit
-
Peizhao Zhang authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/513 Support specifying the backend for the testing helper. Reviewed By: tglik Differential Revision: D44401470 fbshipit-source-id: 9c7962cf40d3c677f9a3c7bfa9cdf5dcecae2ba9
-
- 24 Mar, 2023 2 commits
-
David Yan authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/511 Add tests for sharded_state_dict integration in the AIF Checkpointer. Fix compatibility problems, including:

1. small API errors with flatten_sharded_optim_state_dict
2. deprecating model.use_local_state_dict and model.load_local_state_dict
3. fixing auto conversion for local_state_dict
4. fixing T148056077: add metadata to differentiate between local_state_dict and sharded_state_dict when loading a directory with FSDPCheckpointer

Reviewed By: YanjunChen329 Differential Revision: D44160045 fbshipit-source-id: f607b7076d0e49b9407f9adfbc8ecfe439c3b0c9
-
David Yan authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/512 Currently, when saving and loading checkpoints for FSDP-wrapped modules, we use `StateDictType.LOCAL_STATE_DICT`, where the state_dict becomes essentially a single flat tensor under the `_flat_param` key (or some other layer-specific key for flat weights). This means that:

1. It's impossible to load weights directly from checkpoints, for example in notebooks.
2. Converting from a local to a global checkpoint requires running a special workflow (https://fburl.com/code/6yqa4ldb) that occupies the same number of GPUs as was used during training.

This diff adds an option, `FSDP.STATE_DICT_TYPE`, which allows selecting the type of state dict to save (local, sharded, full). In sharded mode with AIF checkpointing, we get the benefit of loading state dicts locally within minutes, with any number of GPUs, in notebooks and elsewhere. Note: for backwards compatibility, `CFG.FSDP.use_local_state_dict` and `CFG.FSDP.load_local_state_dict` still need to work when the new config parameter (`CFG.FSDP.state_dict_type`) is not set. They are also used to signify that local/sharded state dicts need to be converted to a full state dict when loading. This functionality can be deprecated once everyone migrates to AIF checkpointing with sharded dicts. Reviewed By: YanjunChen329 Differential Revision: D43840887 fbshipit-source-id: d112f7b7ad97ba82fd5bf1da986b95ad7fc61c42
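For reference, a hedged sketch of how the state-dict flavor is selected with the PyTorch FSDP API (distributed setup omitted):

```python
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, StateDictType

def get_sharded_state_dict(model: FSDP) -> dict:
    # Within this context, state_dict() returns sharded tensors that each
    # rank can save and load independently, instead of LOCAL_STATE_DICT's
    # opaque _flat_param blobs or FULL_STATE_DICT's rank-0 gather.
    with FSDP.state_dict_type(model, StateDictType.SHARDED_STATE_DICT):
        return model.state_dict()
```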
-
- 23 Mar, 2023 1 commit
-
Mik Vyatskov authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/509 The print function is used all over the place, and it's not realistic to stop everyone from using print. So this diff attempts to improve the debuggability of code written using prints by redirecting prints to the logging module. Additionally, call the logger setup from `setup_after_launch` to make sure logging settings are applied in every spawned process. Reviewed By: frabu6, wat3rBro Differential Revision: D44280241 fbshipit-source-id: 713400ac2b2edacef3c7a99067cbb1e684c3c5ad
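A hedged sketch of such a print-to-logging redirect (function names illustrative; the later numba-related fix in D45730776 is why the override must live at module scope):

```python
import builtins
import logging

logger = logging.getLogger("print")
_builtin_print = builtins.print

def _print_to_logger(*args, sep=" ", **kwargs):
    # Route print() output through logging so it picks up timestamps,
    # process/rank info, and whatever handlers are configured.
    logger.info(sep.join(str(a) for a in args))

def redirect_print_to_logging() -> None:
    builtins.print = _print_to_logger

def restore_print() -> None:
    builtins.print = _builtin_print
```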
-
- 22 Mar, 2023 2 commits
-
Mircea Cimpoi authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/508 Avoid unnecessary restriction to base class Trainer. Subclasses of `SimpleTrainer` would work as well. Reviewed By: wat3rBro Differential Revision: D44221069 fbshipit-source-id: a666977b2073b4525b4c6940c121f6b05466e5d7
-
Yanghan Wang authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/507 Reviewed By: crassirostris Differential Revision: D44269996 fbshipit-source-id: 91b313aeb820ec39e60c29c4c1bd9e669e1f7a6b
-
- 21 Mar, 2023 1 commit
-
Denis Savenkov authored
Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/505 Fused optimizers can only run on CUDA, so this diff makes the changes necessary to enable remote execution for GPU tests, following: https://www.internalfb.com/intern/wiki/Pytorch_Ecosystem_Foundation_(EcoF)/PyTorch_Training/PyTorch_Lightning/Getting_Started/Testing/Adding_GPU_Unit_tests_using_RE/ Reviewed By: ertrue Differential Revision: D44113380 fbshipit-source-id: 34a06813a894f4de6e5731f78ef7f2cf11f18a06
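For context, a short example of requesting a fused optimizer, which at the time required parameters on a CUDA device — hence the need for GPU unit tests:

```python
import torch

model = torch.nn.Linear(16, 16).cuda()
# fused=True runs the parameter update as a single fused CUDA kernel;
# on the PyTorch versions contemporary with this diff, constructing it
# with CPU parameters raises an error.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, fused=True)
```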
-