Commit 6c1682f9 authored by Francisc Bungiu, committed by Facebook GitHub Bot

Allow to use TensorFloat32

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/403

`cfg.SOLVER.AMP.ENABLED` enables mixed precision, but this only works on V100 GPUs.
For A100s, the equivalent is to enable TF32.
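As a usage sketch (the runner import and `get_default_cfg()` call follow the usual d2go entry points and are assumptions for illustration; only `cfg.SOLVER.AMP.ENABLED` comes from this diff), enabling AMP in the config is enough to pick up TF32 on A100s after this change:

```python
from d2go.runner import Detectron2GoRunner  # runner class modified in this diff

runner = Detectron2GoRunner()
cfg = runner.get_default_cfg()

# With this change, enabling AMP also allows TF32 matmul/cuDNN kernels whenever
# CUDA is available; on pre-Ampere GPUs these flags are harmless no-ops.
cfg.SOLVER.AMP.ENABLED = True
```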

Reviewed By: tglik

Differential Revision: D40675242

fbshipit-source-id: 5cc3d12cd3d7ec76665e0907ecc87fc5f64d73f0
parent b94b23ee
@@ -467,6 +467,12 @@ class Detectron2GoRunner(BaseRunner):
         trainer = (AMPTrainer if cfg.SOLVER.AMP.ENABLED else SimpleTrainer)(
             _get_model_with_abnormal_checker(model), data_loader, optimizer
         )
+        if cfg.SOLVER.AMP.ENABLED and torch.cuda.is_available():
+            # Allow to use the TensorFloat32 (TF32) tensor cores, available on A100 GPUs.
+            # For more details https://pytorch.org/docs/stable/notes/cuda.html#tf32-on-ampere.
+            torch.backends.cuda.matmul.allow_tf32 = True
+            torch.backends.cudnn.allow_tf32 = True
         trainer_hooks = self._get_trainer_hooks(
             cfg, model, optimizer, scheduler, periodic_checkpointer, trainer
         )