consolidate deterministic settings

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/644

This diff consolidates the deterministic settings in D2Go.

In `d2go/runner/default_runner.py`, `torch.set_float32_matmul_precision("highest")` is added to the `cfg.SOLVER.DETERMINISTIC` branch so that float32 matrix multiplications run in full float32 precision. In `d2go/setup.py`, the deterministic branch of `setup_after_launch` now also sets the float32 precision to "highest" and sets `torch.backends.cuda.matmul.allow_tf32` and `torch.backends.cudnn.allow_tf32` to `False`, so that TF32 is not used for matrix multiplications or cuDNN operations. The existing settings (`torch.backends.cudnn.deterministic = True`, `torch.backends.cudnn.benchmark = False`, `torch.use_deterministic_algorithms(True)`) stay in place to avoid nondeterministic PyTorch and CUDA algorithms during training. Additionally, the torchtnt `seed` function is called with `cfg.SEED` for reproducibility, and `torchtnt` is added to the requirements in the top-level `setup.py`.

Reviewed By: wat3rBro

Differential Revision: D51796739

fbshipit-source-id: 50e44ea50b0311b56a885db9f633491ac3002bd4
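To try the settings named in the summary outside D2Go, here is a minimal sketch that uses only PyTorch calls. The helper name `enable_determinism` is hypothetical, and `torch.manual_seed` stands in for the torchtnt `seed` helper used by this diff, so this is an approximation under those assumptions rather than the D2Go implementation.

```python
import os

import torch


def enable_determinism(seed_value: int) -> None:
    """Hypothetical helper that mirrors the settings named in the summary."""
    # cuBLAS needs a fixed workspace configuration for reproducible results.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

    # Keep float32 matmuls and cuDNN operations away from TF32.
    torch.set_float32_matmul_precision("highest")
    torch.backends.cuda.matmul.allow_tf32 = False
    torch.backends.cudnn.allow_tf32 = False

    # Assumption: plain torch.manual_seed replaces torchtnt's seed() here.
    torch.manual_seed(seed_value)

    # Ask PyTorch and cuDNN for deterministic algorithm choices.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    torch.use_deterministic_algorithms(True)


# Re-seeding with the same value should reproduce the same random draw.
enable_determinism(0)
first = torch.rand(3)
enable_determinism(0)
second = torch.rand(3)
print(torch.equal(first, second))
```

With `torch.use_deterministic_algorithms(True)`, operations that have no deterministic implementation raise an error instead of silently running nondeterministically, so a sketch like this is best tried on a small model first.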

Commit 573bd454, authored Jan 11, 2024 by Kapil Krishnakumar; committed Jan 11, 2024 by Facebook GitHub Bot

Showing 3 changed files with 19 additions and 1 deletion

d2go/runner/default_runner.py +1 -0
d2go/setup.py +17 -1
setup.py +1 -0

--- a/d2go/runner/default_runner.py
+++ b/d2go/runner/default_runner.py
@@ -655,6 +655,7 @@ class Detectron2GoRunner(D2GoDataAPIMixIn, BaseRunner):
                torch.backends.cuda.matmul.allow_tf32 = True
                torch.backends.cudnn.allow_tf32 = True
            elif cfg.SOLVER.DETERMINISTIC:
+                torch.set_float32_matmul_precision("highest")
                torch.backends.cuda.matmul.allow_tf32 = False
                torch.backends.cudnn.allow_tf32 = False


--- a/d2go/setup.py
+++ b/d2go/setup.py
@@ -37,6 +37,12 @@ from detectron2.utils.logger import setup_logger as _setup_logger
 from detectron2.utils.serialize import PicklableWrapper
 from mobile_cv.common.misc.py import FolderLock, MultiprocessingPdb, post_mortem_if_fail

+# @manual=//torchtnt/utils:device
+from torchtnt.utils.device import set_float32_precision
+
+# @manual=//torchtnt/utils:env
+from torchtnt.utils.env import seed
+
 logger = logging.getLogger(__name__)

 _RT = TypeVar("_RT")
@@ -324,13 +330,23 @@ def setup_after_launch(
    # avoid random pytorch and CUDA algorithms during the training
    if cfg.SOLVER.DETERMINISTIC:
        logging.warning("Using deterministic training for the reproducibility")
+
+        # tf32
+        set_float32_precision("highest")
+        torch.backends.cuda.matmul.allow_tf32 = False
+        torch.backends.cudnn.allow_tf32 = False
+
+        # seed
+        seed(cfg.SEED, deterministic=2)
+
+        # pytorch deterministic
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
        torch.use_deterministic_algorithms(True)
        # reference: https://docs.nvidia.com/cuda/cublas/index.html#cublasApi_reproducibility
        os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

-    if cfg.SEED > 0:
+    elif cfg.SEED > 0:
        seed_all_rng(cfg.SEED)

    return runner

--- a/setup.py
+++ b/setup.py
@@ -34,6 +34,7 @@ requirements = [
    # https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
    # https://github.com/protocolbuffers/protobuf/issues/10051
    "protobuf==3.20.2",
+    "torchtnt",
 ]
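
One behavioral consequence of changing `if cfg.SEED > 0` to `elif cfg.SEED > 0` in `setup_after_launch` is that `seed_all_rng` no longer runs when `cfg.SOLVER.DETERMINISTIC` is set; seeding on that path goes through `seed(cfg.SEED, deterministic=2)` alone. The pure-Python sketch below models only that branch selection. The function names and the returned labels are hypothetical and exist for illustration; no real seeding happens.

```python
from typing import List


def seeding_calls_before(deterministic: bool, seed_value: int) -> List[str]:
    """Which seeding helpers ran before this diff (labels only)."""
    calls: List[str] = []
    # The deterministic branch did not seed anything itself;
    # a separate `if` handled seeding for both paths.
    if seed_value > 0:
        calls.append("seed_all_rng")
    return calls


def seeding_calls_after(deterministic: bool, seed_value: int) -> List[str]:
    """Which seeding helpers run after this diff (labels only)."""
    calls: List[str] = []
    if deterministic:
        # seed() is called regardless of the sign of seed_value.
        calls.append("seed")
    elif seed_value > 0:
        calls.append("seed_all_rng")
    return calls


print(seeding_calls_before(True, 42))
print(seeding_calls_after(True, 42))
print(seeding_calls_after(False, 42))
print(seeding_calls_after(False, -1))
```

In this model, a deterministic run with a positive seed switches from `seed_all_rng` to `seed`, while non-deterministic runs behave as before.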