Commit 07ddd262 authored by Fei Sun, committed by Facebook GitHub Bot

Add NUMA binding

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/472

Add NUMA binding to d2go. It distributes the GPUs evenly across the CPU sockets so that CPU traffic and GPU-to-CPU traffic are balanced. It helps diffusion model training, but it is a general technique that can be applied to all models. We still want to enable it manually in each case until we are confident that it improves performance, at which point we can make it the default.

NUMA binding is based on jspark1105's work D42827082. Full credit goes to him.

This diff does not enable the feature.
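For reference, a minimal sketch of how the flag could be turned on manually. The runner and config entry points shown here are assumptions for illustration; only the NUMA_BINDING key itself comes from this diff.

    # Hypothetical opt-in; NUMA_BINDING defaults to False after this diff.
    from d2go.runner import Detectron2GoRunner

    runner = Detectron2GoRunner()
    cfg = runner.get_default_cfg()  # assumed entry point for the default config
    cfg.NUMA_BINDING = True         # enable NUMA binding explicitly for this run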

Reviewed By: newstzpz

Differential Revision: D43036817

fbshipit-source-id: fe67fd656ed3980f04bc81909cae7ba2527346fd
parent 8bb24bb0
@@ -108,6 +108,9 @@ def _add_detectron2go_runner_default_cfg(_C: CN) -> None:
    # Add FB specific configs
    _add_detectron2go_runner_default_fb_cfg(_C)

    # Specify whether to perform NUMA binding
    _C.NUMA_BINDING = False


def _add_rcnn_default_config(_C: CN) -> None:
    _C.EXPORT_CAFFE2 = CN()
@@ -475,6 +475,17 @@ class Detectron2GoRunner(D2GoDataAPIMixIn, BaseRunner):
        # if a model has input-dependent logic
        attach_profilers(cfg, model)

        if cfg.NUMA_BINDING is True:
            import numa

            num_gpus_per_node = comm.get_local_size()
            num_sockets = numa.get_max_node() + 1
            socket_id = torch.cuda.current_device() // (
                max(num_gpus_per_node // num_sockets, 1)
            )
            node_mask = set([socket_id])
            numa.bind(node_mask)

        optimizer = self.build_optimizer(cfg, model)
        scheduler = self.build_lr_scheduler(cfg, optimizer)
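As a sanity check of the GPU-to-socket mapping above, here is a standalone sketch with made-up counts (8 GPUs per node, 2 NUMA sockets); the arithmetic mirrors the diff:

    # Each GPU rank is assigned to the socket that "owns" its slice of GPUs.
    num_gpus_per_node = 8
    num_sockets = 2
    gpus_per_socket = max(num_gpus_per_node // num_sockets, 1)  # 4

    for gpu_id in range(num_gpus_per_node):
        socket_id = gpu_id // gpus_per_socket
        print(f"GPU {gpu_id} -> socket {socket_id}")
    # GPUs 0-3 map to socket 0, GPUs 4-7 map to socket 1.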