loader_cores and compute_cores arguments (list of CPU cores) can be passed to *enable_cpu_affinity* for more
control over which cores should be used, e.g. in case a workload scales well across multiple NUMA nodes.
Depending on the dataset, model and CPU optimal number of dataloader workers and OpemMP threads may vary
but close to the general default advise presented above [#f4]_ .
Usage:
.. code:: python
dataloader = dgl.dataloading.DataLoader(...)
...
with dataloader.enable_cpu_affinity():
<training loop or inferencing>
**Manual control**
For advanced and more fine-grained control over OpenMP settings please refer to Maximize Performance of Intel® Optimization for PyTorch* on CPU [#f4]_ article