Unverified Commit 3cdaea47 authored by Wang, Yi, committed by GitHub

update doc for perf_train_cpu_many, add intel mpi introduction (#18576)



* update doc for perf_train_cpu_many, add mpi introduction
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* Update docs/source/en/perf_train_cpu_many.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/perf_train_cpu_many.mdx
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
parent 46d09410
@@ -36,8 +36,22 @@ pip install oneccl_bind_pt=={pytorch_version} -f https://software.intel.com/ipex
```
where `{pytorch_version}` should be your PyTorch version, for instance 1.12.0.
Check more approaches for [oneccl_bind_pt installation](https://github.com/intel/torch-ccl).
Versions of oneCCL and PyTorch must match.
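For instance, a minimal sketch of matching the two versions (assuming PyTorch 1.12.0 is installed; the find-links URL is the one from the `pip install` line shown above):
```
# Print the installed PyTorch version, then install the matching oneccl_bind_pt build.
python -c "import torch; print(torch.__version__)"  # e.g. 1.12.0
pip install oneccl_bind_pt==1.12.0 -f https://software.intel.com/ipex
```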

## Intel® MPI library
Use this standards-based MPI implementation to deliver flexible, efficient, scalable cluster messaging on Intel® architecture. This component is part of the Intel® oneAPI HPC Toolkit.
It can be installed via [MPI](https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html#mpi).
Please set up the environment with the following command before using it.
```
source /opt/intel/oneapi/setvars.sh
```
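A quick sanity check after sourcing the script (a minimal sketch; the exact output depends on the installed toolkit version):
```
which mpirun      # after sourcing setvars.sh, this should point into the oneAPI installation
mpirun --version  # should report the Intel(R) MPI Library version
```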
The following "Usage in Trainer" section takes `mpirun` from the Intel® MPI library as an example.
## Usage in Trainer
To enable multi-CPU distributed training in the Trainer with the ccl backend, users should add **`--xpu_backend ccl`** in the command arguments.
Let's see an example with the [question-answering example](https://github.com/huggingface/transformers/tree/main/examples/pytorch/question-answering).
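A hedged sketch of such a launch on a single node is shown below; `run_qa.py` is the script from that example, the hyperparameters are illustrative placeholders, and only `--xpu_backend ccl` is specific to the ccl backend, so adjust everything else to your setup:
```
export MASTER_ADDR=127.0.0.1   # rendezvous address for torch.distributed (single node here)
mpirun -n 2 python3 run_qa.py \
  --model_name_or_path bert-large-uncased \
  --dataset_name squad \
  --do_train \
  --do_eval \
  --output_dir /tmp/squad_output \
  --no_cuda \
  --xpu_backend ccl
```
Here `mpirun -n 2` launches two training processes on the local host; for multiple nodes, a hostfile would be passed to `mpirun` instead.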
......