- 30 Dec, 2021 1 commit
Yifan Xiong authored
__Description__

Cherry-pick bug fixes from v0.4.0 to main.

__Major Revisions__

* Bug - Fix issues for Ansible and benchmarks (#267)
* Tests - Refine test cases for microbenchmark (#268)
* Bug - Build openmpi with ucx support in rocm dockerfiles (#269)
* Benchmarks: Fix Bug - Fix fio build issue (#272)
* Docs - Unify metric and add doc for cublas and cudnn functions (#271)
* Monitor: Revision - Add 'monitor/' prefix to monitor metrics in result summary (#274)
* Bug - Fix bug of detecting if gpu_index is none (#275)
* Bug - Fix bugs in data diagnosis (#273)
* Bug - Fix issue that the root mpi rank may not be the first in the hostfile (#270)
* Benchmarks: Configuration - Update inference and network benchmarks in configs (#276)
* Docs - Upgrade version and release note (#277)

Co-authored-by: Yuting Jiang <v-yutjiang@microsoft.com>
- 02 Sep, 2021 1 commit
Yifan Xiong authored
__Description__

Fix an inventory bug in ansible_runner that occurs when the host list contains multiple hosts. This should be handled by the ansible_runner library itself; as a workaround, pass the hosts through the `--inventory` argument on the command line.
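A minimal sketch of how such a workaround might look, assuming the hosts are passed to Ansible as an inline comma-separated list. The helper name `inventory_args` is hypothetical and is not taken from this commit; only the Ansible `--inventory` option itself comes from the message above.

```python
from typing import List


def inventory_args(hosts: List[str]) -> List[str]:
    """Build command-line arguments for an inline Ansible inventory.

    Hypothetical helper for illustration only. The trailing comma makes
    Ansible treat the value as a host list rather than an inventory file
    path, which also covers the single-host case.
    """
    return ['--inventory', ','.join(hosts) + ',']
```

For example, `inventory_args(['node0', 'node1'])` yields the two arguments `--inventory` and `node0,node1,`, which can be appended to the rest of the command line.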
- 19 Aug, 2021 1 commit
Yifan Xiong authored
Support mpi mode in runner:

* concatenate the mpirun command
* support mca and env config
* prepare hostfile and update Ansible host pattern

Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>
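The command concatenation described in this commit could be sketched roughly as follows. This is an illustration under assumptions, not the runner's actual code: the function name `build_mpirun_command` and its parameters are hypothetical, and the flags used (`-np`, `-hostfile`, `-mca`, `-x`) are the standard Open MPI `mpirun` options.

```python
from typing import Dict, List


def build_mpirun_command(
    hostfile: str,
    num_procs: int,
    mca: Dict[str, str],
    env: Dict[str, str],
    benchmark_cmd: List[str],
) -> List[str]:
    """Concatenate an Open MPI mpirun command line (hypothetical helper)."""
    # Base command: process count and the prepared hostfile.
    cmd = ['mpirun', '-np', str(num_procs), '-hostfile', hostfile]
    # Each MCA parameter becomes "-mca <key> <value>".
    for key, value in mca.items():
        cmd += ['-mca', key, value]
    # Each environment variable is exported to all ranks with "-x NAME=VALUE".
    for name, value in env.items():
        cmd += ['-x', f'{name}={value}']
    # The benchmark command itself goes last.
    return cmd + benchmark_cmd
```

Returning a list of arguments instead of a single string avoids shell-quoting issues if the command is later passed to `subprocess.run`.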
- 23 May, 2021 1 commit
Yifan Xiong authored
Implement ansible client and runner:

* add ansible client
* add deploy and check_env playbooks