- 02 Sep, 2021 1 commit
Yifan Xiong authored
**Description** Resolve the "too many open files" issue when running NCCL/RCCL on multiple nodes using Docker images by increasing the `nofile` limit in `limits.conf`.
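A minimal sketch for checking the open-file limit that this change targets, assuming a Linux container with Python available. It uses only the standard `resource` module; the helper names `nofile_limits` and `raise_soft_nofile_to_hard` are hypothetical and not part of the repository. The actual value written to `limits.conf` by the commit is not stated here, so the sketch only reads the limits and raises the soft limit up to the existing hard limit.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_nofile_to_hard():
    """Try to raise the soft nofile limit to the hard limit.

    An unprivileged process may raise its soft limit up to the hard limit.
    If the kernel rejects the request, the limits are left unchanged.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft != hard:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
        except (ValueError, OSError):
            # Hypothetical fallback: keep the current limits if the change is refused.
            pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print("before:", nofile_limits())
    print("after: ", raise_soft_nofile_to_hard())
```

Running this inside the container before launching a multi-node job gives a quick indication of whether the configured `nofile` limit has taken effect for the process.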
- 01 Sep, 2021 2 commits
- 31 Aug, 2021 1 commit
guoshzhao authored
**Description** Add Dockerfiles `rocm4.0-pytorch1.7.0.dockerfile` and `rocm4.2-pytorch1.7.0.dockerfile` for the `rocm` platform.