Update getting_started.rst (#1188)
Summary: Fixes a minor mistake in the docs: the `--distributed-no-spawn` argument is needed for distributed training on multiple machines without SLURM. Otherwise, with `--nproc_per_node=8`, each launched process will itself spawn one job per GPU, resulting in 8 jobs on each GPU.
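For illustration, a multi-node launch without SLURM might look like the sketch below. The node count, rank, master address/port, dataset path, and architecture are placeholders, not values from this change; the key point is passing `--distributed-no-spawn` so each of the 8 launched processes drives exactly one GPU instead of spawning its own workers.

```shell
# Run on node 0 of 2; on node 1, set --node_rank=1.
# Host, port, data path, and --arch are illustrative placeholders.
python -m torch.distributed.launch --nproc_per_node=8 \
    --nnodes=2 --node_rank=0 \
    --master_addr="192.168.1.1" --master_port=12345 \
    $(which fairseq-train) data-bin/wmt16_en_de_bpe32k \
    --arch transformer_vaswani_wmt_en_de_big \
    --distributed-no-spawn
```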
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1188
Differential Revision: D17627778
Pulled By: myleott
fbshipit-source-id: 35ab6b650dc1132d7cb2d150e80d2ebf0caf3e69