Unverified Commit 828d75ba authored by Stas Bekman, committed by GitHub

document deepspeed.initialize() (#644)

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
parent 4e2dc4e4
@@ -31,6 +31,22 @@ construct and manage the training optimizer, data loader, and the learning rate
scheduler based on the parameters passed to `deepspeed.initialize` and the
DeepSpeed [configuration file](#deepspeed-configuration).
If you already have a distributed environment set up, you'd need to replace:
```python
torch.distributed.init_process_group(...)
```
with:
```python
deepspeed.init_distributed()
```
The default is to use the NCCL backend, which DeepSpeed has been thoroughly tested with, but you can also [override the default](https://deepspeed.readthedocs.io/en/latest/initialize.html#distributed-initialization).
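For instance, a minimal sketch of that override (the `dist_backend` keyword and the choice of Gloo here are assumptions; see the linked docs for the supported options):
```python
import deepspeed

# Explicitly initialize the distributed environment before deepspeed.initialize().
# NCCL is the default; passing another backend (e.g. Gloo for CPU-only runs)
# illustrates the override described above.
deepspeed.init_distributed(dist_backend="gloo")
```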
If you don't need the distributed environment set up until after `deepspeed.initialize()`, you don't have to use this function, as DeepSpeed will automatically initialize the distributed environment during `initialize()`. Regardless, you will need to remove the `torch.distributed.init_process_group` call if you already have it in place.
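As a sketch of that second path (the toy model, the `config` keyword, and the `ds_config.json` file name are illustrative assumptions, not part of the original text), you can skip any explicit process-group setup and call `deepspeed.initialize` directly:
```python
import torch
import deepspeed

# A toy model; any torch.nn.Module works here.
net = torch.nn.Linear(10, 2)

# No torch.distributed.init_process_group() and no deepspeed.init_distributed()
# call is needed here: deepspeed.initialize() sets up the distributed
# environment itself if it has not been initialized yet.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=net,
    model_parameters=net.parameters(),
    config="ds_config.json",  # assumed path to a DeepSpeed configuration file
)
```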
### Training