"git@developer.sourcefind.cn:renzhc/diffusers_dcu.git" did not exist on "11d22e0e809d1219a067ded8a18f7b0129fc58c7"
Unverified commit 11a426ac authored by Shaden Smith, committed by GitHub

Pointing docs to hosted HTML files for core API. (#41)

parent 246a2844
@@ -215,7 +215,7 @@ pre-defined learning rate schedule:
* **Gradient Averaging**: in distributed data parallel training, `backward`
ensures that gradients are averaged across data parallel processes after
training on a `train_batch_size`.
* **Loss Scaling**: in FP16/mixed precision training, the DeepSpeed
engine automatically handles scaling the loss to avoid precision loss in the
gradients. A short training-loop sketch showing both behaviors follows this list.
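To make the engine's role concrete, here is a minimal training-loop sketch; `model`, `args`, and `data_loader` are placeholder names not defined in this excerpt. Gradient averaging and loss scaling both happen inside `backward`:

```python
import deepspeed

# Wrap the client model; the engine owns the optimizer and LR schedule.
model_engine, optimizer, _, _ = deepspeed.initialize(
    args=args, model=model, model_parameters=model.parameters())

for step, batch in enumerate(data_loader):
    loss = model_engine(batch)    # forward pass on this rank's shard
    model_engine.backward(loss)   # scales the loss (FP16) and averages gradients
    model_engine.step()           # optimizer step + learning rate schedule
```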
@@ -274,7 +274,7 @@ the `step` value is stored as part of the `client_sd`.
DeepSpeed features can be enabled, disabled, or configured using a config JSON
file that should be specified as `args.deepspeed_config`. A minimal sample config
file is shown below. For a full set of features see the [core API
doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html).
```json
{
  "train_batch_size": 8,
  "fp16": { "enabled": true }
}
```
@@ -363,11 +363,12 @@ deepspeed --include="worker-2:0,1" \
## Further Reading

| Article                                                                                        | Description                                 |
| ---------------------------------------------------------------------------------------------- | -------------------------------------------- |
| [DeepSpeed Features](./docs/features.md)                                                       | DeepSpeed features                          |
| [CIFAR-10 Tutorial](./docs/tutorials/CIFAR-10.md)                                              | Getting started with CIFAR-10 and DeepSpeed |
| [Megatron-LM Tutorial](./docs/tutorials/MegatronGPT2Tutorial.md)                               | Train GPT2 with DeepSpeed and Megatron-LM   |
| [API Documentation](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html)  | Generated DeepSpeed API documentation       |
@@ -124,19 +124,19 @@ The DeepSpeed core API consists of just a handful of methods:
* checkpointing: `load_checkpoint` and `store_checkpoint`

DeepSpeed supports all the features described in this document through these APIs,
along with a `deepspeed_config` JSON file for enabling and disabling them.
Please see the [core API doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html) for more details.
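One way the config wiring typically looks is sketched below; this assumes `deepspeed.add_config_arguments` registers `--deepspeed_config` and related flags on an existing parser:

```python
import argparse
import deepspeed

parser = argparse.ArgumentParser(description='My training script')
# Registers DeepSpeed's command-line arguments, including --deepspeed_config.
parser = deepspeed.add_config_arguments(parser)
args = parser.parse_args()

# args.deepspeed_config now points at the JSON file that enables/disables features.
```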
### Gradient Clipping

DeepSpeed handles gradient clipping under the hood based on the max gradient norm
specified by the user.
Please see the [core API doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html) for more details.
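For illustration, the max gradient norm might be set in the `deepspeed_config` file like this; treat the exact key name as an assumption and confirm it against the core API doc:

```json
{
  "gradient_clipping": 1.0
}
```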
### Automatic loss scaling with mixed precision

DeepSpeed internally handles loss scaling for mixed precision training. The parameters
for loss scaling can be specified in the `deepspeed_config` JSON file.
Please see the [core API doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html) for more details.
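A sketch of such parameters, assuming DeepSpeed's `fp16` config section, where `loss_scale: 0` conventionally selects dynamic loss scaling:

```json
{
  "fp16": {
    "enabled": true,
    "loss_scale": 0,
    "loss_scale_window": 1000
  }
}
```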
## Training Optimizers
@@ -169,12 +169,12 @@ more details see [ZeRO paper](https://arxiv.org/abs/1910.02054).
## Training Agnostic Checkpointing
DeepSpeed can simplify checkpointing for you regardless of whether you are using data
parallel training, model parallel training, mixed-precision training, a mix of these
three, or using the ZeRO optimizer to enable larger model sizes.
Please see the [Getting Started](../README.md#getting-started) guide and the
[core API doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html) for more details.
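A minimal sketch of the checkpoint calls, assuming the engine exposes `save_checkpoint` and `load_checkpoint` (the method list earlier says `store_checkpoint`, so the save method's name may vary by version); `args.save_dir`, `ckpt_id`, and `step` are placeholder names:

```python
# Save: client_sd carries arbitrary client state, e.g. the current step.
# The keyword name for client state may differ by version (client_sd / client_state).
client_sd = {'step': step}
model_engine.save_checkpoint(args.save_dir, ckpt_id, client_sd=client_sd)

# Load: returns the checkpoint path and the stored client state.
_, client_sd = model_engine.load_checkpoint(args.save_dir, ckpt_id)
step = client_sd['step']
```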
## Advanced parameter search

DeepSpeed supports multiple Learning Rate Schedules to enable faster convergence for
@@ -195,9 +195,10 @@ can automatically handle batch creation appropriately.
## Performance Analysis and Debugging

For performance debugging, DeepSpeed can give you a detailed breakdown of the time spent
in different parts of the training by simply enabling it in the `deepspeed_config`
file.
Please see the [core API doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html) for more details.
```json
{
  "wallclock_breakdown": true
}
```