OpenDAS / deepspeed

Commit 11a426ac (unverified)
Authored Feb 07, 2020 by Shaden Smith; committed by GitHub on Feb 07, 2020

Pointing docs to hosted HTML files for core API. (#41)

Parent: 246a2844
Showing 2 changed files with 21 additions and 19 deletions:

* README.md (+8, -7)
* docs/features.md (+13, -12)
README.md (view file @ 11a426ac)
@@ -215,7 +215,7 @@ pre-defined learning rate schedule:
* **Gradient Averaging**: in distributed data parallel training, `backward`
  ensures that gradients are averaged across data parallel processes after
-  training on an `effective_batch_size`.
+  training on a `train_batch_size`.
* **Loss Scaling**: in FP16/mixed precision training, the DeepSpeed
  engine automatically handles scaling the loss to avoid precision loss in the
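For orientation on the renamed key: `train_batch_size` is the global batch over which `backward` averages gradients. Below is a minimal sketch of how it typically decomposes, assuming the usual DeepSpeed batch-size fields (`train_micro_batch_size_per_gpu`, `gradient_accumulation_steps`); these names and the relationship are assumptions for illustration, not part of this diff.

```python
# Hedged illustration only: the key names and the relationship below follow the
# conventional DeepSpeed batch-size semantics, assumed here rather than taken
# from this commit.
train_micro_batch_size_per_gpu = 16   # samples per GPU per forward/backward pass
gradient_accumulation_steps = 4       # micro-batches accumulated before each step
data_parallel_world_size = 8          # number of data-parallel processes (GPUs)

# `backward` averages gradients over this many samples in total.
train_batch_size = (train_micro_batch_size_per_gpu
                    * gradient_accumulation_steps
                    * data_parallel_world_size)
print(train_batch_size)  # 512
```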
@@ -274,7 +274,7 @@ the `step` value is stored as part of the `client_sd`.
DeepSpeed features can be enabled, disabled, or configured using a config JSON
file that should be specified as `args.deepspeed_config`. A sample config file
is shown below. For a full set of features see [core API
-doc](../../API/core_api/core_api.md).
+doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html).
```json
{
  ...
}
```
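The sample config referenced in this hunk is truncated above. Purely as a hedged sketch of what such a file might contain (the keys below are common DeepSpeed options assumed for illustration, not taken from this commit), a script could generate the JSON that `args.deepspeed_config` points to:

```python
import json

# Hypothetical minimal contents for the `deepspeed_config` JSON file; the exact
# set of supported keys is defined by DeepSpeed, and these are assumptions.
ds_config = {
    "train_batch_size": 256,
    "optimizer": {
        "type": "Adam",
        "params": {"lr": 1e-4}
    },
    "fp16": {"enabled": True}
}

with open("deepspeed_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
# Pass the resulting path to the launcher, e.g. via --deepspeed_config.
```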
@@ -363,11 +363,12 @@ deepspeed --include="worker-2:0,1" \
## Further Reading

-| Article | Description |
-| ---------------------------------------------------------------- | -------------------------------------------- |
-| [DeepSpeed Features](./docs/features.md) | DeepSpeed features |
-| [CIFAR-10 Tutorial](./docs/tutorials/CIFAR-10.md) | Getting started with CIFAR-10 and DeepSpeed |
-| [Megatron-LM Tutorial](./docs/tutorials/MegatronGPT2Tutorial.md) | Train GPT2 with DeepSpeed and Megatron-LM |
+| Article | Description |
+| ---------------------------------------------------------------------------------------------- | -------------------------------------------- |
+| [DeepSpeed Features](./docs/features.md) | DeepSpeed features |
+| [CIFAR-10 Tutorial](./docs/tutorials/CIFAR-10.md) | Getting started with CIFAR-10 and DeepSpeed |
+| [Megatron-LM Tutorial](./docs/tutorials/MegatronGPT2Tutorial.md) | Train GPT2 with DeepSpeed and Megatron-LM |
+| [API Documentation](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html) | Generated DeepSpeed API documentation |
docs/features.md (view file @ 11a426ac)
@@ -124,19 +124,19 @@ The DeepSpeed core API consists of just a handful of methods:
* checkpointing: `load_checkpoint` and `store_checkpoint`

DeepSpeed supports all the features described in this document, via the use of these API,
-along with a `deepspeed_config` JSON file for enabling and disabling the features.
-Please see [core API doc](../../API/core_api/core_api.md) for more details.
+along with a `deepspeed_config` JSON file for enabling and disabling the features.
+Please see the [core API doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html) for more details.
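For context on the "handful of methods" this hunk refers to, here is a hedged sketch of a training loop built on them; the `initialize`/`backward`/`step` usage and the placeholder model and data are assumptions for illustration, not content of this commit, and device placement plus fp16 casting are omitted for brevity.

```python
import argparse

import deepspeed
import torch
import torch.nn.functional as F

parser = argparse.ArgumentParser()
parser = deepspeed.add_config_arguments(parser)   # adds --deepspeed_config, etc.
args = parser.parse_args()

model = torch.nn.Linear(784, 10)                  # placeholder model
engine, optimizer, _, _ = deepspeed.initialize(
    args=args, model=model, model_parameters=model.parameters())

# Tiny synthetic batches stand in for a real data loader.
batches = [(torch.randn(8, 784), torch.randint(0, 10, (8,))) for _ in range(10)]
for inputs, labels in batches:
    outputs = engine(inputs)        # forward pass through the DeepSpeed engine
    loss = F.cross_entropy(outputs, labels)
    engine.backward(loss)           # gradient averaging / loss scaling handled here
    engine.step()                   # optimizer step (and LR schedule, if configured)
```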
### Gradient Clipping

DeepSpeed handles gradient clipping under the hood based on the max gradient norm
-specified by the user.
-See [core API doc](../../API/core_api/core_api.md) for more details.
+specified by the user.
+Please see the [core API doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html) for more details.
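As a hedged illustration of how the max gradient norm might be expressed in the `deepspeed_config` JSON; the key name `gradient_clipping` is an assumption here, not taken from this commit.

```python
import json

# Hypothetical config fragment: clip gradients to a max norm of 1.0.
clip_section = {"gradient_clipping": 1.0}
print(json.dumps(clip_section, indent=2))  # merge into the full deepspeed_config JSON
```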
### Automatic loss scaling with mixed precision

DeepSpeed internally handles loss scaling for mixed precision training. The parameters
-for loss scaling can be specified in the `deepspeed_config` JSON file.
-See [core API doc](../../API/core_api/core_api.md) for more details.
+for loss scaling can be specified in the `deepspeed_config` JSON file.
+Please see the [core API doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html) for more details.
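A hedged sketch of what the loss-scaling parameters could look like in the `deepspeed_config` JSON; the `fp16` key names below are assumptions for illustration (a `loss_scale` of 0 conventionally requests dynamic scaling), not content of this commit.

```python
import json

# Hypothetical fp16 / loss-scaling section of a deepspeed_config file.
fp16_section = {
    "fp16": {
        "enabled": True,
        "loss_scale": 0,            # 0 => dynamic loss scaling
        "loss_scale_window": 1000,  # steps between attempts to raise the scale
        "min_loss_scale": 1
    }
}
print(json.dumps(fp16_section, indent=2))
```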
## Training Optimizers
@@ -169,12 +169,12 @@ more details see [ZeRO paper](https://arxiv.org/abs/1910.02054) .
## Training Agnostic Checkpointing

**TODO: API documentation**

DeepSpeed can simplify checkpointing for you regardless of whether you are using data
parallel training, model parallel training, mixed-precision training, a mix of these
-three, or using the zero optimizer to enable larger model sizes. See the
-[getting started](../../Onboard/onboard/onboard.md) or
-[core API doc](../../API/core_api/core_api.md) for details.
+three, or using the zero optimizer to enable larger model sizes.
+Please see the [Getting Started](../README.md#getting-started) guide and the
+[core API doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html) for more details.
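A hedged sketch of the checkpointing flow, using the method names listed earlier in features.md (`store_checkpoint` / `load_checkpoint`) and the `client_sd` idea from the README context above; the signatures and return values shown are assumptions, so consult the linked API doc for the real interface.

```python
def save_and_resume(engine, step, ckpt_dir="checkpoints/"):
    """Illustrative only: persist engine state plus a small client dict."""
    # Save: DeepSpeed stores model/optimizer state; the client dict carries
    # anything the training script wants restored later (e.g. the current step).
    engine.store_checkpoint(ckpt_dir, {"step": step})

    # Load: the engine restores its own state and hands the client dict back.
    client_sd = engine.load_checkpoint(ckpt_dir)
    return client_sd["step"]
```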
## Advanced parameter search
DeepSpeed supports multiple Learning Rate Schedules to enable faster convergence for
...
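The hunk above is cut off mid-sentence, so purely as a hedged illustration of how a learning rate schedule is typically selected in the `deepspeed_config` JSON; the schedule name and parameter keys below are assumptions, not taken from this commit.

```python
import json

# Hypothetical scheduler section: warm the LR up over the first 1000 steps.
scheduler_section = {
    "scheduler": {
        "type": "WarmupLR",
        "params": {
            "warmup_min_lr": 0.0,
            "warmup_max_lr": 1e-3,
            "warmup_num_steps": 1000
        }
    }
}
print(json.dumps(scheduler_section, indent=2))
```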
@@ -195,9 +195,10 @@ can automatically handle batch creation appropriately.
## Performance Analysis and Debugging

For performance debugging, DeepSpeed can give you a detailed breakdown of the time spent
in different parts of the training by simply enabling it in the `deepspeed_config`
-file. See [core API doc](../../API/core_api/core_api.md).
+file.
+Please see the [core API doc](https://microsoft.github.io/DeepSpeed/docs/htmlfiles/api/full/index.html) for more details.

```json
{
-  "wallclock_breakdwon": true
+  "wallclock_breakdown": true
}
```