Unverified Commit c28a71f9 authored by Olatunji Ruwase, committed by GitHub

Minor doc tweaks (#761)

* Fix docstring

* Make screenshots clickable for easier viewing

* Navigation menu in alphabetical order; More clickable screenshots

* Rename 1Cycle doc

* Tweak naming
parent 7cab55c7
@@ -38,7 +38,7 @@ collections:
   - bert-finetuning.md
   - transformer_kernel.md
   - megatron.md
-  - 1Cycle.md
+  - one-cycle.md
   - lrrt.md
   - zero.md
   - flops-profiler.md
...
@@ -58,35 +58,35 @@ lnav:
     url: /getting-started/
   - title: "Getting started on Azure"
     url: /tutorials/azure/
+  - title: "BingBertSQuAD Fine-tuning"
+    url: /tutorials/bert-finetuning/
+  - title: "BERT Pre-training"
+    url: /tutorials/bert-pretraining/
   - title: "CIFAR-10"
     url: /tutorials/cifar-10/
+  - title: "Flops Profiler"
+    url: /tutorials/flops-profiler/
   - title: "GAN"
     url: /tutorials/gan/
-  - title: "BERT Pre-training"
-    url: /tutorials/bert-pretraining/
-  - title: "BingBertSQuAD Fine-tuning"
-    url: /tutorials/bert-finetuning/
-  - title: "DeepSpeed Transformer Kernel"
-    url: /tutorials/transformer_kernel/
-  - title: "Megatron-LM GPT2"
-    url: /tutorials/megatron/
-  - title: "1-Cycle Schedule"
-    url: /tutorials/1Cycle/
   - title: "Learning Rate Range Test"
     url: /tutorials/lrrt/
-  - title: "DeepSpeed Sparse Attention"
-    url: /tutorials/sparse-attention/
-  - title: "ZeRO-Offload"
-    url: /tutorials/zero-offload/
-  - title: "ZeRO Redundancy Optimizer (ZeRO)"
-    url: /tutorials/zero/
-  - title: "DeepSpeed with 1-bit Adam"
+  - title: "Megatron-LM GPT2"
+    url: /tutorials/megatron/
+  - title: "One-Cycle Schedule"
+    url: /tutorials/one-cycle/
+  - title: "One-Bit Adam"
     url: /tutorials/onebit-adam/
   - title: "Pipeline Parallelism"
     url: /tutorials/pipeline/
   - title: "Progressive Layer Dropping"
     url: /tutorials/progressive_layer_dropping/
-  - title: "Flops Profiler"
-    url: /tutorials/flops-profiler/
+  - title: "Sparse Attention"
+    url: /tutorials/sparse-attention/
+  - title: "Transformer Kernel"
+    url: /tutorials/transformer_kernel/
+  - title: "ZeRO-Offload"
+    url: /tutorials/zero-offload/
+  - title: "ZeRO Redundancy Optimizer (ZeRO)"
+    url: /tutorials/zero/
   - title: "Contributing"
     url: /contributing/
@@ -49,19 +49,26 @@ ZeRO-Offload leverages much for ZeRO stage 2 mechanisms, and so the configuratio
 }
 ```
-As seen above, in addition to setting the _stage_ field to **2** (to enable ZeRO stage 2), we also need to set _cpu_offload_ flag to **true** enable ZeRO-Offload optimizations. In addition, we can set other ZeRO stage 2 optimization flags, such as _overlap_comm_ to tune ZeRO-Offload performance. With these changes we can now run the model. We share some screenshots of the training below.
+As seen above, in addition to setting the _stage_ field to **2** (to enable ZeRO stage 2), we also need to set _cpu_offload_ flag to **true** to enable ZeRO-Offload optimizations. In addition, we can set other ZeRO stage 2 optimization flags, such as _overlap_comm_ to tune ZeRO-Offload performance. With these changes we can now run the model. We share some screenshots of the training below.
 Here is a screenshot of the training log:
-![ZERO_OFFLOAD_DP1_10B_LOG](/assets/images/zero_offload_dp1_10B_log.png)
+<a href="/assets/images/zero_offload_dp1_10B_log.png">
+<img src="/assets/images/zero_offload_dp1_10B_log.png">
+</a>
 Here is a screenshot of nvidia-smi showing that only GPU 0 is active during training:
-![ZERO_OFFLOAD_DP1_10B_SMI](/assets/images/zero_offload_dp1_10B_smi.png)
+<a href="/assets/images/zero_offload_dp1_10B_smi.png">
+<img src="/assets/images/zero_offload_dp1_10B_smi.png">
+</a>
 Finally, here is a screenshot of htop showing host CPU and memory activity during optimizer computation:
-![ZERO_OFFLOAD_DP1_10B_SMI](/assets/images/zero_offload_dp1_10B_cpu.png)
+<a href="/assets/images/zero_offload_dp1_10B_cpu.png">
+<img src="/assets/images/zero_offload_dp1_10B_cpu.png">
+</a>
 Congratulations! You have completed the ZeRO-Offload tutorial.
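For context when reading this hunk: the tutorial paragraph above describes a DeepSpeed JSON configuration, but the hunk shows only its closing brace. Below is a minimal sketch of the kind of config being discussed; the _zero_optimization_ fields match the flags named in the paragraph, while _train_batch_size_ and the _fp16_ block are illustrative assumptions, not values from the tutorial.

```json
{
  "train_batch_size": 8,
  "fp16": {
    "enabled": true
  },
  "zero_optimization": {
    "stage": 2,
    "cpu_offload": true,
    "overlap_comm": true,
    "contiguous_gradients": true
  }
}
```

Here `"stage": 2` enables ZeRO stage 2 partitioning, `"cpu_offload": true` moves optimizer state and computation to host memory (the CPU activity visible in the htop screenshot), and `"overlap_comm"` is one of the stage 2 tuning knobs the paragraph mentions.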