"...git@developer.sourcefind.cn:jerrrrry/infinicore.git" did not exist on "1cadb2a1c689b3e61204945dde5da7614cc870b4"
Unverified Commit b4b8ae7b authored by hugo-syn, committed by GitHub

chore: Fix multiple typos (#613)


Signed-off-by: hugo-syn <hugo.vincent@synacktiv.com>
parent 36047fd7
@@ -199,7 +199,7 @@
 "\n",
 "</div>\n",
 "\n",
-"PyTorch's autograd functionality assumes that a model parameter and its corresponding gradient have the same data type. However, while low-precision data types like FP8 are sufficient for evaluating a neural network's forward and backward passes, the optimization step typically requires full FP32 precision to avoid signficant learning degradation. In addition, Tensor Cores on Hopper GPUs have the option to accumulate matrix products directly into FP32, resulting in better numerical accuracy and avoiding the need for a separate casting kernel. Thus, Transformer Engine provides an option to directly generate FP32 gradients for weight tensors. The FP32 gradients are not output to the parameter's `grad` tensor, but rather to a `main_grad` tensor that must be initialized before the backward pass."
+"PyTorch's autograd functionality assumes that a model parameter and its corresponding gradient have the same data type. However, while low-precision data types like FP8 are sufficient for evaluating a neural network's forward and backward passes, the optimization step typically requires full FP32 precision to avoid significant learning degradation. In addition, Tensor Cores on Hopper GPUs have the option to accumulate matrix products directly into FP32, resulting in better numerical accuracy and avoiding the need for a separate casting kernel. Thus, Transformer Engine provides an option to directly generate FP32 gradients for weight tensors. The FP32 gradients are not output to the parameter's `grad` tensor, but rather to a `main_grad` tensor that must be initialized before the backward pass."
 ]
 },
 {
...
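For context on the paragraph corrected above: the `main_grad` buffers it describes must be allocated by the user before the backward pass runs. Below is a minimal sketch of how that setup might look, assuming Transformer Engine's `fuse_wgrad_accumulation` option on `te.Linear`; the specific layer sizes and flags here are illustrative, not taken from this diff.

```python
import torch
import transformer_engine.pytorch as te

# Sketch only: assumes te.Linear exposes a fuse_wgrad_accumulation flag that
# routes FP32 weight gradients into param.main_grad instead of param.grad.
layer = te.Linear(1024, 1024, fuse_wgrad_accumulation=True).cuda()

# Allocate an FP32 main_grad buffer for each parameter before the backward pass.
for param in layer.parameters():
    param.main_grad = torch.zeros_like(param, dtype=torch.float32)

x = torch.randn(32, 1024, device="cuda")
out = layer(x)
out.sum().backward()

# Weight gradients are intended to accumulate into param.main_grad (FP32)
# rather than being written to the parameter's regular grad tensor.
```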
@@ -9,7 +9,7 @@
 "\n",
 "## Overview\n",
 "\n",
-"Transformer Engine (TE) is a library for accelerating Transformer models on NVIDIA GPUs, providing better performance with lower memory utilization in both training and inference. It provides support for 8-bit floating point (FP8) precision on Hopper GPUs, implements a collection of highly optimized building blocks for popular Transformer architectures, and exposes an automatic-mixed-precision-like API that can be used seamlessy with your PyTorch code. It also includes a framework-agnostic C++ API that can be integrated with other deep learning libraries to enable FP8 support for Transformers.\n",
+"Transformer Engine (TE) is a library for accelerating Transformer models on NVIDIA GPUs, providing better performance with lower memory utilization in both training and inference. It provides support for 8-bit floating point (FP8) precision on Hopper GPUs, implements a collection of highly optimized building blocks for popular Transformer architectures, and exposes an automatic-mixed-precision-like API that can be used seamlessly with your PyTorch code. It also includes a framework-agnostic C++ API that can be integrated with other deep learning libraries to enable FP8 support for Transformers.\n",
 "\n",
 "## Let's build a Transformer layer!\n",
 "\n",
...
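The "automatic-mixed-precision-like API" mentioned in the corrected overview sentence refers to Transformer Engine's `fp8_autocast` context manager. A minimal usage sketch follows; the recipe settings and layer dimensions are illustrative assumptions rather than part of this commit.

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# Illustrative FP8 scaling recipe; real workloads may tune format and amax history.
fp8_recipe = recipe.DelayedScaling(fp8_format=recipe.Format.HYBRID)

model = te.Linear(768, 3072, bias=True).cuda()
inp = torch.randn(128, 768, device="cuda")

# FP8 execution is enabled only inside the autocast region, much like torch.cuda.amp.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    out = model(inp)

out.sum().backward()
```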