"...git@developer.sourcefind.cn:jerrrrry/infinicore.git" did not exist on "1cadb2a1c689b3e61204945dde5da7614cc870b4"
Unverified Commit b4b8ae7b authored by hugo-syn, committed by GitHub

chore: Fix multiple typos (#613)


Signed-off-by: hugo-syn <hugo.vincent@synacktiv.com>
parent 36047fd7
@@ -199,7 +199,7 @@
 "\n",
 "</div>\n",
 "\n",
-"PyTorch's autograd functionality assumes that a model parameter and its corresponding gradient have the same data type. However, while low-precision data types like FP8 are sufficient for evaluating a neural network's forward and backward passes, the optimization step typically requires full FP32 precision to avoid signficant learning degradation. In addition, Tensor Cores on Hopper GPUs have the option to accumulate matrix products directly into FP32, resulting in better numerical accuracy and avoiding the need for a separate casting kernel. Thus, Transformer Engine provides an option to directly generate FP32 gradients for weight tensors. The FP32 gradients are not output to the parameter's `grad` tensor, but rather to a `main_grad` tensor that must be initialized before the backward pass."
+"PyTorch's autograd functionality assumes that a model parameter and its corresponding gradient have the same data type. However, while low-precision data types like FP8 are sufficient for evaluating a neural network's forward and backward passes, the optimization step typically requires full FP32 precision to avoid significant learning degradation. In addition, Tensor Cores on Hopper GPUs have the option to accumulate matrix products directly into FP32, resulting in better numerical accuracy and avoiding the need for a separate casting kernel. Thus, Transformer Engine provides an option to directly generate FP32 gradients for weight tensors. The FP32 gradients are not output to the parameter's `grad` tensor, but rather to a `main_grad` tensor that must be initialized before the backward pass."
 ]
 },
 {
...
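For context on the paragraph corrected above: the `main_grad` buffers it describes must be allocated by the user before the backward pass runs. Below is a minimal sketch of how that setup might look, assuming Transformer Engine's `fuse_wgrad_accumulation` option on `te.Linear`; the specific layer sizes and flags here are illustrative, not taken from this diff.

```python
import torch
import transformer_engine.pytorch as te

# Sketch only: assumes te.Linear exposes a fuse_wgrad_accumulation flag that
# routes FP32 weight gradients into param.main_grad instead of param.grad.
layer = te.Linear(1024, 1024, fuse_wgrad_accumulation=True).cuda()

# Allocate an FP32 main_grad buffer for each parameter before the backward pass.
for param in layer.parameters():
    param.main_grad = torch.zeros_like(param, dtype=torch.float32)

x = torch.randn(32, 1024, device="cuda")
out = layer(x)
out.sum().backward()

# Weight gradients are intended to accumulate into param.main_grad (FP32)
# rather than being written to the parameter's regular grad tensor.
```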
@@ -9,7 +9,7 @@
 "\n",
 "## Overview\n",
 "\n",
-"Transformer Engine (TE) is a library for accelerating Transformer models on NVIDIA GPUs, providing better performance with lower memory utilization in both training and inference. It provides support for 8-bit floating point (FP8) precision on Hopper GPUs, implements a collection of highly optimized building blocks for popular Transformer architectures, and exposes an automatic-mixed-precision-like API that can be used seamlessy with your PyTorch code. It also includes a framework-agnostic C++ API that can be integrated with other deep learning libraries to enable FP8 support for Transformers.\n",
+"Transformer Engine (TE) is a library for accelerating Transformer models on NVIDIA GPUs, providing better performance with lower memory utilization in both training and inference. It provides support for 8-bit floating point (FP8) precision on Hopper GPUs, implements a collection of highly optimized building blocks for popular Transformer architectures, and exposes an automatic-mixed-precision-like API that can be used seamlessly with your PyTorch code. It also includes a framework-agnostic C++ API that can be integrated with other deep learning libraries to enable FP8 support for Transformers.\n",
 "\n",
 "## Let's build a Transformer layer!\n",
 "\n",
...
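The "automatic-mixed-precision-like API" mentioned in the corrected overview sentence refers to Transformer Engine's `fp8_autocast` context manager. A minimal usage sketch follows; the recipe settings and layer dimensions are illustrative assumptions rather than part of this commit.

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# Illustrative FP8 scaling recipe; real workloads may tune format and amax history.
fp8_recipe = recipe.DelayedScaling(fp8_format=recipe.Format.HYBRID)

model = te.Linear(768, 3072, bias=True).cuda()
inp = torch.randn(128, 768, device="cuda")

# FP8 execution is enabled only inside the autocast region, much like torch.cuda.amp.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    out = model(inp)

out.sum().backward()
```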