Unverified commit 7be43390, authored by Kirthi Shankar Sivamani, committed by GitHub

Fix README render for uploading package to PyPI (#1798)



* Fix README render on PyPI
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

* Update README.rst
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

* Use an anonymous hyperlink for the duplicate link text. Fix indentation.
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
parent 2645eaec
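Background for the hunks that follow (a sketch, not part of the commit): docutils reports "Title underline too short" when a section underline has fewer characters than its title, and PyPI's strict README check (readme_renderer) typically refuses to render reStructuredText that produces such warnings. The changes below make every underline exactly as long as its title, fixing the too-short cases and trimming the over-long ones for consistency. For example, using a heading that appears in the diff:

    .. 19 carets under the 20-character title: docutils warns
    .. "Title underline too short" and strict renderers reject the document

    Installation Methods
    ^^^^^^^^^^^^^^^^^^^

    .. underline length matches the title exactly, so the section renders cleanly

    Installation Methods
    ^^^^^^^^^^^^^^^^^^^^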
@@ -146,7 +146,7 @@ Installation
 ============
 System Requirements
-^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^
 * **Hardware:** Blackwell, Hopper, Grace Hopper/Blackwell, Ada, Ampere
@@ -164,10 +164,10 @@ System Requirements
 * **Notes:** FP8 features require Compute Capability 8.9+ (Ada/Hopper/Blackwell)
 Installation Methods
-^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^
 Docker (Recommended)
-^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^
 The quickest way to get started with Transformer Engine is by using Docker images on
 `NVIDIA GPU Cloud (NGC) Catalog <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch>`_.
@@ -192,7 +192,7 @@ Where 25.04 (corresponding to April 2025 release) is the container version.
 * NGC PyTorch 23.08+ containers include FlashAttention-2
 pip Installation
-^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^
 **Prerequisites for pip installation:**
@@ -230,7 +230,7 @@ Source Installation
 `See the installation guide <https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/installation.html#installation-from-source>`_
 Environment Variables
-^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^
 These environment variables can be set before installation to customize the build process:
 * **CUDA_PATH**: Path to CUDA installation
@@ -241,7 +241,7 @@ These environment variables can be set before installation to customize the build process:
 * **NVTE_BUILD_THREADS_PER_JOB**: Control threads per build job
 Compiling with FlashAttention
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 Transformer Engine supports both FlashAttention-2 and FlashAttention-3 in PyTorch for improved performance. FlashAttention-3 was added in release v1.11 and is prioritized over FlashAttention-2 when both are present in the environment.
 You can verify which FlashAttention version is being used by setting these environment variables:
@@ -253,8 +253,9 @@ You can verify which FlashAttention version is being used by setting these environment variables:
 It is a known issue that FlashAttention-2 compilation is resource-intensive and requires a large amount of RAM (see `bug <https://github.com/Dao-AILab/flash-attention/issues/358>`_), which may lead to out of memory errors during the installation of Transformer Engine. Please try setting **MAX_JOBS=1** in the environment to circumvent the issue.
 .. troubleshooting-begin-marker-do-not-remove
 Troubleshooting
-^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^
 **Common Issues and Solutions:**
@@ -388,7 +389,7 @@ Papers
 Videos
 ======
-* `Stable and Scalable FP8 Deep Learning Training on Blackwell | GTC 2025 <https://www.nvidia.com/en-us/on-demand/session/gtc24-s62457/>`_
+* `Stable and Scalable FP8 Deep Learning Training on Blackwell | GTC 2025 <https://www.nvidia.com/en-us/on-demand/session/gtc24-s62457/>`__
 * `Blackwell Numerics for AI | GTC 2025 <https://www.nvidia.com/en-us/on-demand/session/gtc25-s72458/>`_
 * `Building LLMs: Accelerating Pretraining of Foundational Models With FP8 Precision | GTC 2025 <https://www.nvidia.com/gtc/session-catalog/?regcode=no-ncid&ncid=no-ncid&tab.catalogallsessionstab=16566177511100015Kus&search=zoho#/session/1726152813607001vnYK>`_
 * `From FP8 LLM Training to Inference: Language AI at Scale | GTC 2025 <https://www.nvidia.com/en-us/on-demand/session/gtc25-s72799/>`_
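The final hunk switches one entry to an anonymous hyperlink (a trailing double underscore instead of a single one). In reStructuredText, a named reference registers a target under its link text, so repeating the same text with a different URL makes docutils flag "Duplicate explicit target name", which again trips the strict README check; anonymous references register no name and can be repeated freely. A minimal sketch with placeholder link text and URLs (not taken from the README):

    .. two named references sharing the same text but different URLs:
    .. docutils flags "Duplicate explicit target name"

    * `FP8 at GTC <https://example.com/session-a>`_
    * `FP8 at GTC <https://example.com/session-b>`_

    .. anonymous references (double underscore) carry no target name, so both are accepted

    * `FP8 at GTC <https://example.com/session-a>`__
    * `FP8 at GTC <https://example.com/session-b>`__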