docs: Document NVTE_CUDA_ARCHS environment variable in README (#2414)

Add:: NVTE_CUDA_ARCHS to README

Signed-off-by: Shoval Atias <satias@satias-mlt.client.nvidia.com>
Co-authored-by: Shoval Atias <satias@satias-mlt.client.nvidia.com>

Commit f612b749 (unverified), authored Nov 25, 2025 by satias10, committed by GitHub Nov 25, 2025

Showing with 1 addition and 0 deletions

README.rst +1 -0

--- a/README.rst
+++ b/README.rst
@@ -259,6 +259,7 @@ These environment variables can be set before installation to customize the buil
 * **NVTE_FRAMEWORK**: Comma-separated list of frameworks to build for (e.g., ``pytorch,jax``)
 * **MAX_JOBS**: Limit number of parallel build jobs (default varies by system)
 * **NVTE_BUILD_THREADS_PER_JOB**: Control threads per build job
+* **NVTE_CUDA_ARCHS**: Semicolon-separated list of CUDA compute architectures to compile for (e.g., ``80;90`` for A100 and H100). If not set, automatically determined based on CUDA version. Setting this can significantly reduce build time and binary size.
 Compiling with FlashAttention
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
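
As a rough illustration of the value format the new README line describes, the sketch below assembles a semicolon-separated ``NVTE_CUDA_ARCHS`` string and places it in an environment mapping that could be passed to a build subprocess. The helper names ``format_cuda_archs`` and ``build_env`` are hypothetical and not part of Transformer Engine, and the validation pattern (two or three digits with an optional letter suffix) is an assumption about what a reasonable architecture entry looks like, not a rule taken from the build scripts.

```python
import os
import re


def format_cuda_archs(archs):
    """Join architecture entries into a semicolon-separated string.

    Hypothetical helper: the accepted pattern below is an assumption,
    not something checked against the actual build system.
    """
    cleaned = []
    for arch in archs:
        arch = str(arch).strip()
        # Assumed shape: "80", "90", "100", optionally with a suffix like "90a".
        if not re.fullmatch(r"\d{2,3}[a-z]?", arch):
            raise ValueError(f"unexpected CUDA architecture: {arch!r}")
        # Drop duplicates while preserving the caller's order.
        if arch not in cleaned:
            cleaned.append(arch)
    if not cleaned:
        raise ValueError("at least one architecture is required")
    return ";".join(cleaned)


def build_env(archs, base_env=None):
    """Return a copy of the environment with NVTE_CUDA_ARCHS set."""
    env = dict(os.environ if base_env is None else base_env)
    env["NVTE_CUDA_ARCHS"] = format_cuda_archs(archs)
    return env


if __name__ == "__main__":
    env = build_env(["80", "90"])
    print(env["NVTE_CUDA_ARCHS"])
    # The resulting mapping could then be handed to the install command via
    # subprocess.run(..., env=env); the exact command depends on the setup.
```

When the variable is set directly in a shell instead, the value needs quoting (for example ``NVTE_CUDA_ARCHS="80;90"``), because an unquoted ``;`` is treated as a command separator.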