Unverified commit 37bbfc76 authored by Tim Moon, committed by GitHub
Browse files

Refactor build system (#235)



* Refactor Setuptools build system

Successfully launches CMake install, but installs CMake extensions in temp dir.
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Debug JAX build

Fix pybind11 import. Distinguish between build-time and run-time dependencies.
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Add helper function to determine dependencies
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Add missing license
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Debug case where system CMake is too old
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Add missing license
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Simplify sanity import tests

Just importing modules provides richer error messages.
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Properly install submodules
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Install helper library for TensorFlow
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Update documentation
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Do not install Ninja by default
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Include Git commit hash in version string
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Override build_ext.build_extensions instead of build_ext.run
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix incorrect include path

Restore Ninja dependency. Restore overriding build_ext.run func.
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Review suggestions from @nouiz
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Disable parallel Ninja jobs in GitHub actions
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Properly install userbuffers lib
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Tweak install docs

Review suggestion from @ksivaman
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Add examples for specifying framework in docs
Signed-off-by: Tim Moon <tmoon@nvidia.com>

---------
Signed-off-by: Tim Moon <tmoon@nvidia.com>
parent 215dfe7e
...@@ -22,7 +22,7 @@ jobs: ...@@ -22,7 +22,7 @@ jobs:
- name: 'Build' - name: 'Build'
run: | run: |
mkdir -p wheelhouse && \ mkdir -p wheelhouse && \
NVTE_FRAMEWORK=pytorch pip wheel -w wheelhouse . -v NVTE_FRAMEWORK=pytorch MAX_JOBS=1 pip wheel -w wheelhouse . -v
- name: 'Upload wheel' - name: 'Upload wheel'
uses: actions/upload-artifact@v3 uses: actions/upload-artifact@v3
with: with:
...@@ -47,7 +47,6 @@ jobs: ...@@ -47,7 +47,6 @@ jobs:
submodules: recursive submodules: recursive
- name: 'Build' - name: 'Build'
run: | run: |
pip install ninja pybind11 && \
pip install --upgrade "jax[cuda12_local]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html && \ pip install --upgrade "jax[cuda12_local]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html && \
mkdir -p wheelhouse && \ mkdir -p wheelhouse && \
NVTE_FRAMEWORK=jax pip wheel -w wheelhouse . -v NVTE_FRAMEWORK=jax pip wheel -w wheelhouse . -v
...@@ -74,7 +73,6 @@ jobs: ...@@ -74,7 +73,6 @@ jobs:
submodules: recursive submodules: recursive
- name: 'Build' - name: 'Build'
run: | run: |
pip install ninja pybind11 && \
mkdir -p wheelhouse && \ mkdir -p wheelhouse && \
NVTE_FRAMEWORK=tensorflow pip wheel -w wheelhouse . -v NVTE_FRAMEWORK=tensorflow pip wheel -w wheelhouse . -v
- name: 'Upload wheel' - name: 'Upload wheel'
......
...@@ -34,12 +34,9 @@ pip - from GitHub ...@@ -34,12 +34,9 @@ pip - from GitHub
Additional Prerequisites Additional Prerequisites
^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^
1. `CMake <https://cmake.org/>`__ version 3.18 or later: `pip install cmake`. 1. [For PyTorch support] `PyTorch <https://pytorch.org/>`__ with GPU support.
2. [For pyTorch support] `pyTorch <https://pytorch.org/>`__ with GPU support. 2. [For JAX support] `JAX <https://github.com/google/jax/>`__ with GPU support, version >= 0.4.7.
3. [For JAX support] `JAX <https://github.com/google/jax/>`__ with GPU support, version >= 0.4.7. 3. [For TensorFlow support] `TensorFlow <https://www.tensorflow.org/>`__ with GPU support.
4. [For TensorFlow support] `TensorFlow <https://www.tensorflow.org/>`__ with GPU support.
5. `pybind11`: `pip install pybind11`.
6. [Optional] `Ninja <https://ninja-build.org/>`__: `pip install ninja`.
Installation (stable release) Installation (stable release)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...@@ -48,11 +45,9 @@ Execute the following command to install the latest stable version of Transforme ...@@ -48,11 +45,9 @@ Execute the following command to install the latest stable version of Transforme
.. code-block:: bash .. code-block:: bash
# Execute one of the following commands pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable
NVTE_FRAMEWORK=pytorch pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable # Build TE for PyTorch only. The default.
NVTE_FRAMEWORK=jax pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable # Build TE for JAX only. This will automatically detect if any supported deep learning frameworks are installed and build Transformer Engine support for them. To explicitly specify frameworks, set the environment variable `NVTE_FRAMEWORK` to a comma-separated list (e.g. `NVTE_FRAMEWORK=jax,tensorflow`).
NVTE_FRAMEWORK=tensorflow pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable # Build TE for TensorFlow only.
NVTE_FRAMEWORK=all pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable # Build TE for all supported frameworks.
Installation (development build) Installation (development build)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...@@ -67,11 +62,9 @@ Execute the following command to install the latest development build of Transfo ...@@ -67,11 +62,9 @@ Execute the following command to install the latest development build of Transfo
.. code-block:: bash .. code-block:: bash
# Execute one of the following commands pip install git+https://github.com/NVIDIA/TransformerEngine.git@main
NVTE_FRAMEWORK=pytorch pip install git+https://github.com/NVIDIA/TransformerEngine.git@main # Build TE for PyTorch only. The default.
NVTE_FRAMEWORK=jax pip install git+https://github.com/NVIDIA/TransformerEngine.git@main # Build TE for JAX only. This will automatically detect if any supported deep learning frameworks are installed and build Transformer Engine support for them. To explicitly specify frameworks, set the environment variable `NVTE_FRAMEWORK` to a comma-separated list (e.g. `NVTE_FRAMEWORK=jax,tensorflow`).
NVTE_FRAMEWORK=tensorflow pip install git+https://github.com/NVIDIA/TransformerEngine.git@main # Build TE for TensorFlow only.
NVTE_FRAMEWORK=all pip install git+https://github.com/NVIDIA/TransformerEngine.git@main # Build TE for all supported frameworks.
Installation (from source) Installation (from source)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...@@ -80,14 +73,27 @@ Execute the following commands to install Transformer Engine from source: ...@@ -80,14 +73,27 @@ Execute the following commands to install Transformer Engine from source:
.. code-block:: bash .. code-block:: bash
git clone --recursive https://github.com/NVIDIA/TransformerEngine.git # Clone the repository/fork and checkout all submodules recursively. # Clone repository, checkout stable branch, clone submodules
cd TransformerEngine # Enter TE directory. git clone --branch stable --recursive https://github.com/NVIDIA/TransformerEngine.git
git checkout stable # Checkout the correct branch.
export NVTE_FRAMEWORK=pytorch # Optionally set the framework. cd TransformerEngine
export NVTE_FRAMEWORK=pytorch # Optionally set framework
pip install . # Build and install pip install . # Build and install
For already cloned repos, run the following command in TE directory: If the Git repository has already been cloned, make sure to also clone the submodules:
.. code-block:: bash
git submodule update --init --recursive
Extra dependencies for testing can be installed by setting the "test" option:
.. code-block:: bash
pip install .[test]
To build the C++ extensions with debug symbols, e.g. with the `-g` flag:
.. code-block:: bash .. code-block:: bash
git submodule update --init --recursive # Checkout all submodules recursively. pip install . --global-option=--debug
This diff is collapsed.
...@@ -2,11 +2,5 @@ ...@@ -2,11 +2,5 @@
# #
# See LICENSE for license information. # See LICENSE for license information.
try: import transformer_engine.jax
import transformer_engine.jax
te_imported = True
except:
te_imported = False
assert te_imported, 'transformer_engine import failed'
print("OK") print("OK")
...@@ -2,11 +2,5 @@ ...@@ -2,11 +2,5 @@
# #
# See LICENSE for license information. # See LICENSE for license information.
try: import transformer_engine.pytorch
import transformer_engine.pytorch
te_imported = True
except:
te_imported = False
assert te_imported, 'transformer_engine import failed'
print("OK") print("OK")
...@@ -2,11 +2,5 @@ ...@@ -2,11 +2,5 @@
# #
# See LICENSE for license information. # See LICENSE for license information.
try: import transformer_engine.tensorflow
import transformer_engine.tensorflow
te_imported = True
except:
te_imported = False
assert te_imported, 'transformer_engine import failed'
print("OK") print("OK")
...@@ -28,16 +28,20 @@ include_directories(${PROJECT_SOURCE_DIR}) ...@@ -28,16 +28,20 @@ include_directories(${PROJECT_SOURCE_DIR})
add_subdirectory(common) add_subdirectory(common)
if(NVTE_WITH_USERBUFFERS) if(NVTE_WITH_USERBUFFERS)
message(STATUS "userbuffers support enabled")
add_subdirectory(pytorch/csrc/userbuffers) add_subdirectory(pytorch/csrc/userbuffers)
endif() endif()
option(ENABLE_JAX "Enable JAX in the building workflow." OFF) option(ENABLE_JAX "Enable JAX in the building workflow." OFF)
message(STATUS "JAX support: ${ENABLE_JAX}")
if(ENABLE_JAX) if(ENABLE_JAX)
find_package(pybind11 CONFIG REQUIRED) find_package(pybind11 CONFIG REQUIRED)
add_subdirectory(jax) add_subdirectory(jax)
endif() endif()
option(ENABLE_TENSORFLOW "Enable TensorFlow in the building workflow." OFF) option(ENABLE_TENSORFLOW "Enable TensorFlow in the building workflow." OFF)
message(STATUS "TensorFlow support: ${ENABLE_TENSORFLOW}")
if(ENABLE_TENSORFLOW) if(ENABLE_TENSORFLOW)
find_package(pybind11 CONFIG REQUIRED) find_package(pybind11 CONFIG REQUIRED)
add_subdirectory(tensorflow) add_subdirectory(tensorflow)
......
# Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# See LICENSE for license information.
add_library(CUDNN::cudnn_all INTERFACE IMPORTED) add_library(CUDNN::cudnn_all INTERFACE IMPORTED)
find_path( find_path(
...@@ -75,4 +79,3 @@ target_link_libraries( ...@@ -75,4 +79,3 @@ target_link_libraries(
CUDNN::cudnn_ops_infer CUDNN::cudnn_ops_infer
CUDNN::cudnn CUDNN::cudnn
) )
...@@ -77,3 +77,6 @@ set_source_files_properties(fused_softmax/scaled_masked_softmax.cu ...@@ -77,3 +77,6 @@ set_source_files_properties(fused_softmax/scaled_masked_softmax.cu
COMPILE_OPTIONS "--use_fast_math") COMPILE_OPTIONS "--use_fast_math")
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} --expt-relaxed-constexpr") set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} --expt-relaxed-constexpr")
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -O3") set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -O3")
# Install library
install(TARGETS transformer_engine DESTINATION .)
...@@ -10,3 +10,4 @@ pybind11_add_module( ...@@ -10,3 +10,4 @@ pybind11_add_module(
) )
target_link_libraries(transformer_engine_jax PRIVATE CUDA::cudart CUDA::cublas CUDA::cublasLt transformer_engine) target_link_libraries(transformer_engine_jax PRIVATE CUDA::cudart CUDA::cublas CUDA::cublasLt transformer_engine)
install(TARGETS transformer_engine_jax DESTINATION .)
...@@ -31,3 +31,6 @@ set_source_files_properties(userbuffers.cu ...@@ -31,3 +31,6 @@ set_source_files_properties(userbuffers.cu
COMPILE_OPTIONS "$<$<COMPILE_LANGUAGE:CUDA>:-maxrregcount=64>") COMPILE_OPTIONS "$<$<COMPILE_LANGUAGE:CUDA>:-maxrregcount=64>")
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} --expt-relaxed-constexpr") set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} --expt-relaxed-constexpr")
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -O3") set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -O3")
# Install library
install(TARGETS transformer_engine_userbuffers DESTINATION .)
...@@ -40,3 +40,7 @@ target_link_libraries(_get_stream PRIVATE ${TF_LINKER_LIBS}) ...@@ -40,3 +40,7 @@ target_link_libraries(_get_stream PRIVATE ${TF_LINKER_LIBS})
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} --expt-relaxed-constexpr") set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} --expt-relaxed-constexpr")
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -O3") set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -O3")
# Install library
install(TARGETS transformer_engine_tensorflow DESTINATION .)
install(TARGETS _get_stream DESTINATION .)
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or sign in to comment