Unverified Commit 37bbfc76, authored by Tim Moon and committed by GitHub

Refactor build system (#235)



* Refactor Setuptools build system

Successfully launches CMake install, but installs CMake extensions in temp dir.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Debug JAX build

Fix pybind11 import. Distinguish between build-time and run-time dependencies.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Add helper function to determine dependencies

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Add missing license

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Debug case where system CMake is too old

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Add missing license

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Simplify sanity import tests

Just importing the modules provides richer error messages.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Properly install submodules

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Install helper library for TensorFlow

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Update documentation

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Do not install Ninja by default

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Include Git commit hash in version string

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Override build_ext.build_extensions instead of build_ext.run

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix incorrect include path

Restore Ninja dependency. Restore overriding of the build_ext.run function.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Apply review suggestions from @nouiz

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Disable parallel Ninja jobs in GitHub Actions

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Properly install userbuffers library

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Tweak install docs

Review suggestion from @ksivaman.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Add examples for specifying framework in docs

Signed-off-by: Tim Moon <tmoon@nvidia.com>

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
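The "Include Git commit hash in version string" change above can be sketched as follows. This is a hypothetical helper illustrating the idea, not the PR's actual setup code; the function name and base version are made up:

```python
import subprocess


def te_version(base_version: str = "0.9.0") -> str:
    """Append the current Git commit hash to a base version string.

    Falls back to the plain base version when the build is not running
    inside a Git checkout (e.g. building from an sdist tarball).
    """
    try:
        commit = subprocess.run(
            ["git", "rev-parse", "--short", "HEAD"],
            capture_output=True, check=True, text=True,
        ).stdout.strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        # Not a Git repo, or git is unavailable: use the base version.
        return base_version
    return f"{base_version}+{commit}"
```

Either way the result is a PEP 440-style version, with the commit hash carried as a local version label when available.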
parent 215dfe7e
@@ -22,7 +22,7 @@ jobs:
       - name: 'Build'
         run: |
           mkdir -p wheelhouse && \
-          NVTE_FRAMEWORK=pytorch pip wheel -w wheelhouse . -v
+          NVTE_FRAMEWORK=pytorch MAX_JOBS=1 pip wheel -w wheelhouse . -v
       - name: 'Upload wheel'
         uses: actions/upload-artifact@v3
         with:
@@ -47,7 +47,6 @@ jobs:
           submodules: recursive
       - name: 'Build'
         run: |
-          pip install ninja pybind11 && \
           pip install --upgrade "jax[cuda12_local]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html && \
           mkdir -p wheelhouse && \
           NVTE_FRAMEWORK=jax pip wheel -w wheelhouse . -v
@@ -74,7 +73,6 @@ jobs:
           submodules: recursive
       - name: 'Build'
         run: |
-          pip install ninja pybind11 && \
           mkdir -p wheelhouse && \
           NVTE_FRAMEWORK=tensorflow pip wheel -w wheelhouse . -v
       - name: 'Upload wheel'
@@ -34,12 +34,9 @@ pip - from GitHub
 Additional Prerequisites
 ^^^^^^^^^^^^^^^^^^^^^^^^
-1. `CMake <https://cmake.org/>`__ version 3.18 or later: `pip install cmake`.
-2. [For pyTorch support] `pyTorch <https://pytorch.org/>`__ with GPU support.
-3. [For JAX support] `JAX <https://github.com/google/jax/>`__ with GPU support, version >= 0.4.7.
-4. [For TensorFlow support] `TensorFlow <https://www.tensorflow.org/>`__ with GPU support.
-5. `pybind11`: `pip install pybind11`.
-6. [Optional] `Ninja <https://ninja-build.org/>`__: `pip install ninja`.
+1. [For PyTorch support] `PyTorch <https://pytorch.org/>`__ with GPU support.
+2. [For JAX support] `JAX <https://github.com/google/jax/>`__ with GPU support, version >= 0.4.7.
+3. [For TensorFlow support] `TensorFlow <https://www.tensorflow.org/>`__ with GPU support.

 Installation (stable release)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -48,11 +45,9 @@ Execute the following command to install the latest stable version of Transformer Engine:

 .. code-block:: bash

-    # Execute one of the following commands
-    NVTE_FRAMEWORK=pytorch pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable     # Build TE for PyTorch only. The default.
-    NVTE_FRAMEWORK=jax pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable         # Build TE for JAX only.
-    NVTE_FRAMEWORK=tensorflow pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable  # Build TE for TensorFlow only.
-    NVTE_FRAMEWORK=all pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable         # Build TE for all supported frameworks.
+    pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable
+
+This will automatically detect if any supported deep learning frameworks are installed and build Transformer Engine support for them. To explicitly specify frameworks, set the environment variable `NVTE_FRAMEWORK` to a comma-separated list (e.g. `NVTE_FRAMEWORK=jax,tensorflow`).

 Installation (development build)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -67,12 +62,10 @@ Execute the following command to install the latest development build of Transformer Engine:

 .. code-block:: bash

-    # Execute one of the following commands
-    NVTE_FRAMEWORK=pytorch pip install git+https://github.com/NVIDIA/TransformerEngine.git@main     # Build TE for PyTorch only. The default.
-    NVTE_FRAMEWORK=jax pip install git+https://github.com/NVIDIA/TransformerEngine.git@main         # Build TE for JAX only.
-    NVTE_FRAMEWORK=tensorflow pip install git+https://github.com/NVIDIA/TransformerEngine.git@main  # Build TE for TensorFlow only.
-    NVTE_FRAMEWORK=all pip install git+https://github.com/NVIDIA/TransformerEngine.git@main         # Build TE for all supported frameworks.
+    pip install git+https://github.com/NVIDIA/TransformerEngine.git@main
+
+This will automatically detect if any supported deep learning frameworks are installed and build Transformer Engine support for them. To explicitly specify frameworks, set the environment variable `NVTE_FRAMEWORK` to a comma-separated list (e.g. `NVTE_FRAMEWORK=jax,tensorflow`).

 Installation (from source)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -80,14 +73,27 @@ Execute the following commands to install Transformer Engine from source:

 .. code-block:: bash

-    git clone --recursive https://github.com/NVIDIA/TransformerEngine.git  # Clone the repository/fork and checkout all submodules recursively.
-    cd TransformerEngine           # Enter TE directory.
-    git checkout stable            # Checkout the correct branch.
-    export NVTE_FRAMEWORK=pytorch  # Optionally set the framework.
-    pip install .                  # Build and install
+    # Clone repository, checkout stable branch, clone submodules
+    git clone --branch stable --recursive https://github.com/NVIDIA/TransformerEngine.git
+    cd TransformerEngine
+    export NVTE_FRAMEWORK=pytorch  # Optionally set framework
+    pip install .                  # Build and install

-For already cloned repos, run the following command in TE directory:
+If the Git repository has already been cloned, make sure to also clone the submodules:

 .. code-block:: bash

-    git submodule update --init --recursive  # Checkout all submodules recursively.
+    git submodule update --init --recursive
+
+Extra dependencies for testing can be installed by setting the "test" option:
+
+.. code-block:: bash
+
+    pip install .[test]
+
+To build the C++ extensions with debug symbols, e.g. with the `-g` flag:
+
+.. code-block:: bash
+
+    pip install . --global-option=--debug
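The automatic framework detection described in the revised docs above might look roughly like this. This is a hypothetical helper for illustration, not the actual setup.py logic:

```python
import importlib.util
import os


def detect_frameworks() -> list:
    """Return the frameworks Transformer Engine should be built for.

    The NVTE_FRAMEWORK environment variable, if set, overrides detection
    with a comma-separated list; otherwise each framework is enabled when
    its Python package is importable.
    """
    requested = os.getenv("NVTE_FRAMEWORK")
    if requested:
        return [name.strip() for name in requested.split(",")]
    detected = []
    # Map TE framework names to the packages that signal their presence.
    for framework, module in [("pytorch", "torch"),
                              ("jax", "jax"),
                              ("tensorflow", "tensorflow")]:
        if importlib.util.find_spec(module) is not None:
            detected.append(framework)
    return detected
```

Using `importlib.util.find_spec` avoids importing heavyweight frameworks at build time merely to test for their presence.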
This diff is collapsed.
@@ -2,11 +2,5 @@
 #
 # See LICENSE for license information.

-try:
-    import transformer_engine.jax
-    te_imported = True
-except:
-    te_imported = False
-assert te_imported, 'transformer_engine import failed'
+import transformer_engine.jax
 print("OK")
@@ -2,11 +2,5 @@
 #
 # See LICENSE for license information.

-try:
-    import transformer_engine.pytorch
-    te_imported = True
-except:
-    te_imported = False
-assert te_imported, 'transformer_engine import failed'
+import transformer_engine.pytorch
 print("OK")
@@ -2,11 +2,5 @@
 #
 # See LICENSE for license information.

-try:
-    import transformer_engine.tensorflow
-    te_imported = True
-except:
-    te_imported = False
-assert te_imported, 'transformer_engine import failed'
+import transformer_engine.tensorflow
 print("OK")
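The simplification in the three sanity tests above matters because a bare import surfaces the underlying error, while the old try/except pattern collapsed every failure into one generic assertion message. A small illustration (the module name is deliberately nonexistent):

```python
# Old pattern: any failure is reduced to a generic AssertionError.
try:
    import transformer_engine_nonexistent  # hypothetical missing module
    te_imported = True
except Exception:
    te_imported = False

old_style_error = None
try:
    assert te_imported, 'transformer_engine import failed'
except AssertionError as exc:
    old_style_error = str(exc)

# New pattern: the ImportError itself propagates, naming the missing
# module and keeping the full traceback of the root cause.
new_style_error = None
try:
    import transformer_engine_nonexistent
except ImportError as exc:
    new_style_error = str(exc)

print(old_style_error)  # generic, says nothing about the cause
print(new_style_error)  # names the missing module
```

With the new pattern, a CUDA library that fails to load, a missing submodule, or a version mismatch each produce their own distinct, actionable traceback instead of the same one-line assertion.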
@@ -28,16 +28,20 @@ include_directories(${PROJECT_SOURCE_DIR})
add_subdirectory(common)
if(NVTE_WITH_USERBUFFERS)
message(STATUS "userbuffers support enabled")
add_subdirectory(pytorch/csrc/userbuffers)
endif()
option(ENABLE_JAX "Enable JAX in the building workflow." OFF)
message(STATUS "JAX support: ${ENABLE_JAX}")
if(ENABLE_JAX)
find_package(pybind11 CONFIG REQUIRED)
add_subdirectory(jax)
endif()
option(ENABLE_TENSORFLOW "Enable TensorFlow in the building workflow." OFF)
message(STATUS "TensorFlow support: ${ENABLE_TENSORFLOW}")
if(ENABLE_TENSORFLOW)
find_package(pybind11 CONFIG REQUIRED)
add_subdirectory(tensorflow)
# Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# See LICENSE for license information.
add_library(CUDNN::cudnn_all INTERFACE IMPORTED)
find_path(
@@ -14,7 +18,7 @@ function(find_cudnn_library NAME)
HINTS $ENV{CUDNN_PATH} ${CUDNN_PATH} ${CUDAToolkit_LIBRARY_DIR}
PATH_SUFFIXES lib64 lib/x64 lib
)
if(${UPPERCASE_NAME}_LIBRARY)
add_library(CUDNN::${NAME} UNKNOWN IMPORTED)
set_target_properties(
@@ -48,7 +52,7 @@ if(CUDNN_INCLUDE_DIR AND CUDNN_LIBRARY)
message(STATUS "cuDNN: ${CUDNN_LIBRARY}")
message(STATUS "cuDNN: ${CUDNN_INCLUDE_DIR}")
set(CUDNN_FOUND ON CACHE INTERNAL "cuDNN Library Found")
else()
@@ -73,6 +77,5 @@ target_link_libraries(
     CUDNN::cudnn_adv_infer
     CUDNN::cudnn_cnn_infer
     CUDNN::cudnn_ops_infer
-    CUDNN::cudnn
     CUDNN::cudnn
 )
@@ -77,3 +77,6 @@ set_source_files_properties(fused_softmax/scaled_masked_softmax.cu
                             COMPILE_OPTIONS "--use_fast_math")
 set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} --expt-relaxed-constexpr")
 set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -O3")
+
+# Install library
+install(TARGETS transformer_engine DESTINATION .)
@@ -10,3 +10,4 @@ pybind11_add_module(
 )
 target_link_libraries(transformer_engine_jax PRIVATE CUDA::cudart CUDA::cublas CUDA::cublasLt transformer_engine)
+install(TARGETS transformer_engine_jax DESTINATION .)
@@ -31,3 +31,6 @@ set_source_files_properties(userbuffers.cu
                             COMPILE_OPTIONS "$<$<COMPILE_LANGUAGE:CUDA>:-maxrregcount=64>")
 set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} --expt-relaxed-constexpr")
 set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -O3")
+
+# Install library
+install(TARGETS transformer_engine_userbuffers DESTINATION .)
@@ -13,9 +13,9 @@ add_library(
 )

 # Includes
-execute_process(COMMAND ${Python_EXECUTABLE} -c "import tensorflow as tf; print(tf.sysconfig.get_include())"
+execute_process(COMMAND ${Python_EXECUTABLE} -c "import tensorflow as tf; print(tf.sysconfig.get_include())"
                 OUTPUT_VARIABLE Tensorflow_INCLUDE_DIRS OUTPUT_STRIP_TRAILING_WHITESPACE)
-execute_process(COMMAND ${Python_EXECUTABLE} -c "import numpy as np; print(np.get_include())"
+execute_process(COMMAND ${Python_EXECUTABLE} -c "import numpy as np; print(np.get_include())"
                 OUTPUT_VARIABLE Numpy_INCLUDE_DIRS OUTPUT_STRIP_TRAILING_WHITESPACE)
 target_include_directories(transformer_engine_tensorflow PRIVATE
@@ -25,7 +25,7 @@ target_include_directories(transformer_engine_tensorflow PRIVATE
 target_include_directories(_get_stream PRIVATE ${Tensorflow_INCLUDE_DIRS})

 # Libraries
-execute_process(COMMAND ${Python_EXECUTABLE} -c "import tensorflow as tf; print(tf.__file__)"
+execute_process(COMMAND ${Python_EXECUTABLE} -c "import tensorflow as tf; print(tf.__file__)"
                 OUTPUT_VARIABLE Tensorflow_LIB_PATH OUTPUT_STRIP_TRAILING_WHITESPACE)
 get_filename_component(Tensorflow_LIB_PATH ${Tensorflow_LIB_PATH} DIRECTORY)
 list(APPEND TF_LINKER_LIBS "${Tensorflow_LIB_PATH}/libtensorflow_framework.so.2")
@@ -40,3 +40,7 @@ target_link_libraries(_get_stream PRIVATE ${TF_LINKER_LIBS})
 set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} --expt-relaxed-constexpr")
 set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -O3")
+
+# Install library
+install(TARGETS transformer_engine_tensorflow DESTINATION .)
+install(TARGETS _get_stream DESTINATION .)