Unverified commit c1a0c974, authored by jberchtold-nvidia and committed by GitHub
parent dccf67e7
......@@ -315,6 +315,37 @@ Troubleshooting
cd transformer_engine
pip install -v -v -v --no-build-isolation .
**Problems using UV or Virtual Environments:**
1. **Import Error:**
* **Symptoms:** Cannot import ``transformer_engine``
* **Solution:** Ensure your UV environment is active and that TE was installed with ``uv pip install --no-build-isolation <te_pypi_package_or_wheel_or_source_dir>`` rather than with a regular ``pip install`` into your system environment.
2. **cuDNN Sublibrary Loading Failed:**
* **Symptoms:** Errors at runtime with ``CUDNN_STATUS_SUBLIBRARY_LOADING_FAILED``
* **Solution:** This can occur when TE is built against the container's system installation of cuDNN, but packages inside the virtual environment pull in the ``nvidia-cudnn-cu12`` or ``nvidia-cudnn-cu13`` pip package. To resolve this, when building TE from source, set the following environment variables to point to the cuDNN in your virtual environment, adjusting the Python version in the path to match your environment.
.. code-block:: bash
export CUDNN_PATH=$(pwd)/.venv/lib/python3.12/site-packages/nvidia/cudnn
export CUDNN_HOME=$CUDNN_PATH
export LD_LIBRARY_PATH=$CUDNN_PATH/lib:$LD_LIBRARY_PATH
3. **Building Wheels:**
* **Symptoms:** Regular TE installs work correctly, but wheels built with UV fail at runtime.
* **Solution:** Ensure that ``uv build --wheel --no-build-isolation -v`` is used for the wheel build and that ``--no-build-isolation`` is also used when installing the resulting wheel with pip. Use ``-v`` for verbose output to verify that TE is not pulling in a version of PyTorch or JAX that differs from the one in the UV environment.
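The checks described in the items above can be sketched as a small diagnostic script. This is a minimal sketch and not part of TE: the helper names ``in_virtual_env``, ``module_location``, and ``venv_cudnn_dir`` are hypothetical, and the ``nvidia/cudnn`` directory under ``site-packages`` is assumed from the ``CUDNN_PATH`` example above. It derives the path from the running interpreter instead of hard-coding ``python3.12``.

```python
import importlib.util
import sys
import sysconfig
from pathlib import Path


def in_virtual_env() -> bool:
    """Return True when the interpreter runs inside a virtual environment."""
    # Inside a venv, sys.prefix points at the venv and sys.base_prefix at the base install.
    return sys.prefix != sys.base_prefix


def module_location(name: str):
    """Return the path `name` would be imported from, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    if spec.origin:
        return spec.origin
    # Namespace packages have no origin; fall back to their first search location.
    locations = list(spec.submodule_search_locations or [])
    return locations[0] if locations else None


def venv_cudnn_dir() -> Path:
    """Assumed location of a pip-installed cuDNN in the active environment."""
    # Assumption: the cuDNN pip package unpacks to <site-packages>/nvidia/cudnn.
    return Path(sysconfig.get_paths()["platlib"]) / "nvidia" / "cudnn"


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_env())
    print("transformer_engine found at:", module_location("transformer_engine"))
    cudnn = venv_cudnn_dir()
    print("cuDNN directory:", cudnn, "(exists)" if cudnn.is_dir() else "(missing)")
```

If the script reports no active virtual environment, or ``transformer_engine`` resolves to ``None`` or to a path outside the environment, the install most likely went to a different interpreter. The printed cuDNN directory is a candidate value for ``CUDNN_PATH`` when it exists.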
**JAX-specific Common Issues and Solutions:**
1. **FFI Issues:**
* **Symptoms:** ``No registered implementation for custom call to <some_te_ffi> for platform CUDA``
* **Solution:** Ensure ``--no-build-isolation`` is used during installation. If pre-building wheels, ensure that the wheel is both built and installed with ``--no-build-isolation``. See "Problems using UV or Virtual Environments" above if using UV.
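When comparing the verbose build output against the environment, it can help to print the versions that are actually installed. The following is a minimal sketch using only the standard library. The helper ``installed_versions`` is hypothetical, and the distribution names in the usage section are assumptions that may need adjusting for your setup.

```python
from importlib import metadata


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed distribution names; adjust to the packages used in your environment.
    for name, version in installed_versions(
        ["jax", "jaxlib", "torch", "transformer_engine"]
    ).items():
        print(f"{name}: {version if version is not None else 'not installed'}")
```

Run it with the environment's interpreter. If the versions printed here differ from those reported during a ``-v`` build, the build probably did not use the environment's packages.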
.. troubleshooting-end-marker-do-not-remove
Breaking Changes
......