Unverified commit e411547b, authored by Paweł Gadziński and committed by GitHub

[PyTorch Debug] Add nvdlfw-inspect to dependencies (#2173)



* code drop
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci



---------
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
Signed-off-by: Paweł Gadziński <62263673+pggPL@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
parent 5afbb0e1
@@ -14,12 +14,19 @@ from typing import List
 def install_requirements() -> List[str]:
     """Install dependencies for TE/PyTorch extensions."""
-    return ["torch>=2.1", "einops", "onnxscript", "onnx", "packaging", "pydantic"]
+    return ["torch>=2.1", "einops", "onnxscript", "onnx", "packaging", "pydantic", "nvdlfw-inspect"]
 def test_requirements() -> List[str]:
-    """Test dependencies for TE/JAX extensions."""
-    return ["numpy", "torchvision", "transformers", "torchao==0.13"]
+    """Test dependencies for TE/PyTorch extensions."""
+    return [
+        "numpy",
+        "torchvision",
+        "transformers",
+        "torchao==0.13",
+        "onnxruntime",
+        "onnxruntime_extensions",
+    ]
 def setup_pytorch_extension(
......
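To confirm that the dependencies declared in this hunk are present in an environment, a check along the following lines could be used. This is a minimal sketch: `requirement_name`, `missing_requirements`, and `INSTALL_REQUIREMENTS` are hypothetical helpers written for illustration and are not part of Transformer Engine; the list is copied from the new `install_requirements()` above.

```python
import re
from importlib import metadata
from typing import List


def requirement_name(requirement: str) -> str:
    """Strip any version specifier, e.g. "torch>=2.1" becomes "torch"."""
    return re.split(r"[<>=!~ ;\[]", requirement, maxsplit=1)[0].strip()


def missing_requirements(requirements: List[str]) -> List[str]:
    """Return the requirements whose distribution is not installed."""
    missing = []
    for requirement in requirements:
        try:
            # Raises PackageNotFoundError if no such distribution is installed.
            metadata.version(requirement_name(requirement))
        except metadata.PackageNotFoundError:
            missing.append(requirement)
    return missing


# Copied from the new install_requirements() in this commit.
INSTALL_REQUIREMENTS = [
    "torch>=2.1",
    "einops",
    "onnxscript",
    "onnx",
    "packaging",
    "pydantic",
    "nvdlfw-inspect",
]

if __name__ == "__main__":
    # Prints the subset of requirements that is not installed (possibly empty).
    print(missing_requirements(INSTALL_REQUIREMENTS))
```

The check only tests for presence; it does not verify that an installed version satisfies a specifier such as `>=2.1`.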
@@ -21,7 +21,7 @@ Transformer Engine provides a set of precision debug tools which allow you to ea
 There are 4 things one needs to do to use Transformer Engine debug features:
 1. Create a configuration YAML file to configure the desired features.
-2. Import, initialize, and install the `Nvidia-DL-Framework-Inspect <https://github.com/NVIDIA/nvidia-dlfw-inspect>`_ tool.
+2. Import and initialize the `Nvidia-DL-Framework-Inspect <https://github.com/NVIDIA/nvidia-dlfw-inspect>`_ tool, which is installed as a dependency of Transformer Engine.
 3. One can pass ``name="..."`` when creating TE layers to easier identify layer names. If this is not provided, names will be inferred automatically.
 4. Invoke ``debug_api.step()`` at the end of one forward-backward pass.
......
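The four steps in the documentation hunk above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's reference usage: the import path `nvdlfw_inspect.api` and the keyword names passed to `initialize` are assumptions to verify against the installed tool, and `init_debug` and `run_training` are hypothetical helpers. Only `debug_api.step()` and the `name="..."` argument come from the text itself.

```python
from typing import Any, Callable, List

try:
    # Step 2 (import): assumed import path for Nvidia-DL-Framework-Inspect.
    import nvdlfw_inspect.api as debug_api
except ImportError:
    debug_api = None  # tool not installed in this environment


def init_debug(api: Any, config_file: str, feature_dirs: List[str], log_dir: str) -> None:
    """Step 2 (initialize): point the tool at the YAML file from step 1.

    The keyword names below are assumptions; check them against the
    installed nvdlfw-inspect version.
    """
    api.initialize(config_file=config_file, feature_dirs=feature_dirs, log_dir=log_dir)


def run_training(train_step: Callable[[int], None], num_steps: int, api: Any) -> None:
    """Step 4: call api.step() at the end of each forward-backward pass."""
    for iteration in range(num_steps):
        # Step 3 happens at model construction time, e.g. passing
        # name="fc1" when creating a TE layer (optional per the docs above).
        train_step(iteration)  # one forward-backward pass
        api.step()
```

Passing the API object in as a parameter keeps the loop independent of whether the tool is installed, so the same function can be exercised with a stand-in object during testing.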
@@ -26,11 +26,6 @@ mkdir -p "$XML_LOG_DIR"
 # Nvinspect will be disabled if no feature is active.
 : ${NVTE_TEST_NVINSPECT_DUMMY_CONFIG_FILE:=$TE_PATH/tests/pytorch/debug/test_configs/dummy_feature.yaml}
-# It is not installed as a requirement,
-# because it is not available on PyPI.
-pip uninstall -y nvdlfw-inspect
-pip install git+https://github.com/NVIDIA/nvidia-dlfw-inspect.git
 pip install pytest==8.2.1 || error_exit "Failed to install pytest"
 pytest -v -s --junitxml=$XML_LOG_DIR/test_sanity.xml $TE_PATH/tests/pytorch/debug/test_sanity.py --feature_dirs=$NVTE_TEST_NVINSPECT_FEATURE_DIRS || test_fail "test_sanity.py"
......
@@ -20,12 +20,6 @@ FAILED_CASES=""
 : ${XML_LOG_DIR:=/logs}
 mkdir -p "$XML_LOG_DIR"
-# It is not installed as a requirement,
-# because it is not available on PyPI.
-pip uninstall -y nvdlfw-inspect
-pip install git+https://github.com/NVIDIA/nvidia-dlfw-inspect.git
 pip3 install pytest==8.2.1 || error_exit "Failed to install pytest"
 python3 -m pytest -v -s --junitxml=$XML_LOG_DIR/pytest_test_sanity.xml $TE_PATH/tests/pytorch/distributed/test_sanity.py || test_fail "test_sanity.py"
......
@@ -2,10 +2,6 @@
 #
 # See LICENSE for license information.
-pip3 install onnxruntime
-pip3 install onnxruntime_extensions
 : ${TE_PATH:=/opt/transformerengine}
 : ${XML_LOG_DIR:=/logs}
 mkdir -p "$XML_LOG_DIR"
......