
:github_url: https://github.com/facebookresearch/DRTK

Installation
===================================

Currently, we do not provide pre-compiled binaries for DRTK.
You will need to build the package from source. The current version of DRTK is |version|.

Prerequisites
^^^^^^^^^^^^^

Before installing DRTK, ensure you have the following prerequisites installed:

* PyTorch >= 2.1.0
* CUDA Toolkit

Additionally, we recommend installing the following packages to run tests and examples:

* torchvision
* opencv_python
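
As a quick sanity check of the prerequisites above, the following is a minimal sketch that reads the installed PyTorch version and looks for ``nvcc`` on ``PATH``. The helper names ``version_tuple`` and ``check_prerequisites`` are illustrative and not part of DRTK, and treating ``nvcc`` on ``PATH`` as evidence of a usable CUDA Toolkit is an assumption about your setup.

.. code-block:: python

    import shutil
    from importlib import metadata

    MIN_TORCH = (2, 1, 0)  # minimum PyTorch version listed above


    def version_tuple(version):
        """Turn a string such as '2.1.0+cu121' into (2, 1, 0)."""
        public = version.split("+", 1)[0]  # drop a local suffix such as '+cu121'
        parts = []
        for piece in public.split(".")[:3]:
            digits = ""
            for ch in piece:
                if not ch.isdigit():
                    break  # stop at pre-release markers such as 'a0' or 'rc1'
                digits += ch
            parts.append(int(digits or 0))
        while len(parts) < 3:
            parts.append(0)
        return tuple(parts)


    def check_prerequisites():
        """Report the installed torch version and the location of nvcc, if any."""
        try:
            torch_version = metadata.version("torch")
        except metadata.PackageNotFoundError:
            torch_version = None
        return {
            "torch_version": torch_version,
            "torch_ok": torch_version is not None
            and version_tuple(torch_version) >= MIN_TORCH,
            "nvcc_path": shutil.which("nvcc"),  # None if nvcc is not on PATH
        }

Calling ``check_prerequisites()`` returns a dictionary; a ``torch_ok`` value of ``False`` or an ``nvcc_path`` of ``None`` suggests that the environment needs attention before building.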

Installing DRTK from GitHub using pip
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The easiest way to install DRTK is by using pip with the GitHub repository directly:

.. code-block:: shell

    # To install latest
    pip install git+https://github.com/facebookresearch/DRTK.git

.. code-block:: shell

    # To install stable
    pip install git+https://github.com/facebookresearch/DRTK.git@stable

.. warning::

    Compilation may take a significant amount of time, possibly 30 minutes or more.

This should be enough in most cases, provided that PyTorch, the CUDA Toolkit, and the build essentials for your platform
are installed and the environment is correctly configured.


.. _Specifying Architectures:

Specifying Architectures
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you know the CUDA architecture of the device where the code will run, it is better to specify it directly, e.g.:

.. code-block:: shell

    # TORCH_CUDA_ARCH_LIST accepts "named" architectures; see the table below
    TORCH_CUDA_ARCH_LIST="Ampere" pip install git+https://github.com/facebookresearch/DRTK.git

or specify numerical values for architectures explicitly:

.. code-block:: shell

    # TORCH_CUDA_ARCH_LIST can combine several architectures separated by semicolons or spaces.
    # Add `+PTX` if you also want to embed PTX intermediate code for better forward compatibility.
    TORCH_CUDA_ARCH_LIST="8.0;8.6+PTX" pip install git+https://github.com/facebookresearch/DRTK.git

If ``TORCH_CUDA_ARCH_LIST`` is not specified, DRTK will build for the following architectures by default: 7.2, 7.5, 8.0, 8.6, 9.0.

``TORCH_CUDA_ARCH_LIST`` can take one or more values from the supported named or numerical architectures list.
When combining values, use a semicolon ``;`` or a space to combine numerical values and the ``+`` symbol to combine named values.

Numerical architecture values supported by PyTorch: ``'3.5', '3.7', '5.0', '5.2', '5.3', '6.0', '6.1', '6.2', '7.0', '7.2', '7.5', '8.0', '8.6', '8.7', '8.9', '9.0', '9.0a'``.

The "named" architectures supported by PyTorch are listed in the table below:

.. list-table:: Named architectures
   :header-rows: 1

   * - Name
     - Arch
   * - Kepler+Tesla
     - 3.7
   * - Kepler
     - 3.5+PTX
   * - Maxwell+Tegra
     - 5.3
   * - Maxwell
     - 5.0;5.2+PTX
   * - Pascal
     - 6.0;6.1+PTX
   * - Volta+Tegra
     - 7.2
   * - Volta
     - 7.0+PTX
   * - Turing
     - 7.5+PTX
   * - Ampere+Tegra
     - 8.7
   * - Ampere
     - 8.0;8.6+PTX
   * - Ada
     - 8.9+PTX
   * - Hopper
     - 9.0+PTX

.. note::

    We do not test the DRTK package on architectures before Volta.

For more information about ``TORCH_CUDA_ARCH_LIST``, refer to the `PyTorch documentation <https://pytorch.org/docs/stable/cpp_extension.html#torch.utils.cpp_extension.CUDAExtension>`_ or view the `source code on GitHub <https://github.com/pytorch/pytorch/blob/c9653bf2ca6dd88b991d71abf836bd9a7a1d9dc3/torch/utils/cpp_extension.py#L1980>`_.
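
If you are unsure which value to use, one option is to derive it from the GPUs visible on the build machine. The sketch below is illustrative only: ``format_arch_list`` and ``local_capabilities`` are hypothetical helper names, not part of DRTK or PyTorch, and adding ``+PTX`` only to the newest architecture is an assumption modelled on the named-architecture table above. It relies on ``torch.cuda.get_device_capability``, which returns a ``(major, minor)`` tuple.

.. code-block:: python

    def format_arch_list(capabilities, ptx_for_newest=True):
        """Build a TORCH_CUDA_ARCH_LIST value such as '8.0;8.6+PTX'."""
        unique = sorted(set(capabilities))  # deduplicate and order by version
        if not unique:
            return ""
        entries = [f"{major}.{minor}" for major, minor in unique]
        if ptx_for_newest:
            entries[-1] += "+PTX"  # keep PTX only for the newest architecture
        return ";".join(entries)


    def local_capabilities():
        """Return (major, minor) for each visible GPU, or [] without torch or CUDA."""
        try:
            import torch
        except ImportError:
            return []
        if not torch.cuda.is_available():
            return []
        return [
            torch.cuda.get_device_capability(index)
            for index in range(torch.cuda.device_count())
        ]

Printing ``format_arch_list(local_capabilities())`` on the target machine gives a string that can be assigned to ``TORCH_CUDA_ARCH_LIST`` before running ``pip install``.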

Installing from a Cloned Repository
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Alternatively, you can install the package from a local clone of the repository.
This is helpful if you need to modify the package code.

Clone the repository and ``cd`` into it:

.. code-block:: shell

    git clone https://github.com/facebookresearch/DRTK
    cd DRTK

Then install the package from the path using ``pip``. Note the ``--no-build-isolation`` flag: it tells modern build
front ends not to build in a separate, clean Python environment.
Without it, the isolated environment installs a default ``torch`` version from pip, which will likely not match the one already installed on the system (for example, when that one was installed with ``--index-url``).

.. code-block:: shell

    pip install . --no-build-isolation


Building and installing a wheel
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To build a wheel, run:

.. code-block:: shell

    # You might need to install `build` first
    # pip install --upgrade build
    python -m build --wheel --no-isolation

Alternatively, you can use the deprecated CLI of ``setuptools``:

.. code-block:: shell

    # You might need to install `wheel` first, though newer versions of setuptools do not require it anymore.
    # pip install --upgrade wheel
    python setup.py bdist_wheel

Then, you will find a wheel in the ``dist/`` folder. You can install this wheel by running:

.. code-block:: shell

    pip install dist/drtk-<tags>.whl

where ``<tags>`` are compatibility tags. You can figure them out by listing the ``dist/`` directory. E.g.:

.. code-block:: shell

    pip install dist/drtk-0.1.0-cp310-cp310-linux_x86_64.whl
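
To make the compatibility tags concrete, here is a minimal sketch that splits a wheel file name into its components, following the standard wheel naming scheme ``{distribution}-{version}(-{build})?-{python}-{abi}-{platform}.whl``. The function name ``parse_wheel_name`` is illustrative and not part of DRTK or pip.

.. code-block:: python

    def parse_wheel_name(filename):
        """Split a wheel file name into its name, version, and compatibility tags."""
        if not filename.endswith(".whl"):
            raise ValueError(f"not a wheel file name: {filename!r}")
        parts = filename[: -len(".whl")].split("-")
        if len(parts) not in (5, 6):
            raise ValueError(f"unexpected number of fields in {filename!r}")
        # The optional build tag, when present, sits right after the version.
        build = parts[2] if len(parts) == 6 else None
        python_tag, abi_tag, platform_tag = parts[-3:]
        return {
            "distribution": parts[0],
            "version": parts[1],
            "build": build,
            "python": python_tag,
            "abi": abi_tag,
            "platform": platform_tag,
        }

For the example above, the ``python`` and ``abi`` fields are both ``cp310`` and the ``platform`` field is ``linux_x86_64``, so that wheel targets CPython 3.10 on 64-bit Linux.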

Reinstalling the Wheel
^^^^^^^^^^^^^^^^^^^^^^

If the package is already installed, ``pip`` will not reinstall it unless the version number has been incremented.
This behavior can be problematic if you are modifying the package locally and need to reinstall it.

To force a reinstall, add the following arguments: ``--upgrade --force-reinstall --no-deps``. For example:

.. code-block:: shell

    pip install --upgrade --force-reinstall --no-deps .


In-place build
^^^^^^^^^^^^^^^^^^^

For package development, it can be beneficial to do an in-place build with:

.. code-block:: shell

    # Concurrent build jobs can cause issues; it is safer to specify `-j 1`
    python setup.py build_ext --inplace -j 1

Then you can use the root of the cloned repository as a working directory, and you should be able to do ``import drtk`` and run tests and examples.
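
To confirm that ``import drtk`` would pick up the in-place build rather than a copy installed elsewhere, you can check where Python resolves the package from. The following is a small sketch using only the standard library; ``locate_package`` and ``is_inside`` are hypothetical helper names.

.. code-block:: python

    import importlib.util
    from pathlib import Path


    def locate_package(name):
        """Return the directory a package would be imported from, or None."""
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            return None
        if spec is None or spec.origin is None:
            return None
        return Path(spec.origin).resolve().parent


    def is_inside(package_dir, root):
        """Check whether package_dir is root itself or located below it."""
        if package_dir is None:
            return False
        root = Path(root).resolve()
        return package_dir == root or root in package_dir.parents

Running ``is_inside(locate_package("drtk"), ".")`` from the repository root should return ``True`` when the in-place build is the one being imported.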


Troubleshooting
================

1. CUDA Error: No Kernel Image Available
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

**Example Error Message:**

    RuntimeError: CUDA error: no kernel image is available for execution on the device


**Cause:** This error occurs when the CUDA code was not built for the architecture of the device on which the code is running.

**Solution:** Specify the correct architecture using the ``TORCH_CUDA_ARCH_LIST`` environment variable when building the package. Refer to the examples in the :ref:`Specifying Architectures` section above.
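
To reason about whether a given ``TORCH_CUDA_ARCH_LIST`` value covers a device, the sketch below applies a simplified model of CUDA compatibility: binary code runs on devices of the same major architecture with an equal or higher minor version, and PTX can be compiled at load time for any newer device. This is an assumption-laden approximation (it ignores Tegra-specific and ``9.0a``-style architectures and accepts numerical values only), and the function names are illustrative.

.. code-block:: python

    def parse_arch_list(value):
        """Parse '8.0;8.6+PTX' into [((8, 0), False), ((8, 6), True)]."""
        entries = []
        for token in value.replace(";", " ").split():
            has_ptx = token.endswith("+PTX")
            number = token[: -len("+PTX")] if has_ptx else token
            major, minor = number.split(".")
            entries.append(((int(major), int(minor)), has_ptx))
        return entries


    def is_covered(arch_list, capability):
        """Return True if a device with this (major, minor) can run the build."""
        for arch, has_ptx in parse_arch_list(arch_list):
            # Binary code: same major version, device minor not older than the build.
            if arch[0] == capability[0] and arch[1] <= capability[1]:
                return True
            # PTX: can be JIT-compiled for the same or any newer architecture.
            if has_ptx and arch <= capability:
                return True
        return False

For instance, under this model ``is_covered("8.0;8.6", (9, 0))`` is ``False``, which corresponds to the situation that produces the error above on a Hopper device.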

2. Import Error: ``*.so`` Not Found
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

**Example Error Message:**

    ImportError: libcudnn.so.9: cannot open shared object file: No such file or directory

**Cause:** This issue is most likely caused by build isolation. Since DRTK currently does not distribute pre-compiled binaries, a version mismatch is unlikely to arise any other way.

**Solution:** Pass ``--no-isolation`` when building a wheel and ``--no-build-isolation`` when installing from a local clone, so that the build uses the CUDA and PyTorch libraries from your current environment:

.. code-block:: shell

    python -m build --wheel --no-isolation

and

.. code-block:: shell

    pip install . --no-build-isolation


3. Compilation Errors in CUDA or PyTorch headers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

**Example Error Message:**

    error: no suitable conversion function from "const __half" to "unsigned short" exists

**Cause:** This error typically indicates a compiler version mismatch. It is likely that your C++ or CUDA compiler version is too old to support some of the features.

**Solution:** Please consult PyTorch and CUDA documentation to figure out what CUDA version is supported by your PyTorch version and what C++ compiler version is needed.
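
When comparing against the supported versions, it helps to collect what is actually installed. The sketch below is one way to do so with the standard library; the ``COMMANDS`` table is an assumption about which executables are relevant on your system (``nvcc`` and ``c++`` may be named differently or be absent), and ``tool_version`` is a hypothetical helper.

.. code-block:: python

    import subprocess
    import sys

    # Commands to query; adjust the compiler names to match your platform.
    COMMANDS = {
        "python": [sys.executable, "--version"],
        "nvcc": ["nvcc", "--version"],
        "c++": ["c++", "--version"],
    }


    def tool_version(command):
        """Return the version output of a command, or None if it cannot be run."""
        try:
            result = subprocess.run(
                command, capture_output=True, text=True, check=False, timeout=30
            )
        except (OSError, subprocess.TimeoutExpired):
            return None
        output = (result.stdout or result.stderr).strip()
        return output or None


    def report():
        """Collect the version output for every command in COMMANDS."""
        return {name: tool_version(command) for name, command in COMMANDS.items()}

The values returned by ``report()`` can then be checked against the compiler requirements stated in the PyTorch and CUDA documentation; a value of ``None`` means the tool was not found or produced no output.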


4. C++ Compilation Errors in PyTorch header aten/src/ATen/core/boxing/impl/boxing.h
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

**Example Error Message:**

    aten/src/ATen/core/boxing/impl/boxing.h:41:105: error: expected primary-expression before ‘>’ token


**Cause:** This issue is related to problematic SFINAE logic in template code. It has been observed in some recent versions of PyTorch.

**Solution:** One recommended solution is to add ``-std=c++20`` to the **nvcc** arguments, as suggested in `this GitHub issue <https://github.com/pytorch/pytorch/issues/122169>`_.
This line has already been added to ``setup.py``:

.. code-block:: python

    nvcc_args.append("-std=c++20")


.. warning::

    If you are using an older version of the CUDA Toolkit, adding this flag might result in an error:

    .. code-block:: shell

        unrecognized command line option '-std=c++20'


    In this case, you may need to remove the ``-std=c++20`` flag from ``setup.py``, or update your CUDA Toolkit to a version that supports C++20.

.. note::

    According to the comments in the `GitHub issue <https://github.com/pytorch/pytorch/issues/122169>`_, this was fixed in the PyTorch v2.4.0 release.