"vscode:/vscode.git/clone" did not exist on "8aa301113fabe756280232df1d63a7b739a9c678"
GPU SDK Correspondence and Device Targeting Table
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

GPU Targets Table
=================

OpenCL is a general-purpose, massively parallel programming framework that targets multiple backends (GPU, CPU, FPGA, etc.).
To use a device from a given vendor, you have to install the drivers from that vendor.
Intel's and AMD's OpenCL runtimes also include x86 CPU target support.
NVIDIA's OpenCL runtime supports only NVIDIA GPUs (no CPU support).
In general, OpenCL CPU backends are quite slow and should be used for testing and debugging only.

You can find below a table of correspondence:

+---------------------------+-----------------+-----------------+-----------------+--------------+
| SDK                       | CPU Intel/AMD   | GPU Intel       | GPU AMD         | GPU NVIDIA   |
+===========================+=================+=================+=================+==============+
| `Intel SDK for OpenCL`_   | Supported       | Supported       | Not Supported   | Not Supported|
+---------------------------+-----------------+-----------------+-----------------+--------------+
| AMD APP SDK \*            | Supported       | Not Supported   | Supported       | Not Supported|
+---------------------------+-----------------+-----------------+-----------------+--------------+
| `PoCL`_                   | Supported       | Not Supported   | Supported       | Not Supported|
+---------------------------+-----------------+-----------------+-----------------+--------------+
| `NVIDIA CUDA Toolkit`_    | Not Supported   | Not Supported   | Not Supported   | Supported    |
+---------------------------+-----------------+-----------------+-----------------+--------------+

Legend:

\* AMD APP SDK is deprecated. On Windows, OpenCL is included in the AMD graphics driver. On Linux, newer-generation AMD cards are supported by the `ROCm`_ driver. You can download an archived copy of AMD APP SDK from our GitHub repo (`for Linux`_ and `for Windows`_).


--------------

Query OpenCL Devices in Your System
===================================

Your system might have multiple GPUs from different vendors ("platforms") installed. Setting up a LightGBM GPU device requires two parameters: `OpenCL Platform ID <./Parameters.rst#gpu_platform_id>`__ (``gpu_platform_id``) and `OpenCL Device ID <./Parameters.rst#gpu_device_id>`__ (``gpu_device_id``). Generally speaking, each vendor provides one OpenCL platform, and devices from the same vendor have different device IDs under that platform. For example, if your system has an Intel integrated GPU and two discrete GPUs from AMD, you will have two OpenCL platforms (with ``gpu_platform_id=0`` and ``gpu_platform_id=1``). If platform 0 is Intel, it has one device (``gpu_device_id=0``) representing the Intel GPU; if platform 1 is AMD, it has two devices (``gpu_device_id=0``, ``gpu_device_id=1``) representing the two AMD GPUs. If you have a discrete GPU from AMD/NVIDIA and an integrated GPU from Intel, make sure to select the ``gpu_platform_id`` of the discrete GPU, as it usually provides better performance.
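
As a rough illustration of the numbering described above, the following Python sketch models the example system (one Intel platform with one GPU, one AMD platform with two GPUs) as a plain list and derives the LightGBM parameters from it. The inventory layout and the helper name ``gpu_params_for`` are hypothetical and exist only for this illustration; ``device``, ``gpu_platform_id`` and ``gpu_device_id`` are the actual LightGBM parameter names. The platform order on a real machine may differ from this example, so it should be checked with a query tool first.

.. code:: python

    # Hypothetical inventory of the example system:
    # list index = platform ID, position inside "devices" = device ID.
    EXAMPLE_PLATFORMS = [
        {"vendor": "Intel", "devices": ["integrated GPU"]},
        {"vendor": "AMD", "devices": ["discrete GPU 1", "discrete GPU 2"]},
    ]


    def gpu_params_for(platforms, vendor, device_index=0):
        """Return LightGBM GPU parameters for one device of the given vendor."""
        for platform_id, platform in enumerate(platforms):
            if platform["vendor"] != vendor:
                continue
            if not 0 <= device_index < len(platform["devices"]):
                raise ValueError(
                    "platform %d has no device %d" % (platform_id, device_index)
                )
            return {
                "device": "gpu",
                "gpu_platform_id": platform_id,
                "gpu_device_id": device_index,
            }
        raise ValueError("no OpenCL platform from vendor %r" % vendor)


    # Select the second AMD GPU of the example system.
    print(gpu_params_for(EXAMPLE_PLATFORMS, "AMD", device_index=1))

The returned dictionary can be merged into the training parameters of whichever LightGBM interface you use.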

On Windows, OpenCL devices can be queried using `GPUCapsViewer`_, under the OpenCL tab. Note that the platform and device IDs reported by this utility start from 1, so you should subtract 1 from the reported IDs.

On Linux, OpenCL devices can be listed using the ``clinfo`` command. On Ubuntu, you can install ``clinfo`` by executing ``sudo apt-get install clinfo``.
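
If you want to turn such a listing into ID pairs programmatically, the following Python sketch may help. It assumes the compact tree layout that ``clinfo -l`` is expected to print (``Platform #N: ...`` lines followed by ``Device #M: ...`` lines); the sample text and the helper name ``parse_clinfo_list`` are made up for illustration, and the layout may differ between ``clinfo`` versions.

.. code:: python

    import re

    # Sample text in the layout that ``clinfo -l`` is assumed to produce.
    SAMPLE = "\n".join([
        "Platform #0: AMD Accelerated Parallel Processing",
        " +-- Device #0: Oland",
        " `-- Device #1: Intel(R) Core(TM) i7-4600U CPU @ 2.10GHz",
    ])

    PLATFORM_RE = re.compile(r"^Platform #(\d+):\s*(.+)$")
    DEVICE_RE = re.compile(r"Device #(\d+):\s*(.+)$")


    def parse_clinfo_list(text):
        """Return (platform_id, device_id, platform_name, device_name) tuples."""
        entries = []
        platform_id, platform_name = None, None
        for line in text.splitlines():
            platform_match = PLATFORM_RE.match(line)
            if platform_match:
                # Remember the current platform for the device lines that follow.
                platform_id = int(platform_match.group(1))
                platform_name = platform_match.group(2).strip()
                continue
            device_match = DEVICE_RE.search(line)
            if device_match and platform_id is not None:
                entries.append((
                    platform_id,
                    int(device_match.group(1)),
                    platform_name,
                    device_match.group(2).strip(),
                ))
        return entries


    for platform_id, device_id, platform_name, device_name in parse_clinfo_list(SAMPLE):
        print("gpu_platform_id=%d gpu_device_id=%d  %s / %s"
              % (platform_id, device_id, platform_name, device_name))

On a real system, the text could be obtained with ``subprocess.run(["clinfo", "-l"], capture_output=True, text=True).stdout`` instead of the sample string.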


Examples
===============

We provide test R code below, but you can use the language and the example of your choice:

.. code:: r

    library(lightgbm)
    data(agaricus.train, package = "lightgbm")
    train <- agaricus.train
    train$data[, 1] <- 1:6513
    dtrain <- lgb.Dataset(train$data, label = train$label)
    data(agaricus.test, package = "lightgbm")
    test <- agaricus.test
    dtest <- lgb.Dataset.create.valid(dtrain, test$data, label = test$label)
    valids <- list(test = dtest)

    params <- list(objective = "regression",
                   metric = "rmse",
                   device = "gpu",
                   gpu_platform_id = 0,
                   gpu_device_id = 0,
                   nthread = 1,
                   boost_from_average = FALSE,
                   num_tree_per_iteration = 10,
                   max_bin = 32)
    model <- lgb.train(params,
                       dtrain,
                       2,
                       valids,
                       min_data = 1,
                       learning_rate = 1,
                       early_stopping_rounds = 10)

Make sure you list the OpenCL devices in your system and set ``gpu_platform_id`` and ``gpu_device_id`` correctly. In the following examples, our system has 1 GPU platform (``gpu_platform_id = 0``) from AMD APP SDK. The first device ``gpu_device_id = 0`` is a GPU device (AMD Oland), and the second device ``gpu_device_id = 1`` is the x86 CPU backend.

Example of using GPU (``gpu_platform_id = 0`` and ``gpu_device_id = 0`` in our system):

.. code:: r

    > params <- list(objective = "regression",
    +                metric = "rmse",
    +                device = "gpu",
    +                gpu_platform_id = 0,
    +                gpu_device_id = 0,
    +                nthread = 1,
    +                boost_from_average = FALSE,
    +                num_tree_per_iteration = 10,
    +                max_bin = 32)
    > model <- lgb.train(params,
    +                    dtrain,
    +                    2,
    +                    valids,
    +                    min_data = 1,
    +                    learning_rate = 1,
    +                    early_stopping_rounds = 10)
    [LightGBM] [Info] This is the GPU trainer!!
    [LightGBM] [Info] Total Bins 232
    [LightGBM] [Info] Number of data: 6513, number of used features: 116
    [LightGBM] [Info] Using GPU Device: Oland, Vendor: Advanced Micro Devices, Inc.
    [LightGBM] [Info] Compiling OpenCL Kernel with 16 bins...
    [LightGBM] [Info] GPU programs have been built
    [LightGBM] [Info] Size of histogram bin entry: 12
    [LightGBM] [Info] 40 dense feature groups (0.12 MB) transferred to GPU in 0.004211 secs. 76 sparse feature groups.
    [LightGBM] [Info] No further splits with positive gain, best gain: -inf
    [LightGBM] [Info] Trained a tree with leaves=16 and depth=8
    [1]:    test's rmse:1.10643e-17 
    [LightGBM] [Info] No further splits with positive gain, best gain: -inf
    [LightGBM] [Info] Trained a tree with leaves=7 and depth=5
    [2]:    test's rmse:0

Running on OpenCL CPU backend devices is generally slow, and we have observed crashes on some Windows and macOS systems. Make sure you check the ``Using GPU Device`` line in the log and confirm that it is not using a CPU. The above log shows that we are using the ``Oland`` GPU from AMD and not a CPU.
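
This check can also be automated if you capture the training log as text. The Python sketch below extracts the device name and vendor from the ``Using GPU Device`` line. The helper names are hypothetical, and ``looks_like_cpu`` is only a heuristic that assumes a CPU backend reports a device name containing ``CPU``.

.. code:: python

    import re

    # Matches the "Using GPU Device" line format shown in the log above.
    DEVICE_LINE_RE = re.compile(
        r"Using GPU Device: (?P<name>.+), Vendor: (?P<vendor>.+)$"
    )


    def find_opencl_device(log_text):
        """Return (name, vendor) from the first matching log line, or None."""
        for line in log_text.splitlines():
            match = DEVICE_LINE_RE.search(line)
            if match:
                return match.group("name").strip(), match.group("vendor").strip()
        return None


    def looks_like_cpu(device_name):
        """Heuristic (assumption): CPU backends have 'CPU' in the device name."""
        return "cpu" in device_name.lower()


    LOG = "[LightGBM] [Info] Using GPU Device: Oland, Vendor: Advanced Micro Devices, Inc."
    device = find_opencl_device(LOG)
    if device is None:
        print("No 'Using GPU Device' line found in the log")
    elif looks_like_cpu(device[0]):
        print("Warning: training runs on an OpenCL CPU backend: %s" % device[0])
    else:
        print("Training runs on: %s (%s)" % device)

A heuristic like this can miss unusual device names, so reading the log line yourself remains the reliable check.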

Example of using CPU (``gpu_platform_id = 0``, ``gpu_device_id = 1``). The GPU device reported is ``Intel(R) Core(TM) i7-4600U CPU``, so it is using the CPU backend rather than a real GPU.

.. code:: r

    > params <- list(objective = "regression",
    +                metric = "rmse",
    +                device = "gpu",
    +                gpu_platform_id = 0,
    +                gpu_device_id = 1,
    +                nthread = 1,
    +                boost_from_average = FALSE,
    +                num_tree_per_iteration = 10,
    +                max_bin = 32)
    > model <- lgb.train(params,
    +                    dtrain,
    +                    2,
    +                    valids,
    +                    min_data = 1,
    +                    learning_rate = 1,
    +                    early_stopping_rounds = 10)
    [LightGBM] [Info] This is the GPU trainer!!
    [LightGBM] [Info] Total Bins 232
    [LightGBM] [Info] Number of data: 6513, number of used features: 116
    [LightGBM] [Info] Using requested OpenCL platform 0 device 1
    [LightGBM] [Info] Using GPU Device: Intel(R) Core(TM) i7-4600U CPU @ 2.10GHz, Vendor: GenuineIntel
    [LightGBM] [Info] Compiling OpenCL Kernel with 16 bins...
    [LightGBM] [Info] GPU programs have been built
    [LightGBM] [Info] Size of histogram bin entry: 12
    [LightGBM] [Info] 40 dense feature groups (0.12 MB) transferred to GPU in 0.004540 secs. 76 sparse feature groups.
    [LightGBM] [Info] No further splits with positive gain, best gain: -inf
    [LightGBM] [Info] Trained a tree with leaves=16 and depth=8
    [1]:    test's rmse:1.10643e-17 
    [LightGBM] [Info] No further splits with positive gain, best gain: -inf
    [LightGBM] [Info] Trained a tree with leaves=7 and depth=5
    [2]:    test's rmse:0

Known issues:

- Using a bad combination of ``gpu_platform_id`` and ``gpu_device_id`` can lead to a **crash** due to OpenCL driver issues on some machines (you will lose your entire session content), so double-check both IDs before training.

- On some systems, if you have an integrated graphics card (Intel HD Graphics) and a dedicated graphics card (AMD, NVIDIA), the dedicated graphics card will automatically override the integrated one. The workaround is to disable the dedicated graphics card so that the integrated graphics card can be used.

.. _Intel SDK for OpenCL: https://software.intel.com/en-us/articles/opencl-drivers

.. _ROCm: https://rocmdocs.amd.com/en/latest/

.. _for Linux: https://github.com/microsoft/LightGBM/releases/download/v2.0.12/AMD-APP-SDKInstaller-v3.0.130.136-GA-linux64.tar.bz2

.. _for Windows: https://github.com/microsoft/LightGBM/releases/download/v2.0.12/AMD-APP-SDKInstaller-v3.0.130.135-GA-windows-F-x64.exe

.. _NVIDIA CUDA Toolkit: https://developer.nvidia.com/cuda-downloads

.. _clinfo: https://github.com/Oblomov/clinfo

.. _GPUCapsViewer: http://www.ozone3d.net/gpu_caps_viewer/

.. _PoCL: http://portablecl.org/