Experiments
===========

Comparison Experiment
---------------------

For the detailed experiment scripts and output logs, please refer to this `repo`_.

History
^^^^^^^

08 Mar, 2020: updated according to the latest master branch (`1b97eaf <https://github.com/dmlc/xgboost/commit/1b97eaf7a74315bfa2c132d59f937a35408bcfd1>`__ for XGBoost, `bcad692 <https://github.com/microsoft/LightGBM/commit/bcad692e263e0317cab11032dd017c78f9e58e5f>`__ for LightGBM). (``xgboost_exact`` was not updated because it is too slow.)

27 Feb, 2017: first version.

Data
^^^^

We used 5 datasets to conduct our comparison experiments. Details of these datasets are listed in the following table:

+-----------+-----------------------+---------------------------------------------------------------------------------+-------------+----------+----------------------------------------------+
| Data      | Task                  | Link                                                                            | #Train\_Set | #Feature | Comments                                     |
+===========+=======================+=================================================================================+=============+==========+==============================================+
| Higgs     | Binary classification | `link <https://archive.ics.uci.edu/dataset/280/higgs>`__                        | 10,500,000  | 28       | last 500,000 samples were used as test set   |
+-----------+-----------------------+---------------------------------------------------------------------------------+-------------+----------+----------------------------------------------+
| Yahoo LTR | Learning to rank      | `link <https://proceedings.mlr.press/v14/chapelle11a.html>`__                   | 473,134     | 700      | set1.train as train, set1.test as test       |
+-----------+-----------------------+---------------------------------------------------------------------------------+-------------+----------+----------------------------------------------+
| MS LTR    | Learning to rank      | `link <https://www.microsoft.com/en-us/research/project/mslr/>`__               | 2,270,296   | 137      | {S1,S2,S3} as train set, {S5} as test set    |
+-----------+-----------------------+---------------------------------------------------------------------------------+-------------+----------+----------------------------------------------+
| Expo      | Binary classification | `link <https://community.amstat.org/jointscsg-section/dataexpo/dataexpo2009>`__ | 11,000,000  | 700      | last 1,000,000 samples were used as test set |
+-----------+-----------------------+---------------------------------------------------------------------------------+-------------+----------+----------------------------------------------+
| Allstate  | Binary classification | `link <https://www.kaggle.com/c/ClaimPredictionChallenge>`__                    | 13,184,290  | 4228     | last 1,000,000 samples were used as test set |
+-----------+-----------------------+---------------------------------------------------------------------------------+-------------+----------+----------------------------------------------+

Environment
^^^^^^^^^^^

We ran all experiments on a single Linux server (Azure ND24s) with the following specifications:

+------------------+-----------------+---------------------+
| OS               | CPU             | Memory              |
+==================+=================+=====================+
| Ubuntu 16.04 LTS | 2 \* E5-2690 v4 | 448GB               |
+------------------+-----------------+---------------------+

Baseline
^^^^^^^^

We used `xgboost`_ as a baseline.

Both xgboost and LightGBM were built with OpenMP support.

Settings
^^^^^^^^

We set up a total of 3 settings for the experiments. The parameters of these settings are:

1. xgboost:

   .. code:: text

       eta = 0.1
       max_depth = 8
       num_round = 500
       nthread = 16
       tree_method = exact
       min_child_weight = 100

2. xgboost\_hist (using the histogram-based algorithm):

   .. code:: text

       eta = 0.1
       num_round = 500
       nthread = 16
       min_child_weight = 100
       tree_method = hist
       grow_policy = lossguide
       max_depth = 0
       max_leaves = 255

3. LightGBM:

   .. code:: text

       learning_rate = 0.1
       num_leaves = 255
       num_trees = 500
       num_threads = 16
       min_data_in_leaf = 0
       min_sum_hessian_in_leaf = 100

xgboost grows trees depth-wise and controls model complexity by ``max_depth``.
LightGBM uses a leaf-wise algorithm instead and controls model complexity by ``num_leaves``.
So we cannot compare them under exactly the same model settings. As a tradeoff, we compare xgboost with ``max_depth=8``, which gives a comparable maximum number of leaves per tree, against LightGBM with ``num_leaves=255``.

All other parameters were left at their default values.
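
For reference, here is a minimal sketch of how the LightGBM setting above maps onto the Python API. This is illustrative only: the benchmark itself used the CLI, and the file name below is hypothetical.

.. code:: python

    import lightgbm as lgb

    # Parameters mirroring the "LightGBM" setting above
    params = {
        "objective": "binary",  # e.g. for the Higgs task
        "learning_rate": 0.1,
        "num_leaves": 255,
        "num_threads": 16,
        "min_data_in_leaf": 0,
        "min_sum_hessian_in_leaf": 100,
    }

    # "higgs.train" is an illustrative path, not the benchmark's exact file
    train_data = lgb.Dataset("higgs.train")
    booster = lgb.train(params, train_data, num_boost_round=500)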

Result
^^^^^^

Speed
'''''

We measured speed using the training task only, without any test or metric output, and we did not count I/O time.
For the ranking tasks, since xgboost and LightGBM implement different ranking objective functions, we used the ``regression`` objective for the speed benchmark to keep the comparison fair.
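
As a sketch of this methodology (assuming the Python API, while the actual benchmark used the CLI scripts in the linked repo), one would construct the dataset first and then time only the training call, so that I/O is excluded:

.. code:: python

    import time

    import lightgbm as lgb

    # Load and bin the data up front so I/O is excluded from the measured
    # time; the path is illustrative
    train_data = lgb.Dataset("higgs.train")
    train_data.construct()

    start = time.perf_counter()
    lgb.train(
        {"objective": "regression", "learning_rate": 0.1,
         "num_leaves": 255, "num_threads": 16},
        train_data,
        num_boost_round=500,
    )
    print(f"training took {time.perf_counter() - start:.1f} s")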

The following table compares the time cost:

+-----------+-----------+---------------+---------------+
| Data      | xgboost   | xgboost\_hist | LightGBM      |
+===========+===========+===============+===============+
| Higgs     | 3794.34 s | 165.575 s     | **130.094 s** |
+-----------+-----------+---------------+---------------+
| Yahoo LTR | 674.322 s | 131.462 s     | **76.229 s**  |
+-----------+-----------+---------------+---------------+
| MS LTR    | 1251.27 s | 98.386 s      | **70.417 s**  |
+-----------+-----------+---------------+---------------+
| Expo      | 1607.35 s | 137.65 s      | **62.607 s**  |
+-----------+-----------+---------------+---------------+
| Allstate  | 2867.22 s | 315.256 s     | **148.231 s** |
+-----------+-----------+---------------+---------------+

LightGBM ran faster than xgboost on all experiment datasets.

Accuracy
''''''''

We computed all accuracy metrics only on the test data set.
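
As an illustration, here is a sketch of scoring a trained LightGBM model on a held-out test set with scikit-learn (file names are hypothetical; the exact evaluation scripts are in the linked repo):

.. code:: python

    import lightgbm as lgb
    import numpy as np
    from sklearn.metrics import roc_auc_score

    # Illustrative test file: label in the first column, features after it
    test = np.loadtxt("higgs.test", delimiter=",")
    y_true, X_test = test[:, 0], test[:, 1:]

    # Load a previously trained model (hypothetical file name)
    booster = lgb.Booster(model_file="lightgbm_model.txt")
    y_pred = booster.predict(X_test)
    print("AUC:", roc_auc_score(y_true, y_pred))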

+-----------+-----------------+----------+-------------------+--------------+
| Data      | Metric          | xgboost  | xgboost\_hist     | LightGBM     |
+===========+=================+==========+===================+==============+
| Higgs     | AUC             | 0.839593 | 0.845314          | **0.845724** |
+-----------+-----------------+----------+-------------------+--------------+
| Yahoo LTR | NDCG\ :sub:`1`  | 0.719748 | 0.720049          | **0.732981** |
|           +-----------------+----------+-------------------+--------------+
|           | NDCG\ :sub:`3`  | 0.717813 | 0.722573          | **0.735689** |
|           +-----------------+----------+-------------------+--------------+
|           | NDCG\ :sub:`5`  | 0.737849 | 0.740899          | **0.75352**  |
|           +-----------------+----------+-------------------+--------------+
|           | NDCG\ :sub:`10` | 0.78089  | 0.782957          | **0.793498** |
+-----------+-----------------+----------+-------------------+--------------+
| MS LTR    | NDCG\ :sub:`1`  | 0.483956 | 0.485115          | **0.517767** |
|           +-----------------+----------+-------------------+--------------+
|           | NDCG\ :sub:`3`  | 0.467951 | 0.47313           | **0.501063** |
|           +-----------------+----------+-------------------+--------------+
|           | NDCG\ :sub:`5`  | 0.472476 | 0.476375          | **0.504648** |
|           +-----------------+----------+-------------------+--------------+
|           | NDCG\ :sub:`10` | 0.492429 | 0.496553          | **0.524252** |
+-----------+-----------------+----------+-------------------+--------------+
| Expo      | AUC             | 0.756713 | 0.776224          | **0.776935** |
+-----------+-----------------+----------+-------------------+--------------+
| Allstate  | AUC             | 0.607201 | **0.609465**      | 0.609072     |
+-----------+-----------------+----------+-------------------+--------------+

Memory Consumption
''''''''''''''''''

We monitored RES (resident memory) while running the training task. In LightGBM we set ``two_round=true``, which increases data-loading time and reduces peak memory usage, but does not affect training speed or accuracy.
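
For reference, a sketch of the same option through the Python API. ``two_round`` is read at ``Dataset``-construction time and only applies when loading from a file path (the path below is illustrative):

.. code:: python

    import lightgbm as lgb

    # Read the input file in two passes: slower loading, lower peak memory
    train_data = lgb.Dataset("higgs.train", params={"two_round": True})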

+-----------+---------+---------------+--------------------+--------------------+
| Data      | xgboost | xgboost\_hist | LightGBM (col-wise)|LightGBM (row-wise) |
+===========+=========+===============+====================+====================+
| Higgs     | 4.853GB | 7.335GB       | **0.897GB**        |     1.401GB        |
+-----------+---------+---------------+--------------------+--------------------+
| Yahoo LTR | 1.907GB | 4.023GB       | **1.741GB**        |     2.161GB        |
+-----------+---------+---------------+--------------------+--------------------+
| MS LTR    | 5.469GB | 7.491GB       | **0.940GB**        |     1.296GB        |
+-----------+---------+---------------+--------------------+--------------------+
| Expo      | 1.553GB | 2.606GB       | **0.555GB**        |     0.711GB        |
+-----------+---------+---------------+--------------------+--------------------+
| Allstate  | 6.237GB | 12.090GB      | **1.116GB**        |     1.755GB        |
+-----------+---------+---------------+--------------------+--------------------+

Parallel Experiment
-------------------

History
^^^^^^^

27 Feb, 2017: first version.

Data
^^^^

We used a terabyte-scale click log dataset to conduct parallel experiments. Details are listed in the following table:

+--------+-----------------------+---------+---------------+----------+
| Data   | Task                  | Link    | #Data         | #Feature |
+========+=======================+=========+===============+==========+
| Criteo | Binary classification | `link`_ | 1,700,000,000 | 67       |
+--------+-----------------------+---------+---------------+----------+

This dataset contains 13 integer features and 26 categorical features from 24 days of click logs.
We computed the click-through rate (CTR) and count statistics for these 26 categorical features over the first ten days.
We then used the next ten days' data, after replacing the categorical features with the corresponding CTR and count, as the training data.
The processed training data have a total of 1.7 billion records and 67 features.
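
A small pandas sketch of this encoding (column names and values are illustrative; the real pipeline processed terabyte-scale logs, so this only demonstrates the transformation):

.. code:: python

    import pandas as pd

    # Illustrative frames: an earlier window (for statistics) and a later window (to encode)
    stats_df = pd.DataFrame({"cat_1": ["a", "a", "b"], "click": [1, 0, 1]})
    train_df = pd.DataFrame({"cat_1": ["a", "b", "c"]})

    for col in ["cat_1"]:  # the real data loops over all 26 categorical columns
        grp = stats_df.groupby(col)["click"]
        ctr, count = grp.mean(), grp.size()
        train_df[f"{col}_ctr"] = train_df[col].map(ctr)
        train_df[f"{col}_count"] = train_df[col].map(count)
        train_df = train_df.drop(columns=col)

    # Categories are now replaced by their CTR and count from the earlier window
    print(train_df)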

Environment
^^^^^^^^^^^

We ran our experiments on 16 Windows servers with the following specifications:

+---------------------+-----------------+---------------------+-------------------------------------------+
| OS                  | CPU             | Memory              | Network Adapter                           |
+=====================+=================+=====================+===========================================+
| Windows Server 2012 | 2 \* E5-2670 v2 | DDR3 1600MHz, 256GB | Mellanox ConnectX-3, 54Gbps, RDMA support |
+---------------------+-----------------+---------------------+-------------------------------------------+

Settings
^^^^^^^^

.. code:: text

    learning_rate = 0.1
    num_leaves = 255
    num_trees = 100
    num_thread = 16
    tree_learner = data

We used data parallelism here because this dataset is large in ``#data`` but small in ``#feature``. All other parameters were left at their default values.
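
For reference, a sketch of how one worker in such a data-parallel run could be launched through the Python API, where each machine loads its own shard of the data (machine list, port, and paths are illustrative; see LightGBM's distributed-learning documentation for the exact setup):

.. code:: python

    import lightgbm as lgb

    params = {
        "objective": "binary",
        "learning_rate": 0.1,
        "num_leaves": 255,
        "num_threads": 16,
        "tree_learner": "data",                # data parallelism: rows split across machines
        "num_machines": 16,
        "machine_list_filename": "mlist.txt",  # one "ip port" pair per line (illustrative)
        "local_listen_port": 12400,
    }

    # Each machine constructs its Dataset from its own partition of the data
    train_data = lgb.Dataset("criteo.part0")   # illustrative shard path
    booster = lgb.train(params, train_data, num_boost_round=100)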

Results
^^^^^^^

+----------+---------------+----------------------------+
| #Machine | Time per Tree | Memory Usage (per Machine) |
+==========+===============+============================+
| 1        | 627.8 s       | 176GB                      |
+----------+---------------+----------------------------+
| 2        | 311 s         | 87GB                       |
+----------+---------------+----------------------------+
| 4        | 156 s         | 43GB                       |
+----------+---------------+----------------------------+
| 8        | 80 s          | 22GB                       |
+----------+---------------+----------------------------+
| 16       | 42 s          | 11GB                       |
+----------+---------------+----------------------------+

The results show that LightGBM achieves a nearly linear speedup with distributed learning: going from 1 to 16 machines reduces the time per tree from 627.8 s to 42 s, roughly a 15x speedup.

GPU Experiments
---------------

Refer to `GPU Performance <./GPU-Performance.rst>`__.

.. _repo: https://github.com/guolinke/boosting_tree_benchmarks

.. _xgboost: https://github.com/dmlc/xgboost

.. _link: https://ailab.criteo.com/download-criteo-1tb-click-logs-dataset/