Parameters.rst 61.2 KB
Newer Older
1
..  List of parameters is auto generated by LightGBM\helpers\parameter_generator.py from LightGBM\include\LightGBM\config.h file.
2

3
4
5
.. role:: raw-html(raw)
    :format: html

6
7
8
Parameters
==========

9
This page contains descriptions of all parameters in LightGBM.
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24

**List of other helpful links**

- `Python API <./Python-API.rst>`__

- `Parameters Tuning <./Parameters-Tuning.rst>`__

**External Links**

- `Laurae++ Interactive Documentation`_

Parameters Format
-----------------

The parameters format is ``key1=value1 key2=value2 ...``.
25
Parameters can be set both in config file and command line.
26
27
28
By using command line, parameters should not have spaces before and after ``=``.
By using config files, one line can only contain one parameter. You can use ``#`` to comment.

29
If one parameter appears in both command line and config file, LightGBM will use the parameter from the command line.
30

31
32
.. start params list

33
34
35
Core Parameters
---------------

36
-  ``config`` :raw-html:`<a id="config" title="Permalink to this parameter" href="#config">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``config_file``
37
38
39

   -  path of config file

40
   -  **Note**: can be used only in CLI version
41

42
-  ``task`` :raw-html:`<a id="task" title="Permalink to this parameter" href="#task">&#x1F517;&#xFE0E;</a>`, default = ``train``, type = enum, options: ``train``, ``predict``, ``convert_model``, ``refit``, aliases: ``task_type``
43

44
   -  ``train``, for training, aliases: ``training``
45

46
   -  ``predict``, for prediction, aliases: ``prediction``, ``test``
47

48
   -  ``convert_model``, for converting model file into if-else format, see more information in `IO Parameters <#io-parameters>`__
49

50
   -  ``refit``, for refitting existing models with new data, aliases: ``refit_tree``
51

Guolin Ke's avatar
Guolin Ke committed
52
   -  **Note**: can be used only in CLI version; for language-specific packages you can use the correspondent functions
53

54
-  ``objective`` :raw-html:`<a id="objective" title="Permalink to this parameter" href="#objective">&#x1F517;&#xFE0E;</a>`, default = ``regression``, type = enum, options: ``regression``, ``regression_l1``, ``huber``, ``fair``, ``poisson``, ``quantile``, ``mape``, ``gamma``, ``tweedie``, ``binary``, ``multiclass``, ``multiclassova``, ``cross_entropy``, ``cross_entropy_lambda``, ``lambdarank``, ``rank_xendcg``, aliases: ``objective_type``, ``app``, ``application``
55

56
   -  regression application
57

Guolin Ke's avatar
Guolin Ke committed
58
      -  ``regression``, L2 loss, aliases: ``regression_l2``, ``l2``, ``mean_squared_error``, ``mse``, ``l2_root``, ``root_mean_squared_error``, ``rmse``
59

Guolin Ke's avatar
Guolin Ke committed
60
      -  ``regression_l1``, L1 loss, aliases: ``l1``, ``mean_absolute_error``, ``mae``
61

62
      -  ``huber``, `Huber loss <https://en.wikipedia.org/wiki/Huber_loss>`__
63

64
      -  ``fair``, `Fair loss <https://www.kaggle.com/c/allstate-claims-severity/discussion/24520>`__
65

66
      -  ``poisson``, `Poisson regression <https://en.wikipedia.org/wiki/Poisson_regression>`__
67

68
      -  ``quantile``, `Quantile regression <https://en.wikipedia.org/wiki/Quantile_regression>`__
69

70
      -  ``mape``, `MAPE loss <https://en.wikipedia.org/wiki/Mean_absolute_percentage_error>`__, aliases: ``mean_absolute_percentage_error``
71

72
      -  ``gamma``, Gamma regression with log-link. It might be useful, e.g., for modeling insurance claims severity, or for any target that might be `gamma-distributed <https://en.wikipedia.org/wiki/Gamma_distribution#Occurrence_and_applications>`__
Guolin Ke's avatar
Guolin Ke committed
73

74
      -  ``tweedie``, Tweedie regression with log-link. It might be useful, e.g., for modeling total loss in insurance, or for any target that might be `tweedie-distributed <https://en.wikipedia.org/wiki/Tweedie_distribution#Occurrence_and_applications>`__
Guolin Ke's avatar
Guolin Ke committed
75

76
   -  ``binary``, binary `log loss <https://en.wikipedia.org/wiki/Cross_entropy>`__ classification (or logistic regression). Requires labels in {0, 1}; see ``cross-entropy`` application for general probability labels in [0, 1]
77
78
79

   -  multi-class classification application

80
      -  ``multiclass``, `softmax <https://en.wikipedia.org/wiki/Softmax_function>`__ objective function, aliases: ``softmax``
81

82
      -  ``multiclassova``, `One-vs-All <https://en.wikipedia.org/wiki/Multiclass_classification#One-vs.-rest>`__ binary objective function, aliases: ``multiclass_ova``, ``ova``, ``ovr``
Nikita Titov's avatar
Nikita Titov committed
83
84

      -  ``num_class`` should be set as well
85
86
87

   -  cross-entropy application

Guolin Ke's avatar
Guolin Ke committed
88
      -  ``cross_entropy``, objective function for cross-entropy (with optional linear weights), aliases: ``xentropy``
89

Guolin Ke's avatar
Guolin Ke committed
90
      -  ``cross_entropy_lambda``, alternative parameterization of cross-entropy, aliases: ``xentlambda``
91

92
      -  label is anything in interval [0, 1]
93

94
   -  ranking application
95

96
      -  ``lambdarank``, `lambdarank <https://papers.nips.cc/paper/2971-learning-to-rank-with-nonsmooth-cost-functions.pdf>`__ objective. `label_gain <#objective-parameters>`__ can be used to set the gain (weight) of ``int`` label and all values in ``label`` must be smaller than number of elements in ``label_gain``
97

98
      -  ``rank_xendcg``, `XE_NDCG_MART <https://arxiv.org/abs/1911.09798>`__ ranking objective function. To obtain reproducible results, you should disable parallelism by setting ``num_threads`` to 1, aliases: ``xendcg``, ``xe_ndcg``, ``xe_ndcg_mart``, ``xendcg_mart``
99

100
      -  label should be ``int`` type, and larger number represents the higher relevance (e.g. 0:bad, 1:fair, 2:good, 3:perfect)
101

102
-  ``boosting`` :raw-html:`<a id="boosting" title="Permalink to this parameter" href="#boosting">&#x1F517;&#xFE0E;</a>`, default = ``gbdt``, type = enum, options: ``gbdt``, ``rf``, ``dart``, ``goss``, aliases: ``boosting_type``, ``boost``
103

104
   -  ``gbdt``, traditional Gradient Boosting Decision Tree, aliases: ``gbrt``
105

106
   -  ``rf``, Random Forest, aliases: ``random_forest``
107

108
   -  ``dart``, `Dropouts meet Multiple Additive Regression Trees <https://arxiv.org/abs/1505.01866>`__
109
110
111

   -  ``goss``, Gradient-based One-Side Sampling

112
-  ``data`` :raw-html:`<a id="data" title="Permalink to this parameter" href="#data">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``train``, ``train_data``, ``train_data_file``, ``data_filename``
113

114
   -  path of training data, LightGBM will train from this data
115

116
117
   -  **Note**: can be used only in CLI version

118
-  ``valid`` :raw-html:`<a id="valid" title="Permalink to this parameter" href="#valid">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``test``, ``valid_data``, ``valid_data_file``, ``test_data``, ``test_data_file``, ``valid_filenames``
119

120
   -  path(s) of validation/test data, LightGBM will output metrics for these data
121

122
   -  support multiple validation data, separated by ``,``
123

124
125
   -  **Note**: can be used only in CLI version

126
-  ``num_iterations`` :raw-html:`<a id="num_iterations" title="Permalink to this parameter" href="#num_iterations">&#x1F517;&#xFE0E;</a>`, default = ``100``, type = int, aliases: ``num_iteration``, ``n_iter``, ``num_tree``, ``num_trees``, ``num_round``, ``num_rounds``, ``num_boost_round``, ``n_estimators``, constraints: ``num_iterations >= 0``
127
128

   -  number of boosting iterations
129

130
   -  **Note**: internally, LightGBM constructs ``num_class * num_iterations`` trees for multi-class classification problems
131

132
-  ``learning_rate`` :raw-html:`<a id="learning_rate" title="Permalink to this parameter" href="#learning_rate">&#x1F517;&#xFE0E;</a>`, default = ``0.1``, type = double, aliases: ``shrinkage_rate``, ``eta``, constraints: ``learning_rate > 0.0``
133
134
135
136
137

   -  shrinkage rate

   -  in ``dart``, it also affects on normalization weights of dropped trees

138
-  ``num_leaves`` :raw-html:`<a id="num_leaves" title="Permalink to this parameter" href="#num_leaves">&#x1F517;&#xFE0E;</a>`, default = ``31``, type = int, aliases: ``num_leaf``, ``max_leaves``, ``max_leaf``, constraints: ``1 < num_leaves <= 131072``
139

140
   -  max number of leaves in one tree
141

142
-  ``tree_learner`` :raw-html:`<a id="tree_learner" title="Permalink to this parameter" href="#tree_learner">&#x1F517;&#xFE0E;</a>`, default = ``serial``, type = enum, options: ``serial``, ``feature``, ``data``, ``voting``, aliases: ``tree``, ``tree_type``, ``tree_learner_type``
143
144
145

   -  ``serial``, single machine tree learner

146
   -  ``feature``, feature parallel tree learner, aliases: ``feature_parallel``
147

148
   -  ``data``, data parallel tree learner, aliases: ``data_parallel``
149

150
   -  ``voting``, voting parallel tree learner, aliases: ``voting_parallel``
151
152
153

   -  refer to `Parallel Learning Guide <./Parallel-Learning-Guide.rst>`__ to get more details

154
-  ``num_threads`` :raw-html:`<a id="num_threads" title="Permalink to this parameter" href="#num_threads">&#x1F517;&#xFE0E;</a>`, default = ``0``, type = int, aliases: ``num_thread``, ``nthread``, ``nthreads``, ``n_jobs``
155
156
157

   -  number of threads for LightGBM

158
   -  ``0`` means default number of threads in OpenMP
159

160
   -  for the best speed, set this to the number of **real CPU cores**, not the number of threads (most CPUs use `hyper-threading <https://en.wikipedia.org/wiki/Hyper-threading>`__ to generate 2 threads per CPU core)
161

162
   -  do not set it too large if your dataset is small (for instance, do not use 64 threads for a dataset with 10,000 rows)
163

164
   -  be aware a task manager or any similar CPU monitoring tool might report that cores not being fully utilized. **This is normal**
165

166
   -  for parallel learning, do not use all CPU cores because this will cause poor performance for the network communication
167

168
-  ``device_type`` :raw-html:`<a id="device_type" title="Permalink to this parameter" href="#device_type">&#x1F517;&#xFE0E;</a>`, default = ``cpu``, type = enum, options: ``cpu``, ``gpu``, aliases: ``device``
169
170

   -  device for the tree learning, you can use GPU to achieve the faster learning
171
172
173

   -  **Note**: it is recommended to use the smaller ``max_bin`` (e.g. 63) to get the better speed up

174
175
176
177
   -  **Note**: for the faster speed, GPU uses 32-bit float point to sum up by default, so this may affect the accuracy for some tasks. You can set ``gpu_use_dp=true`` to enable 64-bit float point, but it will slow down the training

   -  **Note**: refer to `Installation Guide <./Installation-Guide.rst#build-gpu-version>`__ to build LightGBM with GPU support

178
-  ``seed`` :raw-html:`<a id="seed" title="Permalink to this parameter" href="#seed">&#x1F517;&#xFE0E;</a>`, default = ``None``, type = int, aliases: ``random_seed``, ``random_state``
179

180
   -  this seed is used to generate other seeds, e.g. ``data_random_seed``, ``feature_fraction_seed``, etc.
181

182
183
184
   -  by default, this seed is unused in favor of default values of other seeds

   -  this seed has lower priority in comparison with other seeds, which means that it will be overridden, if you set other seeds explicitly
185
186
187
188

Learning Control Parameters
---------------------------

189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
-  ``force_col_wise`` :raw-html:`<a id="force_col_wise" title="Permalink to this parameter" href="#force_col_wise">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  set ``force_col_wise=true`` will force LightGBM to use col-wise histogram build

   -  Recommend ``force_col_wise=true`` when:

      -  the number of columns is large, or the total number of bin is large

      -  when ``num_threads`` is large, e.g. ``>20``

      -  want to use small ``feature_fraction``, e.g. ``0.5``, to speed-up

      -  want to reduce memory cost

   -  when both ``force_col_wise`` and ``force_col_wise`` are ``false``, LightGBM will firstly try them both, and uses the faster one

-  ``force_row_wise`` :raw-html:`<a id="force_row_wise" title="Permalink to this parameter" href="#force_row_wise">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  set ``force_row_wise=true`` will force LightGBM to use row-wise histogram build

   -  Recommend ``force_row_wise=true`` when:

      -  the number of data is large, and the number of total bin is relatively small

      -  want to use small ``bagging``, or ``goss``, to speed-up

      -  when ``num_threads`` is relatively small, e.g. ``<=16``

   -  set ``force_row_wise=true`` will double the memory cost for Dataset object, if your memory is not enough, you can try ``force_col_wise=true``

   -  when both ``force_col_wise`` and ``force_col_wise`` are ``false``, LightGBM will firstly try them both, and uses the faster one.

221
-  ``max_depth`` :raw-html:`<a id="max_depth" title="Permalink to this parameter" href="#max_depth">&#x1F517;&#xFE0E;</a>`, default = ``-1``, type = int
222

223
   -  limit the max depth for tree model. This is used to deal with over-fitting when ``#data`` is small. Tree still grows leaf-wise
224

225
   -  ``<= 0`` means no limit
226

227
-  ``min_data_in_leaf`` :raw-html:`<a id="min_data_in_leaf" title="Permalink to this parameter" href="#min_data_in_leaf">&#x1F517;&#xFE0E;</a>`, default = ``20``, type = int, aliases: ``min_data_per_leaf``, ``min_data``, ``min_child_samples``, constraints: ``min_data_in_leaf >= 0``
228
229
230

   -  minimal number of data in one leaf. Can be used to deal with over-fitting

231
-  ``min_sum_hessian_in_leaf`` :raw-html:`<a id="min_sum_hessian_in_leaf" title="Permalink to this parameter" href="#min_sum_hessian_in_leaf">&#x1F517;&#xFE0E;</a>`, default = ``1e-3``, type = double, aliases: ``min_sum_hessian_per_leaf``, ``min_sum_hessian``, ``min_hessian``, ``min_child_weight``, constraints: ``min_sum_hessian_in_leaf >= 0.0``
232
233
234

   -  minimal sum hessian in one leaf. Like ``min_data_in_leaf``, it can be used to deal with over-fitting

235
-  ``bagging_fraction`` :raw-html:`<a id="bagging_fraction" title="Permalink to this parameter" href="#bagging_fraction">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, aliases: ``sub_row``, ``subsample``, ``bagging``, constraints: ``0.0 < bagging_fraction <= 1.0``
236

237
   -  like ``feature_fraction``, but this will randomly select part of data without resampling
238
239
240
241
242

   -  can be used to speed up training

   -  can be used to deal with over-fitting

243
   -  **Note**: to enable bagging, ``bagging_freq`` should be set to a non zero value as well
244

Guolin Ke's avatar
Guolin Ke committed
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
-  ``pos_bagging_fraction`` :raw-html:`<a id="pos_bagging_fraction" title="Permalink to this parameter" href="#pos_bagging_fraction">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, aliases: ``pos_sub_row``, ``pos_subsample``, ``pos_bagging``, constraints: ``0.0 < pos_bagging_fraction <= 1.0``

   -  used only in ``binary`` application

   -  used for imbalanced binary classification problem, will randomly sample ``#pos_samples * pos_bagging_fraction`` positive samples in bagging

   -  should be used together with ``neg_bagging_fraction``

   -  set this to ``1.0`` to disable

   -  **Note**: to enable this, you need to set ``bagging_freq`` and ``neg_bagging_fraction`` as well

   -  **Note**: if both ``pos_bagging_fraction`` and ``neg_bagging_fraction`` are set to ``1.0``,  balanced bagging is disabled

   -  **Note**: if balanced bagging is enabled, ``bagging_fraction`` will be ignored

-  ``neg_bagging_fraction`` :raw-html:`<a id="neg_bagging_fraction" title="Permalink to this parameter" href="#neg_bagging_fraction">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, aliases: ``neg_sub_row``, ``neg_subsample``, ``neg_bagging``, constraints: ``0.0 < neg_bagging_fraction <= 1.0``

   -  used only in ``binary`` application

   -  used for imbalanced binary classification problem, will randomly sample ``#neg_samples * neg_bagging_fraction`` negative samples in bagging

   -  should be used together with ``pos_bagging_fraction``

   -  set this to ``1.0`` to disable

   -  **Note**: to enable this, you need to set ``bagging_freq`` and ``pos_bagging_fraction`` as well

   -  **Note**: if both ``pos_bagging_fraction`` and ``neg_bagging_fraction`` are set to ``1.0``,  balanced bagging is disabled

   -  **Note**: if balanced bagging is enabled, ``bagging_fraction`` will be ignored

277
-  ``bagging_freq`` :raw-html:`<a id="bagging_freq" title="Permalink to this parameter" href="#bagging_freq">&#x1F517;&#xFE0E;</a>`, default = ``0``, type = int, aliases: ``subsample_freq``
278

279
   -  frequency for bagging
280

281
282
283
284
   -  ``0`` means disable bagging; ``k`` means perform bagging at every ``k`` iteration

   -  **Note**: to enable bagging, ``bagging_fraction`` should be set to value smaller than ``1.0`` as well

285
-  ``bagging_seed`` :raw-html:`<a id="bagging_seed" title="Permalink to this parameter" href="#bagging_seed">&#x1F517;&#xFE0E;</a>`, default = ``3``, type = int, aliases: ``bagging_fraction_seed``
286
287
288

   -  random seed for bagging

289
-  ``feature_fraction`` :raw-html:`<a id="feature_fraction" title="Permalink to this parameter" href="#feature_fraction">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, aliases: ``sub_feature``, ``colsample_bytree``, constraints: ``0.0 < feature_fraction <= 1.0``
290

291
   -  LightGBM will randomly select part of features on each iteration (tree) if ``feature_fraction`` smaller than ``1.0``. For example, if you set it to ``0.8``, LightGBM will select 80% of features before training each tree
292

293
   -  can be used to speed up training
294

295
   -  can be used to deal with over-fitting
296

297
-  ``feature_fraction_bynode`` :raw-html:`<a id="feature_fraction_bynode" title="Permalink to this parameter" href="#feature_fraction_bynode">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, aliases: ``sub_feature_bynode``, ``colsample_bynode``, constraints: ``0.0 < feature_fraction_bynode <= 1.0``
298

Nikita Titov's avatar
Nikita Titov committed
299
   -  LightGBM will randomly select part of features on each tree node if ``feature_fraction_bynode`` smaller than ``1.0``. For example, if you set it to ``0.8``, LightGBM will select 80% of features at each tree node
300
301
302

   -  can be used to deal with over-fitting

303
304
305
306
   -  **Note**: unlike ``feature_fraction``, this cannot speed up training

   -  **Note**: if both ``feature_fraction`` and ``feature_fraction_bynode`` are smaller than ``1.0``, the final fraction of each node is ``feature_fraction * feature_fraction_bynode``

307
-  ``feature_fraction_seed`` :raw-html:`<a id="feature_fraction_seed" title="Permalink to this parameter" href="#feature_fraction_seed">&#x1F517;&#xFE0E;</a>`, default = ``2``, type = int
308
309

   -  random seed for ``feature_fraction``
310

311
312
313
314
315
316
317
318
319
320
321
322
-  ``extra_trees`` :raw-html:`<a id="extra_trees" title="Permalink to this parameter" href="#extra_trees">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  use extremely randomized trees

   -  if set to ``true``, when evaluating node splits LightGBM will check only one randomly-chosen threshold for each feature

   -  can be used to deal with over-fitting

-  ``extra_seed`` :raw-html:`<a id="extra_seed" title="Permalink to this parameter" href="#extra_seed">&#x1F517;&#xFE0E;</a>`, default = ``6``, type = int

   -  random seed for selecting thresholds when ``extra_trees`` is true

323
-  ``early_stopping_round`` :raw-html:`<a id="early_stopping_round" title="Permalink to this parameter" href="#early_stopping_round">&#x1F517;&#xFE0E;</a>`, default = ``0``, type = int, aliases: ``early_stopping_rounds``, ``early_stopping``, ``n_iter_no_change``
324

325
   -  will stop training if one metric of one validation data doesn't improve in last ``early_stopping_round`` rounds
326

327
   -  ``<= 0`` means disable
328

329
330
331
332
-  ``first_metric_only`` :raw-html:`<a id="first_metric_only" title="Permalink to this parameter" href="#first_metric_only">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  set this to ``true``, if you want to use only the first metric for early stopping

333
-  ``max_delta_step`` :raw-html:`<a id="max_delta_step" title="Permalink to this parameter" href="#max_delta_step">&#x1F517;&#xFE0E;</a>`, default = ``0.0``, type = double, aliases: ``max_tree_output``, ``max_leaf_output``
334

335
   -  used to limit the max output of tree leaves
336

337
   -  ``<= 0`` means no constraint
338

339
   -  the final max output of leaves is ``learning_rate * max_delta_step``
340

341
-  ``lambda_l1`` :raw-html:`<a id="lambda_l1" title="Permalink to this parameter" href="#lambda_l1">&#x1F517;&#xFE0E;</a>`, default = ``0.0``, type = double, aliases: ``reg_alpha``, constraints: ``lambda_l1 >= 0.0``
342
343
344

   -  L1 regularization

345
-  ``lambda_l2`` :raw-html:`<a id="lambda_l2" title="Permalink to this parameter" href="#lambda_l2">&#x1F517;&#xFE0E;</a>`, default = ``0.0``, type = double, aliases: ``reg_lambda``, ``lambda``, constraints: ``lambda_l2 >= 0.0``
346
347
348

   -  L2 regularization

349
-  ``min_gain_to_split`` :raw-html:`<a id="min_gain_to_split" title="Permalink to this parameter" href="#min_gain_to_split">&#x1F517;&#xFE0E;</a>`, default = ``0.0``, type = double, aliases: ``min_split_gain``, constraints: ``min_gain_to_split >= 0.0``
350

351
   -  the minimal gain to perform split
352

353
-  ``drop_rate`` :raw-html:`<a id="drop_rate" title="Permalink to this parameter" href="#drop_rate">&#x1F517;&#xFE0E;</a>`, default = ``0.1``, type = double, aliases: ``rate_drop``, constraints: ``0.0 <= drop_rate <= 1.0``
354

355
   -  used only in ``dart``
356

357
   -  dropout rate: a fraction of previous trees to drop during the dropout
358

359
-  ``max_drop`` :raw-html:`<a id="max_drop" title="Permalink to this parameter" href="#max_drop">&#x1F517;&#xFE0E;</a>`, default = ``50``, type = int
360

361
   -  used only in ``dart``
362

363
   -  max number of dropped trees during one boosting iteration
364

365
   -  ``<=0`` means no limit
366

367
-  ``skip_drop`` :raw-html:`<a id="skip_drop" title="Permalink to this parameter" href="#skip_drop">&#x1F517;&#xFE0E;</a>`, default = ``0.5``, type = double, constraints: ``0.0 <= skip_drop <= 1.0``
368

369
   -  used only in ``dart``
370

371
   -  probability of skipping the dropout procedure during a boosting iteration
372

373
-  ``xgboost_dart_mode`` :raw-html:`<a id="xgboost_dart_mode" title="Permalink to this parameter" href="#xgboost_dart_mode">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool
374

375
   -  used only in ``dart``
376

377
   -  set this to ``true``, if you want to use xgboost dart mode
378

379
-  ``uniform_drop`` :raw-html:`<a id="uniform_drop" title="Permalink to this parameter" href="#uniform_drop">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool
380

381
   -  used only in ``dart``
382

383
   -  set this to ``true``, if you want to use uniform drop
384

385
-  ``drop_seed`` :raw-html:`<a id="drop_seed" title="Permalink to this parameter" href="#drop_seed">&#x1F517;&#xFE0E;</a>`, default = ``4``, type = int
386

387
   -  used only in ``dart``
388

389
   -  random seed to choose dropping models
390

391
-  ``top_rate`` :raw-html:`<a id="top_rate" title="Permalink to this parameter" href="#top_rate">&#x1F517;&#xFE0E;</a>`, default = ``0.2``, type = double, constraints: ``0.0 <= top_rate <= 1.0``
392

393
   -  used only in ``goss``
394

395
   -  the retain ratio of large gradient data
396

397
-  ``other_rate`` :raw-html:`<a id="other_rate" title="Permalink to this parameter" href="#other_rate">&#x1F517;&#xFE0E;</a>`, default = ``0.1``, type = double, constraints: ``0.0 <= other_rate <= 1.0``
398

399
   -  used only in ``goss``
400

401
402
   -  the retain ratio of small gradient data

403
-  ``min_data_per_group`` :raw-html:`<a id="min_data_per_group" title="Permalink to this parameter" href="#min_data_per_group">&#x1F517;&#xFE0E;</a>`, default = ``100``, type = int, constraints: ``min_data_per_group > 0``
404
405

   -  minimal number of data per categorical group
406

407
-  ``max_cat_threshold`` :raw-html:`<a id="max_cat_threshold" title="Permalink to this parameter" href="#max_cat_threshold">&#x1F517;&#xFE0E;</a>`, default = ``32``, type = int, constraints: ``max_cat_threshold > 0``
408

409
   -  used for the categorical features
410

411
   -  limit the max threshold points in categorical features
412

413
-  ``cat_l2`` :raw-html:`<a id="cat_l2" title="Permalink to this parameter" href="#cat_l2">&#x1F517;&#xFE0E;</a>`, default = ``10.0``, type = double, constraints: ``cat_l2 >= 0.0``
414
415

   -  used for the categorical features
Guolin Ke's avatar
Guolin Ke committed
416

417
   -  L2 regularization in categorical split
418

419
-  ``cat_smooth`` :raw-html:`<a id="cat_smooth" title="Permalink to this parameter" href="#cat_smooth">&#x1F517;&#xFE0E;</a>`, default = ``10.0``, type = double, constraints: ``cat_smooth >= 0.0``
420
421
422
423
424

   -  used for the categorical features

   -  this can reduce the effect of noises in categorical features, especially for categories with few data

425
-  ``max_cat_to_onehot`` :raw-html:`<a id="max_cat_to_onehot" title="Permalink to this parameter" href="#max_cat_to_onehot">&#x1F517;&#xFE0E;</a>`, default = ``4``, type = int, constraints: ``max_cat_to_onehot > 0``
426

427
428
   -  when number of categories of one feature smaller than or equal to ``max_cat_to_onehot``, one-vs-other split algorithm will be used

429
-  ``top_k`` :raw-html:`<a id="top_k" title="Permalink to this parameter" href="#top_k">&#x1F517;&#xFE0E;</a>`, default = ``20``, type = int, aliases: ``topk``, constraints: ``top_k > 0``
430
431
432
433

   -  used in `Voting parallel <./Parallel-Learning-Guide.rst#choose-appropriate-parallel-algorithm>`__

   -  set this to larger value for more accurate result, but it will slow down the training speed
434

435
-  ``monotone_constraints`` :raw-html:`<a id="monotone_constraints" title="Permalink to this parameter" href="#monotone_constraints">&#x1F517;&#xFE0E;</a>`, default = ``None``, type = multi-int, aliases: ``mc``, ``monotone_constraint``
Guolin Ke's avatar
Guolin Ke committed
436

437
   -  used for constraints of monotonic features
Guolin Ke's avatar
Guolin Ke committed
438

439
   -  ``1`` means increasing, ``-1`` means decreasing, ``0`` means non-constraint
Guolin Ke's avatar
Guolin Ke committed
440

441
442
   -  you need to specify all features in order. For example, ``mc=-1,0,1`` means decreasing for 1st feature, non-constraint for 2nd feature and increasing for the 3rd feature

443
-  ``feature_contri`` :raw-html:`<a id="feature_contri" title="Permalink to this parameter" href="#feature_contri">&#x1F517;&#xFE0E;</a>`, default = ``None``, type = multi-double, aliases: ``feature_contrib``, ``fc``, ``fp``, ``feature_penalty``
Guolin Ke's avatar
Guolin Ke committed
444
445
446
447
448

   -  used to control feature's split gain, will use ``gain[i] = max(0, feature_contri[i]) * gain[i]`` to replace the split gain of i-th feature

   -  you need to specify all features in order

449
-  ``forcedsplits_filename`` :raw-html:`<a id="forcedsplits_filename" title="Permalink to this parameter" href="#forcedsplits_filename">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``fs``, ``forced_splits_filename``, ``forced_splits_file``, ``forced_splits``
450
451
452
453
454
455
456

   -  path to a ``.json`` file that specifies splits to force at the top of every decision tree before best-first learning commences

   -  ``.json`` file can be arbitrarily nested, and each split contains ``feature``, ``threshold`` fields, as well as ``left`` and ``right`` fields representing subsplits

   -  categorical splits are forced in a one-hot fashion, with ``left`` representing the split containing the feature value and ``right`` representing other values

457
458
   -  **Note**: the forced split logic will be ignored, if the split makes gain worse

459
   -  see `this file <https://github.com/microsoft/LightGBM/tree/master/examples/binary_classification/forced_splits.json>`__ as an example
Guolin Ke's avatar
Guolin Ke committed
460

461
462
463
464
465
466
467
468
-  ``forcedbins_filename`` :raw-html:`<a id="forcedbins_filename" title="Permalink to this parameter" href="#forcedbins_filename">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string

   -  path to a ``.json`` file that specifies bin upper bounds for some or all features

   -  ``.json`` file should contain an array of objects, each containing the word ``feature`` (integer feature index) and ``bin_upper_bound`` (array of thresholds for binning)

   -  see `this file <https://github.com/microsoft/LightGBM/tree/master/examples/regression/forced_bins.json>`__ as an example

Guolin Ke's avatar
Guolin Ke committed
469
470
471
472
473
474
-  ``refit_decay_rate`` :raw-html:`<a id="refit_decay_rate" title="Permalink to this parameter" href="#refit_decay_rate">&#x1F517;&#xFE0E;</a>`, default = ``0.9``, type = double, constraints: ``0.0 <= refit_decay_rate <= 1.0``

   -  decay rate of ``refit`` task, will use ``leaf_output = refit_decay_rate * old_leaf_output + (1.0 - refit_decay_rate) * new_leaf_output`` to refit trees

   -  used only in ``refit`` task in CLI version or as argument in ``refit`` function in language-specific package

475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
-  ``cegb_tradeoff`` :raw-html:`<a id="cegb_tradeoff" title="Permalink to this parameter" href="#cegb_tradeoff">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, constraints: ``cegb_tradeoff >= 0.0``

   -  cost-effective gradient boosting multiplier for all penalties

-  ``cegb_penalty_split`` :raw-html:`<a id="cegb_penalty_split" title="Permalink to this parameter" href="#cegb_penalty_split">&#x1F517;&#xFE0E;</a>`, default = ``0.0``, type = double, constraints: ``cegb_penalty_split >= 0.0``

   -  cost-effective gradient-boosting penalty for splitting a node

-  ``cegb_penalty_feature_lazy`` :raw-html:`<a id="cegb_penalty_feature_lazy" title="Permalink to this parameter" href="#cegb_penalty_feature_lazy">&#x1F517;&#xFE0E;</a>`, default = ``0,0,...,0``, type = multi-double

   -  cost-effective gradient boosting penalty for using a feature

   -  applied per data point

-  ``cegb_penalty_feature_coupled`` :raw-html:`<a id="cegb_penalty_feature_coupled" title="Permalink to this parameter" href="#cegb_penalty_feature_coupled">&#x1F517;&#xFE0E;</a>`, default = ``0,0,...,0``, type = multi-double

   -  cost-effective gradient boosting penalty for using a feature

   -  applied once per forest

495
496
497
IO Parameters
-------------

498
-  ``verbosity`` :raw-html:`<a id="verbosity" title="Permalink to this parameter" href="#verbosity">&#x1F517;&#xFE0E;</a>`, default = ``1``, type = int, aliases: ``verbose``
499
500
501

   -  controls the level of LightGBM's verbosity

502
   -  ``< 0``: Fatal, ``= 0``: Error (Warning), ``= 1``: Info, ``> 1``: Debug
503

504
-  ``max_bin`` :raw-html:`<a id="max_bin" title="Permalink to this parameter" href="#max_bin">&#x1F517;&#xFE0E;</a>`, default = ``255``, type = int, constraints: ``max_bin > 1``
505
506
507
508
509
510
511

   -  max number of bins that feature values will be bucketed in

   -  small number of bins may reduce training accuracy but may increase general power (deal with over-fitting)

   -  LightGBM will auto compress memory according to ``max_bin``. For example, LightGBM will use ``uint8_t`` for feature value if ``max_bin=255``

Guolin Ke's avatar
Guolin Ke committed
512
513
514
515
-  ``is_enable_sparse`` :raw-html:`<a id="is_enable_sparse" title="Permalink to this parameter" href="#is_enable_sparse">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool, aliases: ``is_sparse``, ``enable_sparse``, ``sparse``

   -  used to enable/disable sparse optimization

Belinda Trotta's avatar
Belinda Trotta committed
516
517
518
519
520
521
-  ``max_bin_by_feature`` :raw-html:`<a id="max_bin_by_feature" title="Permalink to this parameter" href="#max_bin_by_feature">&#x1F517;&#xFE0E;</a>`, default = ``None``, type = multi-int

   -  max number of bins for each feature

   -  if not specified, will use ``max_bin`` for all features

522
-  ``min_data_in_bin`` :raw-html:`<a id="min_data_in_bin" title="Permalink to this parameter" href="#min_data_in_bin">&#x1F517;&#xFE0E;</a>`, default = ``3``, type = int, constraints: ``min_data_in_bin > 0``
523
524
525
526

   -  minimal number of data inside one bin

   -  use this to avoid one-data-one-bin (potential over-fitting)
527

528
-  ``bin_construct_sample_cnt`` :raw-html:`<a id="bin_construct_sample_cnt" title="Permalink to this parameter" href="#bin_construct_sample_cnt">&#x1F517;&#xFE0E;</a>`, default = ``200000``, type = int, aliases: ``subsample_for_bin``, constraints: ``bin_construct_sample_cnt > 0``
529

530
531
532
533
534
535
   -  number of data that sampled to construct histogram bins

   -  setting this to larger value will give better training result, but will increase data loading time

   -  set this to larger value if data is very sparse

536
-  ``histogram_pool_size`` :raw-html:`<a id="histogram_pool_size" title="Permalink to this parameter" href="#histogram_pool_size">&#x1F517;&#xFE0E;</a>`, default = ``-1.0``, type = double, aliases: ``hist_pool_size``
537

538
   -  max cache size in MB for historical histogram
539

540
541
   -  ``< 0`` means no limit

542
-  ``data_random_seed`` :raw-html:`<a id="data_random_seed" title="Permalink to this parameter" href="#data_random_seed">&#x1F517;&#xFE0E;</a>`, default = ``1``, type = int, aliases: ``data_seed``
543
544

   -  random seed for data partition in parallel learning (excluding the ``feature_parallel`` mode)
545

546
-  ``output_model`` :raw-html:`<a id="output_model" title="Permalink to this parameter" href="#output_model">&#x1F517;&#xFE0E;</a>`, default = ``LightGBM_model.txt``, type = string, aliases: ``model_output``, ``model_out``
547

548
   -  filename of output model in training
549

550
551
   -  **Note**: can be used only in CLI version

552
-  ``snapshot_freq`` :raw-html:`<a id="snapshot_freq" title="Permalink to this parameter" href="#snapshot_freq">&#x1F517;&#xFE0E;</a>`, default = ``-1``, type = int, aliases: ``save_period``
553

554
   -  frequency of saving model file snapshot
555

556
   -  set this to positive value to enable this function. For example, the model file will be snapshotted at each iteration if ``snapshot_freq=1``
557

558
559
   -  **Note**: can be used only in CLI version

560
-  ``input_model`` :raw-html:`<a id="input_model" title="Permalink to this parameter" href="#input_model">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``model_input``, ``model_in``
561

562
563
564
   -  filename of input model

   -  for ``prediction`` task, this model will be applied to prediction data
565
566
567

   -  for ``train`` task, training will be continued from this model

568
569
   -  **Note**: can be used only in CLI version

570
-  ``output_result`` :raw-html:`<a id="output_result" title="Permalink to this parameter" href="#output_result">&#x1F517;&#xFE0E;</a>`, default = ``LightGBM_predict_result.txt``, type = string, aliases: ``predict_result``, ``prediction_result``, ``predict_name``, ``prediction_name``, ``pred_name``, ``name_pred``
571

572
   -  filename of prediction result in ``prediction`` task
573

574
575
   -  **Note**: can be used only in CLI version

576
-  ``initscore_filename`` :raw-html:`<a id="initscore_filename" title="Permalink to this parameter" href="#initscore_filename">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``init_score_filename``, ``init_score_file``, ``init_score``, ``input_init_score``
577

578
   -  path of file with training initial scores
579
580
581

   -  if ``""``, will use ``train_data_file`` + ``.init`` (if exists)

582
   -  **Note**: works only in case of loading data directly from file
583

584
-  ``valid_data_initscores`` :raw-html:`<a id="valid_data_initscores" title="Permalink to this parameter" href="#valid_data_initscores">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``valid_data_init_scores``, ``valid_init_score_file``, ``valid_init_score``
585

586
   -  path(s) of file(s) with validation initial scores
587
588
589
590
591

   -  if ``""``, will use ``valid_data_file`` + ``.init`` (if exists)

   -  separate by ``,`` for multi-validation data

592
   -  **Note**: works only in case of loading data directly from file
593

594
-  ``pre_partition`` :raw-html:`<a id="pre_partition" title="Permalink to this parameter" href="#pre_partition">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``is_pre_partition``
595
596

   -  used for parallel learning (excluding the ``feature_parallel`` mode)
597
598
599

   -  ``true`` if training data are pre-partitioned, and different machines use different partitions

600
-  ``enable_bundle`` :raw-html:`<a id="enable_bundle" title="Permalink to this parameter" href="#enable_bundle">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool, aliases: ``is_enable_bundle``, ``bundle``
601
602
603
604
605

   -  set this to ``false`` to disable Exclusive Feature Bundling (EFB), which is described in `LightGBM: A Highly Efficient Gradient Boosting Decision Tree <https://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision-tree>`__

   -  **Note**: disabling this may cause the slow training speed for sparse datasets

606
-  ``use_missing`` :raw-html:`<a id="use_missing" title="Permalink to this parameter" href="#use_missing">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool
607
608

   -  set this to ``false`` to disable the special handle of missing value
609

610
-  ``zero_as_missing`` :raw-html:`<a id="zero_as_missing" title="Permalink to this parameter" href="#zero_as_missing">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool
611

612
   -  set this to ``true`` to treat all zero as missing values (including the unshown values in LibSVM / sparse matrices)
613

614
615
   -  set this to ``false`` to use ``na`` for representing missing values

616
-  ``two_round`` :raw-html:`<a id="two_round" title="Permalink to this parameter" href="#two_round">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``two_round_loading``, ``use_two_round_loading``
617
618
619

   -  set this to ``true`` if data file is too big to fit in memory

620
621
   -  by default, LightGBM will map data file to memory and load features from memory. This will provide faster data loading speed, but may cause run out of memory error when the data file is very big

622
623
   -  **Note**: works only in case of loading data directly from file

624
-  ``save_binary`` :raw-html:`<a id="save_binary" title="Permalink to this parameter" href="#save_binary">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``is_save_binary``, ``is_save_binary_file``
625
626

   -  if ``true``, LightGBM will save the dataset (including validation data) to a binary file. This speed ups the data loading for the next time
627

628
629
   -  **Note**: ``init_score`` is not saved in binary file

630
   -  **Note**: can be used only in CLI version; for language-specific packages you can use the correspondent function
631

632
-  ``header`` :raw-html:`<a id="header" title="Permalink to this parameter" href="#header">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``has_header``
633
634
635

   -  set this to ``true`` if input data has header

636
637
   -  **Note**: works only in case of loading data directly from file

638
-  ``label_column`` :raw-html:`<a id="label_column" title="Permalink to this parameter" href="#label_column">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = int or string, aliases: ``label``
639

640
   -  used to specify the label column
641
642
643
644
645

   -  use number for index, e.g. ``label=0`` means column\_0 is the label

   -  add a prefix ``name:`` for column name, e.g. ``label=name:is_click``

646
647
   -  **Note**: works only in case of loading data directly from file

648
-  ``weight_column`` :raw-html:`<a id="weight_column" title="Permalink to this parameter" href="#weight_column">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = int or string, aliases: ``weight``
649

650
   -  used to specify the weight column
651
652
653
654
655

   -  use number for index, e.g. ``weight=0`` means column\_0 is the weight

   -  add a prefix ``name:`` for column name, e.g. ``weight=name:weight``

656
657
   -  **Note**: works only in case of loading data directly from file

658
   -  **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``, e.g. when label is column\_0, and weight is column\_1, the correct parameter is ``weight=0``
659

660
-  ``group_column`` :raw-html:`<a id="group_column" title="Permalink to this parameter" href="#group_column">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = int or string, aliases: ``group``, ``group_id``, ``query_column``, ``query``, ``query_id``
661

662
   -  used to specify the query/group id column
663
664
665
666
667

   -  use number for index, e.g. ``query=0`` means column\_0 is the query id

   -  add a prefix ``name:`` for column name, e.g. ``query=name:query_id``

668
669
   -  **Note**: works only in case of loading data directly from file

670
   -  **Note**: data should be grouped by query\_id
671

672
   -  **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``, e.g. when label is column\_0 and query\_id is column\_1, the correct parameter is ``query=0``
673

674
-  ``ignore_column`` :raw-html:`<a id="ignore_column" title="Permalink to this parameter" href="#ignore_column">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = multi-int or string, aliases: ``ignore_feature``, ``blacklist``
675
676

   -  used to specify some ignoring columns in training
677
678
679
680
681

   -  use number for index, e.g. ``ignore_column=0,1,2`` means column\_0, column\_1 and column\_2 will be ignored

   -  add a prefix ``name:`` for column name, e.g. ``ignore_column=name:c1,c2,c3`` means c1, c2 and c3 will be ignored

682
   -  **Note**: works only in case of loading data directly from file
683

684
   -  **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``
685

686
687
   -  **Note**: despite the fact that specified columns will be completely ignored during the training, they still should have a valid format allowing LightGBM to load file successfully

688
-  ``categorical_feature`` :raw-html:`<a id="categorical_feature" title="Permalink to this parameter" href="#categorical_feature">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = multi-int or string, aliases: ``cat_feature``, ``categorical_column``, ``cat_column``
689

690
   -  used to specify categorical features
691
692
693
694
695

   -  use number for index, e.g. ``categorical_feature=0,1,2`` means column\_0, column\_1 and column\_2 are categorical features

   -  add a prefix ``name:`` for column name, e.g. ``categorical_feature=name:c1,c2,c3`` means c1, c2 and c3 are categorical features

696
   -  **Note**: only supports categorical with ``int`` type (not applicable for data represented as pandas DataFrame in Python-package)
697
698

   -  **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``
699

700
701
   -  **Note**: all values should be less than ``Int32.MaxValue`` (2147483647)

702
   -  **Note**: using large values could be memory consuming. Tree decision rule works best when categorical features are presented by consecutive integers starting from zero
703

704
   -  **Note**: all negative values will be treated as **missing values**
705

706
707
   -  **Note**: the output cannot be monotonically constrained with respect to a categorical feature

708
-  ``predict_raw_score`` :raw-html:`<a id="predict_raw_score" title="Permalink to this parameter" href="#predict_raw_score">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``is_predict_raw_score``, ``predict_rawscore``, ``raw_score``
709

710
   -  used only in ``prediction`` task
711

712
   -  set this to ``true`` to predict only the raw scores
713

714
   -  set this to ``false`` to predict transformed scores
715

716
-  ``predict_leaf_index`` :raw-html:`<a id="predict_leaf_index" title="Permalink to this parameter" href="#predict_leaf_index">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``is_predict_leaf_index``, ``leaf_index``
717

718
   -  used only in ``prediction`` task
719

720
   -  set this to ``true`` to predict with leaf index of all trees
721

722
-  ``predict_contrib`` :raw-html:`<a id="predict_contrib" title="Permalink to this parameter" href="#predict_contrib">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``is_predict_contrib``, ``contrib``
723

724
   -  used only in ``prediction`` task
725

726
   -  set this to ``true`` to estimate `SHAP values <https://arxiv.org/abs/1706.06060>`__, which represent how each feature contributes to each prediction
727

728
   -  produces ``#features + 1`` values where the last value is the expected value of the model output over the training data
729

730
731
   -  **Note**: if you want to get more explanation for your model's predictions using SHAP values like SHAP interaction values, you can install `shap package <https://github.com/slundberg/shap>`__

Nikita Titov's avatar
Nikita Titov committed
732
   -  **Note**: unlike the shap package, with ``predict_contrib`` we return a matrix with an extra column, where the last column is the expected value
733

734
-  ``num_iteration_predict`` :raw-html:`<a id="num_iteration_predict" title="Permalink to this parameter" href="#num_iteration_predict">&#x1F517;&#xFE0E;</a>`, default = ``-1``, type = int
735

736
   -  used only in ``prediction`` task
737

738
   -  used to specify how many trained iterations will be used in prediction
739

740
   -  ``<= 0`` means no limit
741

742
-  ``pred_early_stop`` :raw-html:`<a id="pred_early_stop" title="Permalink to this parameter" href="#pred_early_stop">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool
743

744
   -  used only in ``prediction`` task
745

746
   -  if ``true``, will use early-stopping to speed up the prediction. May affect the accuracy
747

748
-  ``pred_early_stop_freq`` :raw-html:`<a id="pred_early_stop_freq" title="Permalink to this parameter" href="#pred_early_stop_freq">&#x1F517;&#xFE0E;</a>`, default = ``10``, type = int
749

750
   -  used only in ``prediction`` task
751
752
753

   -  the frequency of checking early-stopping prediction

754
-  ``pred_early_stop_margin`` :raw-html:`<a id="pred_early_stop_margin" title="Permalink to this parameter" href="#pred_early_stop_margin">&#x1F517;&#xFE0E;</a>`, default = ``10.0``, type = double
755
756

   -  used only in ``prediction`` task
757
758
759

   -  the threshold of margin in early-stopping prediction

760
761
762
763
764
765
766
767
768
769
770
771
-  ``predict_disable_shape_check`` :raw-html:`<a id="predict_disable_shape_check" title="Permalink to this parameter" href="#predict_disable_shape_check">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  used only in ``prediction`` task

   -  control whether or not LightGBM raises an error when you try to predict on data with a different number of features than the training data

   -  if ``false`` (the default), a fatal error will be raised if the number of features in the dataset you predict on differs from the number seen during training

   -  if ``true``, LightGBM will attempt to predict on whatever data you provide. This is dangerous because you might get incorrect predictions, but you could use it in situations where it is difficult or expensive to generate some features and you are very confident that they were never chosen for splits in the model

   -  **Note**: be very careful setting this parameter to ``true``

772
-  ``convert_model_language`` :raw-html:`<a id="convert_model_language" title="Permalink to this parameter" href="#convert_model_language">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string
773

774
   -  used only in ``convert_model`` task
775

776
   -  only ``cpp`` is supported yet; for conversion model to other languages consider using `m2cgen <https://github.com/BayesWitnesses/m2cgen>`__ utility
777

778
   -  if ``convert_model_language`` is set and ``task=train``, the model will be also converted
779

780
781
   -  **Note**: can be used only in CLI version

782
-  ``convert_model`` :raw-html:`<a id="convert_model" title="Permalink to this parameter" href="#convert_model">&#x1F517;&#xFE0E;</a>`, default = ``gbdt_prediction.cpp``, type = string, aliases: ``convert_model_file``
783

784
   -  used only in ``convert_model`` task
785

786
   -  output filename of converted model
787

788
789
   -  **Note**: can be used only in CLI version

790
791
Objective Parameters
--------------------
792

793
-  ``num_class`` :raw-html:`<a id="num_class" title="Permalink to this parameter" href="#num_class">&#x1F517;&#xFE0E;</a>`, default = ``1``, type = int, aliases: ``num_classes``, constraints: ``num_class > 0``
794

795
   -  used only in ``multi-class`` classification application
796

797
-  ``is_unbalance`` :raw-html:`<a id="is_unbalance" title="Permalink to this parameter" href="#is_unbalance">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``unbalance``, ``unbalanced_sets``
798

799
   -  used only in ``binary`` and ``multiclassova`` applications
800

801
   -  set this to ``true`` if training data are unbalanced
802

803
804
   -  **Note**: while enabling this should increase the overall performance metric of your model, it will also result in poor estimates of the individual class probabilities

805
   -  **Note**: this parameter cannot be used at the same time with ``scale_pos_weight``, choose only **one** of them
806

807
-  ``scale_pos_weight`` :raw-html:`<a id="scale_pos_weight" title="Permalink to this parameter" href="#scale_pos_weight">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, constraints: ``scale_pos_weight > 0.0``
808

809
   -  used only in ``binary`` and ``multiclassova`` applications
810

811
   -  weight of labels with positive class
812

813
814
   -  **Note**: while enabling this should increase the overall performance metric of your model, it will also result in poor estimates of the individual class probabilities

815
   -  **Note**: this parameter cannot be used at the same time with ``is_unbalance``, choose only **one** of them
816

817
-  ``sigmoid`` :raw-html:`<a id="sigmoid" title="Permalink to this parameter" href="#sigmoid">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, constraints: ``sigmoid > 0.0``
818

819
   -  used only in ``binary`` and ``multiclassova`` classification and in ``lambdarank`` applications
820

821
   -  parameter for the sigmoid function
822

823
-  ``boost_from_average`` :raw-html:`<a id="boost_from_average" title="Permalink to this parameter" href="#boost_from_average">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool
824

825
   -  used only in ``regression``, ``binary``, ``multiclassova`` and ``cross-entropy`` applications
826

827
   -  adjusts initial score to the mean of labels for faster convergence
828

829
-  ``reg_sqrt`` :raw-html:`<a id="reg_sqrt" title="Permalink to this parameter" href="#reg_sqrt">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool
830

831
   -  used only in ``regression`` application
832

833
   -  used to fit ``sqrt(label)`` instead of original values and prediction result will be also automatically converted to ``prediction^2``
834

835
   -  might be useful in case of large-range labels
836

837
-  ``alpha`` :raw-html:`<a id="alpha" title="Permalink to this parameter" href="#alpha">&#x1F517;&#xFE0E;</a>`, default = ``0.9``, type = double, constraints: ``alpha > 0.0``
838

839
   -  used only in ``huber`` and ``quantile`` ``regression`` applications
840

841
   -  parameter for `Huber loss <https://en.wikipedia.org/wiki/Huber_loss>`__ and `Quantile regression <https://en.wikipedia.org/wiki/Quantile_regression>`__
842

843
-  ``fair_c`` :raw-html:`<a id="fair_c" title="Permalink to this parameter" href="#fair_c">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, constraints: ``fair_c > 0.0``
844

845
   -  used only in ``fair`` ``regression`` application
846

847
   -  parameter for `Fair loss <https://www.kaggle.com/c/allstate-claims-severity/discussion/24520>`__
848

849
-  ``poisson_max_delta_step`` :raw-html:`<a id="poisson_max_delta_step" title="Permalink to this parameter" href="#poisson_max_delta_step">&#x1F517;&#xFE0E;</a>`, default = ``0.7``, type = double, constraints: ``poisson_max_delta_step > 0.0``
850

851
   -  used only in ``poisson`` ``regression`` application
852

853
854
   -  parameter for `Poisson regression <https://en.wikipedia.org/wiki/Poisson_regression>`__ to safeguard optimization

855
-  ``tweedie_variance_power`` :raw-html:`<a id="tweedie_variance_power" title="Permalink to this parameter" href="#tweedie_variance_power">&#x1F517;&#xFE0E;</a>`, default = ``1.5``, type = double, constraints: ``1.0 <= tweedie_variance_power < 2.0``
856
857
858
859
860
861

   -  used only in ``tweedie`` ``regression`` application

   -  used to control the variance of the tweedie distribution

   -  set this closer to ``2`` to shift towards a **Gamma** distribution
862

863
   -  set this closer to ``1`` to shift towards a **Poisson** distribution
864

865
-  ``max_position`` :raw-html:`<a id="max_position" title="Permalink to this parameter" href="#max_position">&#x1F517;&#xFE0E;</a>`, default = ``20``, type = int, constraints: ``max_position > 0``
866

867
   -  used only in ``lambdarank`` application
868

869
   -  optimizes `NDCG <https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG>`__ at this position
870

871
872
873
874
875
876
877
878
-  ``lambdamart_norm`` :raw-html:`<a id="lambdamart_norm" title="Permalink to this parameter" href="#lambdamart_norm">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool

   -  used only in ``lambdarank`` application

   -  set this to ``true`` to normalize the lambdas for different queries, and improve the performance for unbalanced data

   -  set this to ``false`` to enforce the original lambdamart algorithm

879
-  ``label_gain`` :raw-html:`<a id="label_gain" title="Permalink to this parameter" href="#label_gain">&#x1F517;&#xFE0E;</a>`, default = ``0,1,3,7,15,31,63,...,2^30-1``, type = multi-double
880

881
   -  used only in ``lambdarank`` application
Nikita Titov's avatar
Nikita Titov committed
882

883
   -  relevant gain for labels. For example, the gain of label ``2`` is ``3`` in case of default label gains
Nikita Titov's avatar
Nikita Titov committed
884

885
   -  separate by ``,``
Guolin Ke's avatar
Guolin Ke committed
886

887
888
889
890
-  ``objective_seed`` :raw-html:`<a id="objective_seed" title="Permalink to this parameter" href="#objective_seed">&#x1F517;&#xFE0E;</a>`, default = ``5``, type = int

   -  used only in the ``rank_xendcg`` objective

891
892
   -  random seed for objectives

893
894
895
Metric Parameters
-----------------

896
-  ``metric`` :raw-html:`<a id="metric" title="Permalink to this parameter" href="#metric">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = multi-enum, aliases: ``metrics``, ``metric_types``
897

898
   -  metric(s) to be evaluated on the evaluation set(s)
899

900
      -  ``""`` (empty string or not specified) means that metric corresponding to specified ``objective`` will be used (this is possible only for pre-defined objective functions, otherwise no evaluation metric will be added)
901

902
      -  ``"None"`` (string, **not** a ``None`` value) means that no metric will be registered, aliases: ``na``, ``null``, ``custom``
903
904
905
906
907

      -  ``l1``, absolute loss, aliases: ``mean_absolute_error``, ``mae``, ``regression_l1``

      -  ``l2``, square loss, aliases: ``mean_squared_error``, ``mse``, ``regression_l2``, ``regression``

908
      -  ``rmse``, root square loss, aliases: ``root_mean_squared_error``, ``l2_root``
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925

      -  ``quantile``, `Quantile regression <https://en.wikipedia.org/wiki/Quantile_regression>`__

      -  ``mape``, `MAPE loss <https://en.wikipedia.org/wiki/Mean_absolute_percentage_error>`__, aliases: ``mean_absolute_percentage_error``

      -  ``huber``, `Huber loss <https://en.wikipedia.org/wiki/Huber_loss>`__

      -  ``fair``, `Fair loss <https://www.kaggle.com/c/allstate-claims-severity/discussion/24520>`__

      -  ``poisson``, negative log-likelihood for `Poisson regression <https://en.wikipedia.org/wiki/Poisson_regression>`__

      -  ``gamma``, negative log-likelihood for **Gamma** regression

      -  ``gamma_deviance``, residual deviance for **Gamma** regression

      -  ``tweedie``, negative log-likelihood for **Tweedie** regression

926
      -  ``ndcg``, `NDCG <https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG>`__, aliases: ``lambdarank``, ``rank_xendcg``, ``xendcg``, ``xe_ndcg``, ``xe_ndcg_mart``, ``xendcg_mart``
927
928
929
930
931
932
933

      -  ``map``, `MAP <https://makarandtapaswi.wordpress.com/2012/07/02/intuition-behind-average-precision-and-map/>`__, aliases: ``mean_average_precision``

      -  ``auc``, `AUC <https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_the_curve>`__

      -  ``binary_logloss``, `log loss <https://en.wikipedia.org/wiki/Cross_entropy>`__, aliases: ``binary``

Misha Lisovyi's avatar
Misha Lisovyi committed
934
      -  ``binary_error``, for one sample: ``0`` for correct classification, ``1`` for error classification
935

Belinda Trotta's avatar
Belinda Trotta committed
936
937
      -  ``auc_mu``, `AUC-mu <http://proceedings.mlr.press/v97/kleiman19a/kleiman19a.pdf>`__

938
939
940
941
      -  ``multi_logloss``, log loss for multi-class classification, aliases: ``multiclass``, ``softmax``, ``multiclassova``, ``multiclass_ova``, ``ova``, ``ovr``

      -  ``multi_error``, error rate for multi-class classification

Guolin Ke's avatar
Guolin Ke committed
942
      -  ``cross_entropy``, cross-entropy (with optional linear weights), aliases: ``xentropy``
943

Guolin Ke's avatar
Guolin Ke committed
944
      -  ``cross_entropy_lambda``, "intensity-weighted" cross-entropy, aliases: ``xentlambda``
945

Guolin Ke's avatar
Guolin Ke committed
946
      -  ``kullback_leibler``, `Kullback-Leibler divergence <https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence>`__, aliases: ``kldiv``
947

Misha Lisovyi's avatar
Misha Lisovyi committed
948
   -  support multiple metrics, separated by ``,``
949

950
-  ``metric_freq`` :raw-html:`<a id="metric_freq" title="Permalink to this parameter" href="#metric_freq">&#x1F517;&#xFE0E;</a>`, default = ``1``, type = int, aliases: ``output_freq``, constraints: ``metric_freq > 0``
951
952
953

   -  frequency for metric output

954
-  ``is_provide_training_metric`` :raw-html:`<a id="is_provide_training_metric" title="Permalink to this parameter" href="#is_provide_training_metric">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``training_metric``, ``is_training_metric``, ``train_metric``
955

956
   -  set this to ``true`` to output metric result over training dataset
957

958
959
   -  **Note**: can be used only in CLI version

960
-  ``eval_at`` :raw-html:`<a id="eval_at" title="Permalink to this parameter" href="#eval_at">&#x1F517;&#xFE0E;</a>`, default = ``1,2,3,4,5``, type = multi-int, aliases: ``ndcg_eval_at``, ``ndcg_at``, ``map_eval_at``, ``map_at``
961

962
963
   -  used only with ``ndcg`` and ``map`` metrics

964
   -  `NDCG <https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG>`__ and `MAP <https://makarandtapaswi.wordpress.com/2012/07/02/intuition-behind-average-precision-and-map/>`__ evaluation positions, separated by ``,``
965

Belinda Trotta's avatar
Belinda Trotta committed
966
967
968
969
970
971
972
973
974
975
976
977
-  ``multi_error_top_k`` :raw-html:`<a id="multi_error_top_k" title="Permalink to this parameter" href="#multi_error_top_k">&#x1F517;&#xFE0E;</a>`, default = ``1``, type = int, constraints: ``multi_error_top_k > 0``

   -  used only with ``multi_error`` metric

   -  threshold for top-k multi-error metric

   -  the error on each sample is ``0`` if the true class is among the top ``multi_error_top_k`` predictions, and ``1`` otherwise

      -  more precisely, the error on a sample is ``0`` if there are at least ``num_classes - multi_error_top_k`` predictions strictly less than the prediction on the true class

   -  when ``multi_error_top_k=1`` this is equivalent to the usual multi-error metric

Belinda Trotta's avatar
Belinda Trotta committed
978
979
980
981
982
983
984
985
986
987
988
989
-  ``auc_mu_weights`` :raw-html:`<a id="auc_mu_weights" title="Permalink to this parameter" href="#auc_mu_weights">&#x1F517;&#xFE0E;</a>`, default = ``None``, type = multi-double

   -  used only with ``auc_mu`` metric

   -  list representing flattened matrix (in row-major order) giving loss weights for classification errors

   -  list should have ``n * n`` elements, where ``n`` is the number of classes

   -  the matrix co-ordinate ``[i, j]`` should correspond to the ``i * n + j``-th element of the list

   -  if not specified, will use equal weights for all classes

990
991
992
Network Parameters
------------------

993
-  ``num_machines`` :raw-html:`<a id="num_machines" title="Permalink to this parameter" href="#num_machines">&#x1F517;&#xFE0E;</a>`, default = ``1``, type = int, aliases: ``num_machine``, constraints: ``num_machines > 0``
994

995
   -  the number of machines for parallel learning application
996

997
   -  this parameter is needed to be set in both **socket** and **mpi** versions
998

999
-  ``local_listen_port`` :raw-html:`<a id="local_listen_port" title="Permalink to this parameter" href="#local_listen_port">&#x1F517;&#xFE0E;</a>`, default = ``12400``, type = int, aliases: ``local_port``, ``port``, constraints: ``local_listen_port > 0``
1000
1001
1002

   -  TCP listen port for local machines

1003
   -  **Note**: don't forget to allow this port in firewall settings before training
1004

1005
-  ``time_out`` :raw-html:`<a id="time_out" title="Permalink to this parameter" href="#time_out">&#x1F517;&#xFE0E;</a>`, default = ``120``, type = int, constraints: ``time_out > 0``
1006
1007
1008

   -  socket time-out in minutes

1009
-  ``machine_list_filename`` :raw-html:`<a id="machine_list_filename" title="Permalink to this parameter" href="#machine_list_filename">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``machine_list_file``, ``machine_list``, ``mlist``
1010
1011

   -  path of file that lists machines for this parallel learning application
1012

1013
   -  each line contains one IP and one port for one machine. The format is ``ip port`` (space as a separator)
1014

1015
-  ``machines`` :raw-html:`<a id="machines" title="Permalink to this parameter" href="#machines">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``workers``, ``nodes``
1016
1017

   -  list of machines in the following format: ``ip1:port1,ip2:port2``
1018
1019
1020
1021

GPU Parameters
--------------

1022
-  ``gpu_platform_id`` :raw-html:`<a id="gpu_platform_id" title="Permalink to this parameter" href="#gpu_platform_id">&#x1F517;&#xFE0E;</a>`, default = ``-1``, type = int
1023

1024
   -  OpenCL platform ID. Usually each GPU vendor exposes one OpenCL platform
1025

1026
   -  ``-1`` means the system-wide default platform
1027

1028
1029
   -  **Note**: refer to `GPU Targets <./GPU-Targets.rst#query-opencl-devices-in-your-system>`__ for more details

1030
-  ``gpu_device_id`` :raw-html:`<a id="gpu_device_id" title="Permalink to this parameter" href="#gpu_device_id">&#x1F517;&#xFE0E;</a>`, default = ``-1``, type = int
1031
1032
1033

   -  OpenCL device ID in the specified platform. Each GPU in the selected platform has a unique device ID

1034
   -  ``-1`` means the default device in the selected platform
1035

1036
1037
   -  **Note**: refer to `GPU Targets <./GPU-Targets.rst#query-opencl-devices-in-your-system>`__ for more details

1038
-  ``gpu_use_dp`` :raw-html:`<a id="gpu_use_dp" title="Permalink to this parameter" href="#gpu_use_dp">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool
1039

1040
   -  set this to ``true`` to use double precision math on GPU (by default single precision is used)
1041

1042
1043
.. end params list

1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
Others
------

Continued Training with Input Score
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

LightGBM supports continued training with initial scores. It uses an additional file to store these initial scores, like the following:

::

    0.5
    -0.1
    0.9
    ...

It means the initial score of the first data row is ``0.5``, second is ``-0.1``, and so on.
The initial score file corresponds with data file line by line, and has per score per line.
1061

1062
And if the name of data file is ``train.txt``, the initial score file should be named as ``train.txt.init`` and in the same folder as the data file.
1063
In this case, LightGBM will auto load initial score file if it exists.
1064

1065
1066
Otherwise, you should specify the path to the custom named file with initial scores by the ``initscore_filename`` `parameter <#initscore_filename>`__.

1067
1068
1069
Weight Data
~~~~~~~~~~~

Nikita Titov's avatar
Nikita Titov committed
1070
LightGBM supports weighted training. It uses an additional file to store weight data, like the following:
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080

::

    1.0
    0.5
    0.8
    ...

It means the weight of the first data row is ``1.0``, second is ``0.5``, and so on.
The weight file corresponds with data file line by line, and has per weight per line.
1081

1082
And if the name of data file is ``train.txt``, the weight file should be named as ``train.txt.weight`` and placed in the same folder as the data file.
1083
In this case, LightGBM will load the weight file automatically if it exists.
1084

1085
Also, you can include weight column in your data file. Please refer to the ``weight_column`` `parameter <#weight_column>`__ in above.
1086
1087
1088
1089

Query Data
~~~~~~~~~~

1090
For learning to rank, it needs query information for training data.
Nikita Titov's avatar
Nikita Titov committed
1091
LightGBM uses an additional file to store query data, like the following:
1092
1093
1094
1095
1096
1097
1098
1099

::

    27
    18
    67
    ...

1100
It means first ``27`` lines samples belong to one query and next ``18`` lines belong to another, and so on.
1101
1102
1103

**Note**: data should be ordered by the query.

1104
If the name of data file is ``train.txt``, the query file should be named as ``train.txt.query`` and placed in the same folder as the data file.
1105
In this case, LightGBM will load the query file automatically if it exists.
1106

1107
Also, you can include query/group id column in your data file. Please refer to the ``group_column`` `parameter <#group_column>`__ in above.
1108
1109

.. _Laurae++ Interactive Documentation: https://sites.google.com/view/lauraepp/parameters