..  List of parameters is auto generated by LightGBM\.ci\parameter-generator.py from LightGBM\include\LightGBM\config.h file.

.. role:: raw-html(raw)
    :format: html

Parameters
==========

This page contains descriptions of all parameters in LightGBM.

**List of other helpful links**

- `Python API <./Python-API.rst>`__

- `Parameters Tuning <./Parameters-Tuning.rst>`__

Parameters Format
-----------------

Parameters are merged together in the following order (later items overwrite earlier ones):

1. LightGBM's default values
2. special files for ``weight``, ``init_score``, ``query``, and ``positions`` (see `Others <#others>`__)
3. (CLI only) configuration in a file passed like ``config=train.conf``
4. (CLI only) configuration passed via the command line
5. (Python, R) special keyword arguments to some functions (e.g. ``num_boost_round`` in ``train()``)
6. (Python, R) ``params`` function argument (including ``**kwargs`` in Python and ``...`` in R)
7. (C API) ``parameters`` or ``params`` function argument
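
The precedence order above behaves like a sequence of dictionary updates. The sketch below only illustrates the "later items overwrite earlier ones" rule; the three sources and their values are hypothetical, and this is not how LightGBM implements the merge internally.

```python
# Hypothetical parameter sources, listed from lowest to highest precedence.
defaults = {"learning_rate": 0.1, "num_leaves": 31, "num_iterations": 100}
config_file = {"num_leaves": 63, "learning_rate": 0.05}
command_line = {"learning_rate": 0.02}


def merge_params(*sources):
    """Merge parameter dicts; later sources overwrite earlier ones."""
    merged = {}
    for source in sources:
        merged.update(source)
    return merged


merged = merge_params(defaults, config_file, command_line)
print(merged)
```

Here ``learning_rate`` ends up with the command-line value, ``num_leaves`` with the config-file value, and ``num_iterations`` keeps its default.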

Many parameters have "aliases", alternative names which refer to the same configuration.

Where both the primary parameter name and one or more aliases are given, the primary parameter name always takes precedence over the aliases.

For example, in Python:

.. code-block:: python

   # use learning rate of 0.07, because 'learning_rate'
   # is the primary parameter name
   lgb.train(
      params={
         "learning_rate": 0.07,
         "shrinkage_rate": 0.12
      },
      train_set=dtrain
   )

Where multiple aliases are given, and the primary parameter name is not, the first alias
appearing in the lists returned by ``Config::parameter2aliases()`` in the C++ library is used.
Those lists are hard-coded in a fairly arbitrary order, so avoid relying on this behavior wherever possible.

For example, in Python:

.. code-block:: python

   # use learning rate of 0.12, because LightGBM has a hard-coded preference for 'shrinkage_rate'
   # over any other aliases, and 'learning_rate' is not provided
   lgb.train(
      params={
         "eta": 0.19,
         "shrinkage_rate": 0.12
      },
      train_set=dtrain
   )
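
The two alias rules above can be summarized in a few lines. This is a minimal sketch, not LightGBM's own resolution code: the helper ``resolve`` is hypothetical, and the alias order ``shrinkage_rate`` before ``eta`` is taken from the examples on this page.

```python
# Alias order is an assumption taken from the examples above:
# 'shrinkage_rate' is preferred over 'eta'.
LEARNING_RATE_ALIASES = ["shrinkage_rate", "eta"]


def resolve(params, primary, aliases, default):
    """Return the value that is expected to win for `primary`."""
    # Rule 1: the primary name always beats any alias.
    if primary in params:
        return params[primary]
    # Rule 2: otherwise the first alias in the hard-coded order wins.
    for alias in aliases:
        if alias in params:
            return params[alias]
    return default


both = {"learning_rate": 0.07, "shrinkage_rate": 0.12}
only_aliases = {"eta": 0.19, "shrinkage_rate": 0.12}
print(resolve(both, "learning_rate", LEARNING_RATE_ALIASES, 0.1))
print(resolve(only_aliases, "learning_rate", LEARNING_RATE_ALIASES, 0.1))
```

Passing only the primary name in ``params`` avoids depending on the alias order at all.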

**CLI**

The parameters format is ``key1=value1 key2=value2 ...``.
Parameters can be set both in config file and command line.
On the command line, parameters must not have spaces before or after ``=``.
In config files, each line can contain only one parameter. Use ``#`` to start a comment.
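
To make the config-file format concrete, here is a small reader for ``key=value`` lines with ``#`` comments. It is a sketch under stated assumptions: ``parse_config`` is a hypothetical helper rather than LightGBM's parser, it treats everything after ``#`` as a comment, and it strips whitespace around keys and values.

```python
def parse_config(text):
    """Parse 'key=value' lines into a dict, ignoring '#' comments."""
    params = {}
    for raw_line in text.splitlines():
        # Drop the comment part and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got: {raw_line!r}")
        params[key.strip()] = value.strip()
    return params


# Hypothetical contents of a file such as train.conf.
example = """
# training settings
task = train
objective = binary
num_leaves = 63  # one parameter per line
"""
config = parse_config(example)
print(config)
```

All values stay strings here; converting them to numbers or booleans is left out to keep the sketch short.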

**Python**

Any parameters that accept multiple values should be passed as a Python list.

.. code-block:: python

   params = {
      "monotone_constraints": [-1, 0, 1]
   }


**R**

Any parameters that accept multiple values should be passed as an R list.

.. code-block:: r

   params <- list(
      monotone_constraints = c(-1, 0, 1)
   )

.. start params list

Core Parameters
---------------

-  ``config`` :raw-html:`<a id="config" title="Permalink to this parameter" href="#config">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``config_file``

   -  path of config file

   -  **Note**: can be used only in CLI version

-  ``task`` :raw-html:`<a id="task" title="Permalink to this parameter" href="#task">&#x1F517;&#xFE0E;</a>`, default = ``train``, type = enum, options: ``train``, ``predict``, ``convert_model``, ``refit``, aliases: ``task_type``

   -  ``train``, for training, aliases: ``training``

   -  ``predict``, for prediction, aliases: ``prediction``, ``test``

   -  ``convert_model``, for converting model file into if-else format, see more information in `Convert Parameters <#convert-parameters>`__

   -  ``refit``, for refitting existing models with new data, aliases: ``refit_tree``

   -  ``save_binary``, load train (and validation) data then save dataset to binary file. Typical usage: ``save_binary`` first, then run multiple ``train`` tasks in parallel using the saved binary file

   -  **Note**: can be used only in CLI version; for language-specific packages you can use the corresponding functions

-  ``objective`` :raw-html:`<a id="objective" title="Permalink to this parameter" href="#objective">&#x1F517;&#xFE0E;</a>`, default = ``regression``, type = enum, options: ``regression``, ``regression_l1``, ``huber``, ``fair``, ``poisson``, ``quantile``, ``mape``, ``gamma``, ``tweedie``, ``binary``, ``multiclass``, ``multiclassova``, ``cross_entropy``, ``cross_entropy_lambda``, ``lambdarank``, ``rank_xendcg``, aliases: ``objective_type``, ``app``, ``application``, ``loss``

   -  regression application

      -  ``regression``, L2 loss, aliases: ``regression_l2``, ``l2``, ``mean_squared_error``, ``mse``, ``l2_root``, ``root_mean_squared_error``, ``rmse``

      -  ``regression_l1``, L1 loss, aliases: ``l1``, ``mean_absolute_error``, ``mae``

      -  ``huber``, `Huber loss <https://en.wikipedia.org/wiki/Huber_loss>`__

      -  ``fair``, `Fair loss <https://www.kaggle.com/c/allstate-claims-severity/discussion/24520>`__

      -  ``poisson``, `Poisson regression <https://en.wikipedia.org/wiki/Poisson_regression>`__

      -  ``quantile``, `Quantile regression <https://en.wikipedia.org/wiki/Quantile_regression>`__

      -  ``mape``, `MAPE loss <https://en.wikipedia.org/wiki/Mean_absolute_percentage_error>`__, aliases: ``mean_absolute_percentage_error``

      -  ``gamma``, Gamma regression with log-link. It might be useful, e.g., for modeling insurance claims severity, or for any target that might be `gamma-distributed <https://en.wikipedia.org/wiki/Gamma_distribution#Occurrence_and_applications>`__

      -  ``tweedie``, Tweedie regression with log-link. It might be useful, e.g., for modeling total loss in insurance, or for any target that might be `tweedie-distributed <https://en.wikipedia.org/wiki/Tweedie_distribution#Occurrence_and_applications>`__

   -  binary classification application

      -  ``binary``, binary `log loss <https://en.wikipedia.org/wiki/Cross_entropy>`__ classification (or logistic regression)

      -  requires labels in {0, 1}; see ``cross-entropy`` application for general probability labels in [0, 1]

   -  multi-class classification application

      -  ``multiclass``, `softmax <https://en.wikipedia.org/wiki/Softmax_function>`__ objective function, aliases: ``softmax``

      -  ``multiclassova``, `One-vs-All <https://en.wikipedia.org/wiki/Multiclass_classification#One-vs.-rest>`__ binary objective function, aliases: ``multiclass_ova``, ``ova``, ``ovr``

      -  ``num_class`` should be set as well

   -  cross-entropy application

      -  ``cross_entropy``, objective function for cross-entropy (with optional linear weights), aliases: ``xentropy``

      -  ``cross_entropy_lambda``, alternative parameterization of cross-entropy, aliases: ``xentlambda``

      -  label is anything in interval [0, 1]

   -  ranking application

      -  ``lambdarank``, `lambdarank <https://proceedings.neurips.cc/paper_files/paper/2006/file/af44c4c56f385c43f2529f9b1b018f6a-Paper.pdf>`__ objective. `label_gain <#label_gain>`__ can be used to set the gain (weight) of ``int`` label and all values in ``label`` must be smaller than number of elements in ``label_gain``

      -  ``rank_xendcg``, `XE_NDCG_MART <https://arxiv.org/abs/1911.09798>`__ ranking objective function, aliases: ``xendcg``, ``xe_ndcg``, ``xe_ndcg_mart``, ``xendcg_mart``

      -  ``rank_xendcg`` is faster than ``lambdarank`` and achieves similar performance

      -  label should be ``int`` type, and a larger number represents higher relevance (e.g. 0:bad, 1:fair, 2:good, 3:perfect)

   -  custom objective function (gradients and hessians not computed directly by LightGBM)

      -  ``custom``

      -  must be passed through parameters explicitly in the C API

      -  **Note**: cannot be used in CLI version
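
For the ranking objectives above, every integer label must index into ``label_gain``. The sketch below checks that constraint before training; ``check_rank_labels`` is a hypothetical helper, and the gains of the form ``2^i - 1`` are an assumption chosen for illustration.

```python
def check_rank_labels(labels, label_gain):
    """Validate ranking labels and return the gain assigned to each one."""
    for y in labels:
        if not isinstance(y, int) or y < 0:
            raise ValueError(f"label must be a non-negative int, got {y!r}")
        if y >= len(label_gain):
            raise ValueError(
                f"label {y} needs at least {y + 1} entries in label_gain"
            )
    return [label_gain[y] for y in labels]


# Assumed gains for relevance levels 0 (bad) to 3 (perfect).
label_gain = [2**i - 1 for i in range(4)]
gains = check_rank_labels([0, 2, 3, 1], label_gain)
print(label_gain, gains)
```

A label of ``4`` would be rejected here because ``label_gain`` has only four entries.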

-  ``boosting`` :raw-html:`<a id="boosting" title="Permalink to this parameter" href="#boosting">&#x1F517;&#xFE0E;</a>`, default = ``gbdt``, type = enum, options: ``gbdt``, ``rf``, ``dart``, aliases: ``boosting_type``, ``boost``

   -  ``gbdt``, traditional Gradient Boosting Decision Tree, aliases: ``gbrt``

   -  ``rf``, Random Forest, aliases: ``random_forest``

   -  ``dart``, `Dropouts meet Multiple Additive Regression Trees <https://arxiv.org/abs/1505.01866>`__

      -  **Note**: internally, LightGBM uses ``gbdt`` mode for the first ``1 / learning_rate`` iterations

-  ``data_sample_strategy`` :raw-html:`<a id="data_sample_strategy" title="Permalink to this parameter" href="#data_sample_strategy">&#x1F517;&#xFE0E;</a>`, default = ``bagging``, type = enum, options: ``bagging``, ``goss``

   -  ``bagging``, random bagging sampling

      -  **Note**: ``bagging`` is only effective when ``bagging_freq > 0`` and ``bagging_fraction < 1.0``

   -  ``goss``, Gradient-based One-Side Sampling

   -  *New in version 4.0.0*

-  ``data`` :raw-html:`<a id="data" title="Permalink to this parameter" href="#data">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``train``, ``train_data``, ``train_data_file``, ``data_filename``

   -  path of training data, LightGBM will train from this data

   -  **Note**: can be used only in CLI version

-  ``valid`` :raw-html:`<a id="valid" title="Permalink to this parameter" href="#valid">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``test``, ``valid_data``, ``valid_data_file``, ``test_data``, ``test_data_file``, ``valid_filenames``

   -  path(s) of validation/test data, LightGBM will output metrics for these data

   -  support multiple validation data, separated by ``,``

   -  **Note**: can be used only in CLI version

-  ``num_iterations`` :raw-html:`<a id="num_iterations" title="Permalink to this parameter" href="#num_iterations">&#x1F517;&#xFE0E;</a>`, default = ``100``, type = int, aliases: ``num_iteration``, ``n_iter``, ``num_tree``, ``num_trees``, ``num_round``, ``num_rounds``, ``nrounds``, ``num_boost_round``, ``n_estimators``, ``max_iter``, constraints: ``num_iterations >= 0``

   -  number of boosting iterations

   -  **Note**: internally, LightGBM constructs ``num_class * num_iterations`` trees for multi-class classification problems
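
The tree count in the note above is simple arithmetic. The hypothetical helper ``total_trees`` below just restates it, assuming one tree per iteration for single-output objectives.

```python
def total_trees(num_iterations, num_class=1):
    """Number of trees built: one per class per boosting iteration."""
    return num_class * num_iterations


# 100 iterations on a 3-class problem versus a single-output problem.
print(total_trees(100, num_class=3))
print(total_trees(100))
```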

-  ``learning_rate`` :raw-html:`<a id="learning_rate" title="Permalink to this parameter" href="#learning_rate">&#x1F517;&#xFE0E;</a>`, default = ``0.1``, type = double, aliases: ``shrinkage_rate``, ``eta``, constraints: ``learning_rate > 0.0``

   -  shrinkage rate

   -  in ``dart``, it also affects the normalization weights of dropped trees

-  ``num_leaves`` :raw-html:`<a id="num_leaves" title="Permalink to this parameter" href="#num_leaves">&#x1F517;&#xFE0E;</a>`, default = ``31``, type = int, aliases: ``num_leaf``, ``max_leaves``, ``max_leaf``, ``max_leaf_nodes``, constraints: ``1 < num_leaves <= 131072``

   -  max number of leaves in one tree

-  ``tree_learner`` :raw-html:`<a id="tree_learner" title="Permalink to this parameter" href="#tree_learner">&#x1F517;&#xFE0E;</a>`, default = ``serial``, type = enum, options: ``serial``, ``feature``, ``data``, ``voting``, aliases: ``tree``, ``tree_type``, ``tree_learner_type``

   -  ``serial``, single machine tree learner

   -  ``feature``, feature parallel tree learner, aliases: ``feature_parallel``

   -  ``data``, data parallel tree learner, aliases: ``data_parallel``

   -  ``voting``, voting parallel tree learner, aliases: ``voting_parallel``

   -  refer to `Distributed Learning Guide <./Parallel-Learning-Guide.rst>`__ to get more details

-  ``num_threads`` :raw-html:`<a id="num_threads" title="Permalink to this parameter" href="#num_threads">&#x1F517;&#xFE0E;</a>`, default = ``0``, type = int, aliases: ``num_thread``, ``nthread``, ``nthreads``, ``n_jobs``

   -  used only in ``train``, ``prediction`` and ``refit`` tasks or in the corresponding functions of language-specific packages

   -  number of threads for LightGBM

   -  ``0`` means default number of threads in OpenMP

   -  for the best speed, set this to the number of **real CPU cores**, not the number of threads (most CPUs use `hyper-threading <https://en.wikipedia.org/wiki/Hyper-threading>`__ to generate 2 threads per CPU core)

   -  do not set it too large if your dataset is small (for instance, do not use 64 threads for a dataset with 10,000 rows)

   -  be aware that a task manager or any similar CPU monitoring tool might report that cores are not being fully utilized. **This is normal**

   -  for distributed learning, do not use all CPU cores because this will cause poor performance for the network communication

   -  **Note**: please **don't** change this during training, especially when running multiple jobs simultaneously by external packages, otherwise it may cause undesirable errors
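
One rough way to follow the "real CPU cores" advice with only the standard library is to divide the logical CPU count by the number of hardware threads per core. This is a heuristic sketch: ``suggested_num_threads`` is a hypothetical helper, and the default of 2 threads per core is an assumption that does not hold on every machine.

```python
import os


def suggested_num_threads(threads_per_core=2):
    """Estimate the physical core count from the logical CPU count."""
    logical = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, logical // threads_per_core)


params = {"num_threads": suggested_num_threads()}
print(params)
```

If the machine does not use hyper-threading, pass ``threads_per_core=1`` so that all cores are counted.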

-  ``device_type`` :raw-html:`<a id="device_type" title="Permalink to this parameter" href="#device_type">&#x1F517;&#xFE0E;</a>`, default = ``cpu``, type = enum, options: ``cpu``, ``gpu``, ``cuda``, aliases: ``device``

   -  device for the tree learning

   -  ``cpu`` supports all LightGBM functionality and is portable across the widest range of operating systems and hardware

   -  ``cuda`` offers faster training than ``gpu`` or ``cpu``, but only works on GPUs supporting CUDA

   -  ``gpu`` can be faster than ``cpu`` and works on a wider range of GPUs than CUDA

   -  **Note**: it is recommended to use a smaller ``max_bin`` (e.g. 63) to get a better speed-up

   -  **Note**: for faster speed, GPU uses 32-bit floating point for summation by default, which may affect the accuracy for some tasks. You can set ``gpu_use_dp=true`` to enable 64-bit floating point, but it will slow down training

   -  **Note**: refer to `Installation Guide <./Installation-Guide.rst>`__ to build LightGBM with GPU or CUDA support

-  ``seed`` :raw-html:`<a id="seed" title="Permalink to this parameter" href="#seed">&#x1F517;&#xFE0E;</a>`, default = ``None``, type = int, aliases: ``random_seed``, ``random_state``

   -  this seed is used to generate other seeds, e.g. ``data_random_seed``, ``feature_fraction_seed``, etc.

   -  by default, this seed is unused in favor of default values of other seeds

   -  this seed has lower priority in comparison with other seeds, which means that it will be overridden, if you set other seeds explicitly

-  ``deterministic`` :raw-html:`<a id="deterministic" title="Permalink to this parameter" href="#deterministic">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  used only with ``cpu`` device type

   -  setting this to ``true`` should ensure stable results when using the same data and the same parameters (and different ``num_threads``)

   -  when you use different seeds, different LightGBM versions, binaries compiled by different compilers, or different systems, the results are expected to be different

   -  you can `raise issues <https://github.com/microsoft/LightGBM/issues>`__ in the LightGBM GitHub repo if you encounter unstable results

   -  **Note**: setting this to ``true`` may slow down the training

   -  **Note**: to avoid potential instability due to numerical issues, please set ``force_col_wise=true`` or ``force_row_wise=true`` when setting ``deterministic=true``
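
As an illustration of the notes on ``seed`` and ``deterministic``, a parameter dictionary aimed at reproducible CPU training might look like the following. The specific values (the objective, seed ``42``, 4 threads, row-wise histograms) are arbitrary assumptions for the example, not recommendations from the LightGBM authors.

```python
# Example settings for reproducible training on CPU.
params = {
    "objective": "binary",
    "device_type": "cpu",    # 'deterministic' is used only with the cpu device type
    "deterministic": True,
    "force_row_wise": True,  # set exactly one of force_row_wise / force_col_wise
    "seed": 42,
    "num_threads": 4,
}
print(params)
```

Such a dictionary would be passed as the ``params`` argument of the training function in the Python package.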

Learning Control Parameters
---------------------------

-  ``force_col_wise`` :raw-html:`<a id="force_col_wise" title="Permalink to this parameter" href="#force_col_wise">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  used only with ``cpu`` device type

   -  set this to ``true`` to force col-wise histogram building

   -  enabling this is recommended when:

      -  the number of columns is large, or the total number of bins is large

      -  ``num_threads`` is large, e.g. ``> 20``

      -  you want to reduce memory cost

   -  **Note**: when both ``force_col_wise`` and ``force_row_wise`` are ``false``, LightGBM will first try both and then use the faster one. To remove the overhead of testing, set the faster one to ``true`` manually

   -  **Note**: this parameter cannot be used at the same time as ``force_row_wise``; choose only one of them

-  ``force_row_wise`` :raw-html:`<a id="force_row_wise" title="Permalink to this parameter" href="#force_row_wise">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  used only with ``cpu`` device type

   -  set this to ``true`` to force row-wise histogram building

   -  enabling this is recommended when:

      -  the number of data points is large, and the total number of bins is relatively small

      -  ``num_threads`` is relatively small, e.g. ``<= 16``

      -  you want to use small ``bagging_fraction`` or ``goss`` sample strategy to speed up

   -  **Note**: setting this to ``true`` will double the memory cost for the Dataset object. If you do not have enough memory, you can try setting ``force_col_wise=true``

   -  **Note**: when both ``force_col_wise`` and ``force_row_wise`` are ``false``, LightGBM will first try both and then use the faster one. To remove the overhead of testing, set the faster one to ``true`` manually

   -  **Note**: this parameter cannot be used at the same time as ``force_col_wise``; choose only one of them
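
Because ``force_col_wise`` and ``force_row_wise`` are mutually exclusive, it can help to check a parameter dictionary before training. ``check_histogram_flags`` below is a hypothetical helper written for this page; it only mirrors the rules stated in the notes and says nothing about which mode is faster for a given dataset.

```python
def check_histogram_flags(params):
    """Return the histogram-building mode implied by `params`."""
    col = bool(params.get("force_col_wise", False))
    row = bool(params.get("force_row_wise", False))
    if col and row:
        raise ValueError("set only one of force_col_wise and force_row_wise")
    if col:
        return "col-wise"
    if row:
        return "row-wise"
    # Neither flag set: LightGBM tries both and keeps the faster one.
    return "auto"


print(check_histogram_flags({"force_col_wise": True}))
print(check_histogram_flags({}))
```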

-  ``histogram_pool_size`` :raw-html:`<a id="histogram_pool_size" title="Permalink to this parameter" href="#histogram_pool_size">&#x1F517;&#xFE0E;</a>`, default = ``-1.0``, type = double, aliases: ``hist_pool_size``

   -  max cache size in MB for historical histogram

   -  ``< 0`` means no limit

-  ``max_depth`` :raw-html:`<a id="max_depth" title="Permalink to this parameter" href="#max_depth">&#x1F517;&#xFE0E;</a>`, default = ``-1``, type = int

   -  limit the max depth for tree model. This is used to deal with over-fitting when ``#data`` is small. Tree still grows leaf-wise

   -  ``<= 0`` means no limit

-  ``min_data_in_leaf`` :raw-html:`<a id="min_data_in_leaf" title="Permalink to this parameter" href="#min_data_in_leaf">&#x1F517;&#xFE0E;</a>`, default = ``20``, type = int, aliases: ``min_data_per_leaf``, ``min_data``, ``min_child_samples``, ``min_samples_leaf``, constraints: ``min_data_in_leaf >= 0``

   -  minimal number of data in one leaf. Can be used to deal with over-fitting

   -  **Note**: this is an approximation based on the Hessian, so occasionally you may observe splits which produce leaf nodes that have fewer than this many observations

-  ``min_sum_hessian_in_leaf`` :raw-html:`<a id="min_sum_hessian_in_leaf" title="Permalink to this parameter" href="#min_sum_hessian_in_leaf">&#x1F517;&#xFE0E;</a>`, default = ``1e-3``, type = double, aliases: ``min_sum_hessian_per_leaf``, ``min_sum_hessian``, ``min_hessian``, ``min_child_weight``, constraints: ``min_sum_hessian_in_leaf >= 0.0``

   -  minimal sum hessian in one leaf. Like ``min_data_in_leaf``, it can be used to deal with over-fitting

-  ``bagging_fraction`` :raw-html:`<a id="bagging_fraction" title="Permalink to this parameter" href="#bagging_fraction">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, aliases: ``sub_row``, ``subsample``, ``bagging``, constraints: ``0.0 < bagging_fraction <= 1.0``

   -  like ``feature_fraction``, but this will randomly select part of data without resampling

   -  can be used to speed up training

   -  can be used to deal with over-fitting

   -  **Note**: to enable bagging, ``bagging_freq`` should be set to a non-zero value as well

-  ``pos_bagging_fraction`` :raw-html:`<a id="pos_bagging_fraction" title="Permalink to this parameter" href="#pos_bagging_fraction">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, aliases: ``pos_sub_row``, ``pos_subsample``, ``pos_bagging``, constraints: ``0.0 < pos_bagging_fraction <= 1.0``

   -  used only in ``binary`` application

   -  used for imbalanced binary classification problem, will randomly sample ``#pos_samples * pos_bagging_fraction`` positive samples in bagging

   -  should be used together with ``neg_bagging_fraction``

   -  set this to ``1.0`` to disable

   -  **Note**: to enable this, you need to set ``bagging_freq`` and ``neg_bagging_fraction`` as well

   -  **Note**: if both ``pos_bagging_fraction`` and ``neg_bagging_fraction`` are set to ``1.0``,  balanced bagging is disabled

   -  **Note**: if balanced bagging is enabled, ``bagging_fraction`` will be ignored

-  ``neg_bagging_fraction`` :raw-html:`<a id="neg_bagging_fraction" title="Permalink to this parameter" href="#neg_bagging_fraction">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, aliases: ``neg_sub_row``, ``neg_subsample``, ``neg_bagging``, constraints: ``0.0 < neg_bagging_fraction <= 1.0``

   -  used only in ``binary`` application

   -  used for imbalanced binary classification problem, will randomly sample ``#neg_samples * neg_bagging_fraction`` negative samples in bagging

   -  should be used together with ``pos_bagging_fraction``

   -  set this to ``1.0`` to disable

   -  **Note**: to enable this, you need to set ``bagging_freq`` and ``pos_bagging_fraction`` as well

   -  **Note**: if both ``pos_bagging_fraction`` and ``neg_bagging_fraction`` are set to ``1.0``,  balanced bagging is disabled

   -  **Note**: if balanced bagging is enabled, ``bagging_fraction`` will be ignored
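
The effect of ``pos_bagging_fraction`` and ``neg_bagging_fraction`` on class balance can be estimated from the sample counts. The sketch below computes the expected bag sizes only; ``balanced_bag_sizes`` is a hypothetical helper, rounding to the nearest integer is an assumption, and the actual sampling in LightGBM is random.

```python
def balanced_bag_sizes(n_pos, n_neg, pos_bagging_fraction=1.0,
                       neg_bagging_fraction=1.0):
    """Expected number of positive and negative samples in one bag."""
    bag_pos = round(n_pos * pos_bagging_fraction)
    bag_neg = round(n_neg * neg_bagging_fraction)
    return bag_pos, bag_neg


# An imbalanced dataset: 200 positives and 9,800 negatives.
bag_pos, bag_neg = balanced_bag_sizes(200, 9800, 1.0, 0.25)
print(bag_pos, bag_neg, bag_neg / bag_pos)
```

Keeping all positives and a quarter of the negatives changes the negative-to-positive ratio in each bag from 49:1 to about 12:1.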

-  ``bagging_freq`` :raw-html:`<a id="bagging_freq" title="Permalink to this parameter" href="#bagging_freq">&#x1F517;&#xFE0E;</a>`, default = ``0``, type = int, aliases: ``subsample_freq``

   -  frequency for bagging

   -  ``0`` means disable bagging; ``k`` means perform bagging at every ``k``-th iteration. Every ``k``-th iteration, LightGBM will randomly select ``bagging_fraction * 100%`` of the data to use for the next ``k`` iterations

   -  **Note**: bagging is only effective when ``0.0 < bagging_fraction < 1.0``
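
The interaction between ``bagging_fraction`` and ``bagging_freq`` can be pictured as a schedule of row subsets. The following sketch is an illustration only: ``bagging_schedule`` is a hypothetical helper, it uses Python's ``random`` module rather than LightGBM's sampler, and truncating the bag size to an integer is an assumption.

```python
import random


def bagging_schedule(num_data, num_iterations, bagging_fraction,
                     bagging_freq, seed=3):
    """Return the list of row indices used at each boosting iteration."""
    rng = random.Random(seed)
    all_rows = list(range(num_data))
    bag_size = max(1, int(num_data * bagging_fraction))
    # Bagging needs both a positive frequency and a fraction below 1.0.
    enabled = bagging_freq > 0 and bagging_fraction < 1.0
    current = all_rows
    schedule = []
    for iteration in range(num_iterations):
        # Re-sample every `bagging_freq` iterations, then reuse the subset.
        if enabled and iteration % bagging_freq == 0:
            current = sorted(rng.sample(all_rows, bag_size))
        schedule.append(current)
    return schedule


schedule = bagging_schedule(num_data=10, num_iterations=6,
                            bagging_fraction=0.5, bagging_freq=3)
for iteration, rows in enumerate(schedule):
    print(iteration, rows)
```

With ``bagging_freq=3``, iterations 0 to 2 share one subset of 5 rows and iterations 3 to 5 share a freshly drawn one.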

-  ``bagging_seed`` :raw-html:`<a id="bagging_seed" title="Permalink to this parameter" href="#bagging_seed">&#x1F517;&#xFE0E;</a>`, default = ``3``, type = int, aliases: ``bagging_fraction_seed``

   -  random seed for bagging

-  ``feature_fraction`` :raw-html:`<a id="feature_fraction" title="Permalink to this parameter" href="#feature_fraction">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, aliases: ``sub_feature``, ``colsample_bytree``, constraints: ``0.0 < feature_fraction <= 1.0``

   -  LightGBM will randomly select a subset of features on each iteration (tree) if ``feature_fraction`` is smaller than ``1.0``. For example, if you set it to ``0.8``, LightGBM will select 80% of features before training each tree

   -  can be used to speed up training

   -  can be used to deal with over-fitting

-  ``feature_fraction_bynode`` :raw-html:`<a id="feature_fraction_bynode" title="Permalink to this parameter" href="#feature_fraction_bynode">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, aliases: ``sub_feature_bynode``, ``colsample_bynode``, constraints: ``0.0 < feature_fraction_bynode <= 1.0``

   -  LightGBM will randomly select a subset of features on each tree node if ``feature_fraction_bynode`` is smaller than ``1.0``. For example, if you set it to ``0.8``, LightGBM will select 80% of features at each tree node

   -  can be used to deal with over-fitting

   -  **Note**: unlike ``feature_fraction``, this cannot speed up training

   -  **Note**: if both ``feature_fraction`` and ``feature_fraction_bynode`` are smaller than ``1.0``, the final fraction of each node is ``feature_fraction * feature_fraction_bynode``
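
When both fractions are below ``1.0`` they multiply, as the last note says. The sketch below turns that into approximate feature counts; ``features_considered`` is a hypothetical helper, and rounding to the nearest integer with a minimum of one feature is an assumption about details this page does not specify.

```python
def features_considered(num_features, feature_fraction=1.0,
                        feature_fraction_bynode=1.0):
    """Approximate feature counts per tree and per tree node."""
    per_tree = max(1, round(num_features * feature_fraction))
    combined = feature_fraction * feature_fraction_bynode
    per_node = max(1, round(num_features * combined))
    return per_tree, per_node


per_tree, per_node = features_considered(100, 0.8, 0.5)
print(per_tree, per_node)
```

With 100 features, ``feature_fraction=0.8`` and ``feature_fraction_bynode=0.5`` leave about 80 features per tree and about 40 per node.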

-  ``feature_fraction_seed`` :raw-html:`<a id="feature_fraction_seed" title="Permalink to this parameter" href="#feature_fraction_seed">&#x1F517;&#xFE0E;</a>`, default = ``2``, type = int

   -  random seed for ``feature_fraction``

-  ``extra_trees`` :raw-html:`<a id="extra_trees" title="Permalink to this parameter" href="#extra_trees">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``extra_tree``

   -  use extremely randomized trees

   -  if set to ``true``, when evaluating node splits LightGBM will check only one randomly-chosen threshold for each feature

   -  can be used to speed up training

   -  can be used to deal with over-fitting

-  ``extra_seed`` :raw-html:`<a id="extra_seed" title="Permalink to this parameter" href="#extra_seed">&#x1F517;&#xFE0E;</a>`, default = ``6``, type = int

   -  random seed for selecting thresholds when ``extra_trees`` is true

-  ``early_stopping_round`` :raw-html:`<a id="early_stopping_round" title="Permalink to this parameter" href="#early_stopping_round">&#x1F517;&#xFE0E;</a>`, default = ``0``, type = int, aliases: ``early_stopping_rounds``, ``early_stopping``, ``n_iter_no_change``

   -  will stop training if one metric of one validation data doesn't improve in the last ``early_stopping_round`` rounds

   -  ``<= 0`` means disable

   -  can be used to speed up training

-  ``early_stopping_min_delta`` :raw-html:`<a id="early_stopping_min_delta" title="Permalink to this parameter" href="#early_stopping_min_delta">&#x1F517;&#xFE0E;</a>`, default = ``0.0``, type = double, constraints: ``early_stopping_min_delta >= 0.0``

   -  when early stopping is used (i.e. ``early_stopping_round > 0``), require the early stopping metric to improve by at least this delta to be considered an improvement

   -  *New in version 4.4.0*
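
How ``early_stopping_round`` and ``early_stopping_min_delta`` interact is easiest to see on a list of validation scores. The function below is a simplified sketch for a single lower-is-better metric; ``early_stopping_iteration`` is a hypothetical helper and does not reproduce every detail of LightGBM's early stopping.

```python
def early_stopping_iteration(scores, early_stopping_round, min_delta=0.0):
    """Return (stop_iteration, best_iteration), both 1-based.

    stop_iteration is None if early stopping is never triggered.
    """
    best_score = float("inf")
    best_iteration = 0
    for iteration, score in enumerate(scores, start=1):
        # Only an improvement larger than min_delta counts.
        if best_score - score > min_delta:
            best_score = score
            best_iteration = iteration
        elif iteration - best_iteration >= early_stopping_round:
            return iteration, best_iteration
    return None, best_iteration


scores = [0.50, 0.40, 0.395, 0.393, 0.392, 0.391]
print(early_stopping_iteration(scores, 3))
print(early_stopping_iteration(scores, 3, min_delta=0.01))
```

Without a delta every round counts as an improvement and training runs to the end. With ``min_delta=0.01`` the small gains after iteration 2 are ignored, so training stops at iteration 5 with iteration 2 as the best one.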

-  ``first_metric_only`` :raw-html:`<a id="first_metric_only" title="Permalink to this parameter" href="#first_metric_only">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  LightGBM allows you to provide multiple evaluation metrics. Set this to ``true``, if you want to use only the first metric for early stopping

-  ``max_delta_step`` :raw-html:`<a id="max_delta_step" title="Permalink to this parameter" href="#max_delta_step">&#x1F517;&#xFE0E;</a>`, default = ``0.0``, type = double, aliases: ``max_tree_output``, ``max_leaf_output``

   -  used to limit the max output of tree leaves

   -  ``<= 0`` means no constraint

   -  the final max output of leaves is ``learning_rate * max_delta_step``

-  ``lambda_l1`` :raw-html:`<a id="lambda_l1" title="Permalink to this parameter" href="#lambda_l1">&#x1F517;&#xFE0E;</a>`, default = ``0.0``, type = double, aliases: ``reg_alpha``, ``l1_regularization``, constraints: ``lambda_l1 >= 0.0``

   -  L1 regularization

-  ``lambda_l2`` :raw-html:`<a id="lambda_l2" title="Permalink to this parameter" href="#lambda_l2">&#x1F517;&#xFE0E;</a>`, default = ``0.0``, type = double, aliases: ``reg_lambda``, ``lambda``, ``l2_regularization``, constraints: ``lambda_l2 >= 0.0``

   -  L2 regularization

-  ``linear_lambda`` :raw-html:`<a id="linear_lambda" title="Permalink to this parameter" href="#linear_lambda">&#x1F517;&#xFE0E;</a>`, default = ``0.0``, type = double, constraints: ``linear_lambda >= 0.0``

   -  linear tree regularization, corresponds to the parameter ``lambda`` in Eq. 3 of `Gradient Boosting with Piece-Wise Linear Regression Trees <https://arxiv.org/pdf/1802.05640.pdf>`__

-  ``min_gain_to_split`` :raw-html:`<a id="min_gain_to_split" title="Permalink to this parameter" href="#min_gain_to_split">&#x1F517;&#xFE0E;</a>`, default = ``0.0``, type = double, aliases: ``min_split_gain``, constraints: ``min_gain_to_split >= 0.0``

   -  the minimal gain to perform split

   -  can be used to speed up training

-  ``drop_rate`` :raw-html:`<a id="drop_rate" title="Permalink to this parameter" href="#drop_rate">&#x1F517;&#xFE0E;</a>`, default = ``0.1``, type = double, aliases: ``rate_drop``, constraints: ``0.0 <= drop_rate <= 1.0``

   -  used only in ``dart``

   -  dropout rate: a fraction of previous trees to drop during the dropout

-  ``max_drop`` :raw-html:`<a id="max_drop" title="Permalink to this parameter" href="#max_drop">&#x1F517;&#xFE0E;</a>`, default = ``50``, type = int

   -  used only in ``dart``

   -  max number of dropped trees during one boosting iteration

   -  ``<=0`` means no limit

-  ``skip_drop`` :raw-html:`<a id="skip_drop" title="Permalink to this parameter" href="#skip_drop">&#x1F517;&#xFE0E;</a>`, default = ``0.5``, type = double, constraints: ``0.0 <= skip_drop <= 1.0``

   -  used only in ``dart``

   -  probability of skipping the dropout procedure during a boosting iteration

-  ``xgboost_dart_mode`` :raw-html:`<a id="xgboost_dart_mode" title="Permalink to this parameter" href="#xgboost_dart_mode">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  used only in ``dart``

   -  set this to ``true``, if you want to use XGBoost DART mode

-  ``uniform_drop`` :raw-html:`<a id="uniform_drop" title="Permalink to this parameter" href="#uniform_drop">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  used only in ``dart``

   -  set this to ``true``, if you want to use uniform drop

-  ``drop_seed`` :raw-html:`<a id="drop_seed" title="Permalink to this parameter" href="#drop_seed">&#x1F517;&#xFE0E;</a>`, default = ``4``, type = int

   -  used only in ``dart``

   -  random seed for choosing which models to drop
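
The ``dart`` parameters ``skip_drop``, ``drop_rate`` and ``max_drop`` can be read as three steps of one selection procedure. The sketch below shows a simplified uniform version; ``select_dropped_trees`` is a hypothetical helper, the way the ``max_drop`` cap is applied here is an assumption, and LightGBM's own selection (for example with weighted drops) may differ.

```python
import random


def select_dropped_trees(num_trees, drop_rate=0.1, max_drop=50,
                         skip_drop=0.5, seed=4):
    """Pick the indices of previous trees to drop in one iteration."""
    rng = random.Random(seed)
    # Step 1: with probability skip_drop, skip dropout for this iteration.
    if rng.random() < skip_drop:
        return []
    # Step 2: each previous tree is a drop candidate with probability drop_rate.
    candidates = [t for t in range(num_trees) if rng.random() < drop_rate]
    # Step 3: cap the number of dropped trees (max_drop <= 0 means no limit).
    if max_drop > 0 and len(candidates) > max_drop:
        candidates = sorted(rng.sample(candidates, max_drop))
    return candidates


print(select_dropped_trees(100))
```

Setting ``skip_drop=1.0`` in this sketch disables dropout entirely, which makes each iteration behave like plain ``gbdt``.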

-  ``top_rate`` :raw-html:`<a id="top_rate" title="Permalink to this parameter" href="#top_rate">&#x1F517;&#xFE0E;</a>`, default = ``0.2``, type = double, constraints: ``0.0 <= top_rate <= 1.0``

   -  used only in ``goss``

   -  the retain ratio of large gradient data

-  ``other_rate`` :raw-html:`<a id="other_rate" title="Permalink to this parameter" href="#other_rate">&#x1F517;&#xFE0E;</a>`, default = ``0.1``, type = double, constraints: ``0.0 <= other_rate <= 1.0``
541

542
   -  used only in ``goss``
543

544
545
   -  the retain ratio of small gradient data
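
   A minimal sketch of the sampling step that ``top_rate`` and ``other_rate`` control, following the GOSS description in the LightGBM paper rather than the library's actual implementation. The function name ``goss_sample`` and the toy gradients are hypothetical.

   .. code-block:: python

      import random


      def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
          """Return (kept_indices, weights) for one GOSS sampling step."""
          if not 0.0 < other_rate <= 1.0:
              raise ValueError("this sketch assumes 0 < other_rate <= 1")
          n = len(gradients)
          # Rank rows by absolute gradient, largest first.
          order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
          n_top = int(n * top_rate)
          n_other = int(n * other_rate)
          top, rest = order[:n_top], order[n_top:]
          # Randomly keep a subset of the remaining small-gradient rows.
          sampled = random.Random(seed).sample(rest, min(n_other, len(rest)))
          # Up-weight the sampled rows to compensate for the rows that were dropped.
          amplify = (1.0 - top_rate) / other_rate
          weights = {i: 1.0 for i in top}
          weights.update({i: amplify for i in sampled})
          return top + sampled, weights


      grads = [0.9, -0.05, 0.02, -0.8, 0.1, 0.03, -0.01, 0.04, 0.06, -0.07]
      kept, weights = goss_sample(grads)
      print(kept, weights)

   Sampled small-gradient rows are weighted by ``(1 - top_rate) / other_rate`` so that their total contribution stays roughly comparable to that of the full set of small-gradient rows.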

-  ``min_data_per_group`` :raw-html:`<a id="min_data_per_group" title="Permalink to this parameter" href="#min_data_per_group">&#x1F517;&#xFE0E;</a>`, default = ``100``, type = int, constraints: ``min_data_per_group > 0``

   -  used for the categorical features

   -  minimum number of data points per categorical group

-  ``max_cat_threshold`` :raw-html:`<a id="max_cat_threshold" title="Permalink to this parameter" href="#max_cat_threshold">&#x1F517;&#xFE0E;</a>`, default = ``32``, type = int, constraints: ``max_cat_threshold > 0``

   -  used for the categorical features

   -  limits the number of split points considered for categorical features. See `the documentation on how LightGBM finds optimal splits for categorical features <./Features.rst#optimal-split-for-categorical-features>`_ for more details

   -  can be used to speed up training

-  ``cat_l2`` :raw-html:`<a id="cat_l2" title="Permalink to this parameter" href="#cat_l2">&#x1F517;&#xFE0E;</a>`, default = ``10.0``, type = double, constraints: ``cat_l2 >= 0.0``

   -  used for the categorical features

   -  L2 regularization in categorical split

-  ``cat_smooth`` :raw-html:`<a id="cat_smooth" title="Permalink to this parameter" href="#cat_smooth">&#x1F517;&#xFE0E;</a>`, default = ``10.0``, type = double, constraints: ``cat_smooth >= 0.0``

   -  used for the categorical features

   -  this can reduce the effect of noise in categorical features, especially for categories with few data points

-  ``max_cat_to_onehot`` :raw-html:`<a id="max_cat_to_onehot" title="Permalink to this parameter" href="#max_cat_to_onehot">&#x1F517;&#xFE0E;</a>`, default = ``4``, type = int, constraints: ``max_cat_to_onehot > 0``

   -  used for the categorical features

   -  when the number of categories of a feature is smaller than or equal to ``max_cat_to_onehot``, the one-vs-other split algorithm will be used

-  ``top_k`` :raw-html:`<a id="top_k" title="Permalink to this parameter" href="#top_k">&#x1F517;&#xFE0E;</a>`, default = ``20``, type = int, aliases: ``topk``, constraints: ``top_k > 0``

   -  used only in ``voting`` tree learner, refer to `Voting parallel <./Parallel-Learning-Guide.rst#choose-appropriate-parallel-algorithm>`__

   -  set this to a larger value for a more accurate result, at the cost of slower training

-  ``monotone_constraints`` :raw-html:`<a id="monotone_constraints" title="Permalink to this parameter" href="#monotone_constraints">&#x1F517;&#xFE0E;</a>`, default = ``None``, type = multi-int, aliases: ``mc``, ``monotone_constraint``, ``monotonic_cst``

   -  used for constraints of monotonic features

   -  ``1`` means increasing, ``-1`` means decreasing, ``0`` means no constraint

   -  you need to specify all features in order. For example, ``mc=-1,0,1`` means decreasing for the 1st feature, no constraint for the 2nd feature and increasing for the 3rd feature
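
   As a small illustration, the comma-separated value can be generated from per-feature directions so that every feature is listed in order. The helper ``monotone_constraints_string`` and the feature names are hypothetical and not part of LightGBM.

   .. code-block:: python

      def monotone_constraints_string(feature_names, directions):
          """Build the comma-separated value for ``monotone_constraints``.

          ``directions`` maps a feature name to 1 (increasing) or -1 (decreasing);
          features that are not listed get 0 (no constraint).
          """
          values = []
          for name in feature_names:
              direction = directions.get(name, 0)
              if direction not in (-1, 0, 1):
                  raise ValueError(f"invalid constraint for {name!r}: {direction}")
              values.append(str(direction))
          return ",".join(values)


      features = ["price", "weekday", "quality"]
      mc = monotone_constraints_string(features, {"price": -1, "quality": 1})
      print(f"mc={mc}")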

-  ``monotone_constraints_method`` :raw-html:`<a id="monotone_constraints_method" title="Permalink to this parameter" href="#monotone_constraints_method">&#x1F517;&#xFE0E;</a>`, default = ``basic``, type = enum, options: ``basic``, ``intermediate``, ``advanced``, aliases: ``monotone_constraining_method``, ``mc_method``

   -  used only if ``monotone_constraints`` is set

   -  monotone constraints method

      -  ``basic``, the most basic monotone constraints method. It does not slow down the training speed at all, but over-constrains the predictions

      -  ``intermediate``, a `more advanced method <https://hal.science/hal-02862802/document>`__, which may slow down the training speed very slightly. However, this method is much less constraining than the basic method and should significantly improve the results

      -  ``advanced``, an `even more advanced method <https://hal.science/hal-02862802/document>`__, which may slow down the training speed. However, this method is even less constraining than the intermediate method and should again significantly improve the results

-  ``monotone_penalty`` :raw-html:`<a id="monotone_penalty" title="Permalink to this parameter" href="#monotone_penalty">&#x1F517;&#xFE0E;</a>`, default = ``0.0``, type = double, aliases: ``monotone_splits_penalty``, ``ms_penalty``, ``mc_penalty``, constraints: ``monotone_penalty >= 0.0``

   -  used only if ``monotone_constraints`` is set

   -  `monotone penalty <https://hal.science/hal-02862802/document>`__: a penalization parameter X forbids any monotone splits on the first X (rounded down) level(s) of the tree. The penalty applied to monotone splits on a given depth is a continuous, increasing function of the penalization parameter

   -  if ``0.0`` (the default), no penalization is applied

-  ``feature_contri`` :raw-html:`<a id="feature_contri" title="Permalink to this parameter" href="#feature_contri">&#x1F517;&#xFE0E;</a>`, default = ``None``, type = multi-double, aliases: ``feature_contrib``, ``fc``, ``fp``, ``feature_penalty``

   -  used to control each feature's split gain: the split gain of the i-th feature is replaced with ``gain[i] = max(0, feature_contri[i]) * gain[i]``

   -  you need to specify all features in order
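
   The gain formula above can be sketched in plain Python. This only illustrates the arithmetic; ``apply_feature_contri`` and the example gains are hypothetical.

   .. code-block:: python

      def apply_feature_contri(gains, feature_contri):
          """Scale each feature's split gain by its (non-negative) contribution factor."""
          if len(gains) != len(feature_contri):
              raise ValueError("one factor is required per feature, in order")
          return [max(0.0, factor) * gain for gain, factor in zip(gains, feature_contri)]


      gains = [4.0, 2.5, 1.0]
      # 1.0 keeps the gain, 0.5 halves it, and a negative factor is clipped to 0.
      scaled = apply_feature_contri(gains, [1.0, 0.5, -1.0])
      print(scaled)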

-  ``forcedsplits_filename`` :raw-html:`<a id="forcedsplits_filename" title="Permalink to this parameter" href="#forcedsplits_filename">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``fs``, ``forced_splits_filename``, ``forced_splits_file``, ``forced_splits``

   -  path to a ``.json`` file that specifies splits to force at the top of every decision tree before best-first learning commences

   -  ``.json`` file can be arbitrarily nested, and each split contains ``feature``, ``threshold`` fields, as well as ``left`` and ``right`` fields representing subsplits

   -  categorical splits are forced in a one-hot fashion, with ``left`` representing the split containing the feature value and ``right`` representing other values

   -  **Note**: a forced split is ignored if it makes the gain worse

   -  see `this file <https://github.com/microsoft/LightGBM/blob/master/examples/binary_classification/forced_splits.json>`__ as an example
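
   A minimal sketch of generating such a file's contents from Python. The feature indices and thresholds are made up for illustration, and ``count_forced_splits`` is a hypothetical helper that only walks the nested structure described above.

   .. code-block:: python

      import json

      # One forced split at the root, with one forced sub-split on each side.
      forced_splits = {
          "feature": 0,
          "threshold": 0.5,
          "left": {"feature": 2, "threshold": 10.0},
          "right": {"feature": 1, "threshold": -1.5},
      }


      def count_forced_splits(node):
          """Count the splits in a (possibly nested) forced-splits structure."""
          if not node:
              return 0
          return 1 + count_forced_splits(node.get("left")) + count_forced_splits(node.get("right"))


      text = json.dumps(forced_splits, indent=2)
      print(text)
      print(count_forced_splits(json.loads(text)))

   The resulting text would be saved to a file whose path is then passed as ``forcedsplits_filename``.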

-  ``refit_decay_rate`` :raw-html:`<a id="refit_decay_rate" title="Permalink to this parameter" href="#refit_decay_rate">&#x1F517;&#xFE0E;</a>`, default = ``0.9``, type = double, constraints: ``0.0 <= refit_decay_rate <= 1.0``

   -  decay rate of ``refit`` task, will use ``leaf_output = refit_decay_rate * old_leaf_output + (1.0 - refit_decay_rate) * new_leaf_output`` to refit trees

   -  used only in ``refit`` task in CLI version or as argument in ``refit`` function in language-specific package
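
   The blending formula above can be written out directly. This is a sketch of the arithmetic only; the function name ``refit_leaf_output`` is hypothetical.

   .. code-block:: python

      def refit_leaf_output(old_leaf_output, new_leaf_output, refit_decay_rate=0.9):
          """Blend the existing leaf value with the value fitted on the new data."""
          if not 0.0 <= refit_decay_rate <= 1.0:
              raise ValueError("refit_decay_rate must be in [0.0, 1.0]")
          return refit_decay_rate * old_leaf_output + (1.0 - refit_decay_rate) * new_leaf_output


      # With the default rate, the refitted leaf moves 10% of the way to the new value.
      blended = refit_leaf_output(1.0, 2.0)
      print(blended)

   A rate of ``1.0`` keeps the old leaf values unchanged, and a rate of ``0.0`` replaces them entirely with the newly fitted values.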

-  ``cegb_tradeoff`` :raw-html:`<a id="cegb_tradeoff" title="Permalink to this parameter" href="#cegb_tradeoff">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, constraints: ``cegb_tradeoff >= 0.0``

   -  cost-effective gradient boosting multiplier for all penalties

-  ``cegb_penalty_split`` :raw-html:`<a id="cegb_penalty_split" title="Permalink to this parameter" href="#cegb_penalty_split">&#x1F517;&#xFE0E;</a>`, default = ``0.0``, type = double, constraints: ``cegb_penalty_split >= 0.0``

   -  cost-effective gradient boosting penalty for splitting a node

-  ``cegb_penalty_feature_lazy`` :raw-html:`<a id="cegb_penalty_feature_lazy" title="Permalink to this parameter" href="#cegb_penalty_feature_lazy">&#x1F517;&#xFE0E;</a>`, default = ``0,0,...,0``, type = multi-double

   -  cost-effective gradient boosting penalty for using a feature

   -  applied per data point

-  ``cegb_penalty_feature_coupled`` :raw-html:`<a id="cegb_penalty_feature_coupled" title="Permalink to this parameter" href="#cegb_penalty_feature_coupled">&#x1F517;&#xFE0E;</a>`, default = ``0,0,...,0``, type = multi-double

   -  cost-effective gradient boosting penalty for using a feature

   -  applied once per forest

-  ``path_smooth`` :raw-html:`<a id="path_smooth" title="Permalink to this parameter" href="#path_smooth">&#x1F517;&#xFE0E;</a>`, default = ``0``, type = double, constraints: ``path_smooth >= 0.0``

   -  controls smoothing applied to tree nodes

   -  helps prevent overfitting on leaves with few samples

   -  if ``0.0`` (the default), no smoothing is applied

   -  if ``path_smooth > 0`` then ``min_data_in_leaf`` must be at least ``2``

   -  larger values give stronger regularization

      -  the weight of each node is ``w * (n / path_smooth) / (n / path_smooth + 1) + w_p / (n / path_smooth + 1)``, where ``n`` is the number of samples in the node, ``w`` is the optimal node weight to minimise the loss (approximately ``-sum_gradients / sum_hessians``), and ``w_p`` is the weight of the parent node

      -  note that the parent output ``w_p`` itself has smoothing applied, unless it is the root node, so that the smoothing effect accumulates with the tree depth
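
   The node-weight formula above can be evaluated directly to see how strongly small nodes are pulled towards their parent. This is a sketch of the formula only; ``smoothed_output`` and the example numbers are hypothetical.

   .. code-block:: python

      def smoothed_output(w, n, w_p, path_smooth):
          """Smoothed node weight computed from the formula above."""
          if path_smooth <= 0.0:
              return w  # no smoothing is applied
          ratio = n / path_smooth
          return w * ratio / (ratio + 1.0) + w_p / (ratio + 1.0)


      # A node with few samples is pulled strongly towards its parent's weight ...
      few = smoothed_output(w=1.0, n=10, w_p=0.0, path_smooth=10.0)
      # ... while a node with many samples keeps most of its own weight.
      many = smoothed_output(w=1.0, n=1000, w_p=0.0, path_smooth=10.0)
      print(few, many)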

-  ``interaction_constraints`` :raw-html:`<a id="interaction_constraints" title="Permalink to this parameter" href="#interaction_constraints">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string

   -  controls which features can appear in the same branch

   -  by default interaction constraints are disabled, to enable them you can specify

      -  for CLI, lists separated by commas, e.g. ``[0,1,2],[2,3]``

      -  for Python-package, list of lists, e.g. ``[[0, 1, 2], [2, 3]]``

      -  for R-package, list of character or numeric vectors, e.g. ``list(c("var1", "var2", "var3"), c("var3", "var4"))`` or ``list(c(1L, 2L, 3L), c(3L, 4L))``. Numeric vectors should use 1-based indexing, where ``1L`` is the first feature, ``2L`` is the second feature, etc.

   -  any two features can appear in the same branch only if there exists a constraint containing both features
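
   A small sketch of the rule above, together with a conversion from the Python list-of-lists form to the CLI form. Both helpers (``to_cli_value`` and ``may_share_branch``) are hypothetical and only restate the description; they are not LightGBM functions.

   .. code-block:: python

      def to_cli_value(constraints):
          """Format a list of lists such as [[0, 1, 2], [2, 3]] as '[0,1,2],[2,3]'."""
          return ",".join(
              "[" + ",".join(str(index) for index in group) + "]" for group in constraints
          )


      def may_share_branch(feature_a, feature_b, constraints):
          """Two features may share a branch only if some constraint contains both."""
          return any(feature_a in group and feature_b in group for group in constraints)


      constraints = [[0, 1, 2], [2, 3]]
      cli_value = to_cli_value(constraints)
      print(cli_value)
      print(may_share_branch(2, 3, constraints), may_share_branch(0, 3, constraints))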

-  ``verbosity`` :raw-html:`<a id="verbosity" title="Permalink to this parameter" href="#verbosity">&#x1F517;&#xFE0E;</a>`, default = ``1``, type = int, aliases: ``verbose``

   -  controls the level of LightGBM's verbosity

   -  ``< 0``: Fatal, ``= 0``: Error (Warning), ``= 1``: Info, ``> 1``: Debug

-  ``input_model`` :raw-html:`<a id="input_model" title="Permalink to this parameter" href="#input_model">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``model_input``, ``model_in``

   -  filename of input model

   -  for ``prediction`` task, this model will be applied to prediction data

   -  for ``train`` task, training will be continued from this model

   -  **Note**: can be used only in CLI version

-  ``output_model`` :raw-html:`<a id="output_model" title="Permalink to this parameter" href="#output_model">&#x1F517;&#xFE0E;</a>`, default = ``LightGBM_model.txt``, type = string, aliases: ``model_output``, ``model_out``

   -  filename of output model in training

   -  **Note**: can be used only in CLI version

-  ``saved_feature_importance_type`` :raw-html:`<a id="saved_feature_importance_type" title="Permalink to this parameter" href="#saved_feature_importance_type">&#x1F517;&#xFE0E;</a>`, default = ``0``, type = int

   -  the feature importance type in the saved model file

   -  ``0``: count-based feature importance (numbers of splits are counted); ``1``: gain-based feature importance (values of gain are counted)

   -  **Note**: can be used only in CLI version

-  ``snapshot_freq`` :raw-html:`<a id="snapshot_freq" title="Permalink to this parameter" href="#snapshot_freq">&#x1F517;&#xFE0E;</a>`, default = ``-1``, type = int, aliases: ``save_period``

   -  frequency of saving model file snapshot

   -  set this to positive value to enable this function. For example, the model file will be snapshotted at each iteration if ``snapshot_freq=1``

   -  **Note**: can be used only in CLI version

-  ``use_quantized_grad`` :raw-html:`<a id="use_quantized_grad" title="Permalink to this parameter" href="#use_quantized_grad">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  whether to use gradient quantization when training

   -  enabling this will discretize (quantize) the gradients and hessians into ``num_grad_quant_bins`` bins

   -  with quantized training, most arithmetic in the training process will be integer operations

   -  gradient quantization can accelerate training, with little accuracy drop in most cases

   -  **Note**: works only with ``cpu`` and ``cuda`` device type

   -  *New in version 4.0.0*

-  ``num_grad_quant_bins`` :raw-html:`<a id="num_grad_quant_bins" title="Permalink to this parameter" href="#num_grad_quant_bins">&#x1F517;&#xFE0E;</a>`, default = ``4``, type = int

   -  used only if ``use_quantized_grad=true``

   -  number of bins used to quantize gradients and hessians

   -  with more bins, the quantized training will be closer to full precision training

   -  **Note**: works only with ``cpu`` and ``cuda`` device type

   -  *New in version 4.0.0*

-  ``quant_train_renew_leaf`` :raw-html:`<a id="quant_train_renew_leaf" title="Permalink to this parameter" href="#quant_train_renew_leaf">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  used only if ``use_quantized_grad=true``

   -  whether to renew the leaf values with the original gradients when using quantized training

   -  renewing is very helpful for good quantized training accuracy for ranking objectives

   -  **Note**: works only with ``cpu`` and ``cuda`` device type

   -  *New in version 4.0.0*

-  ``stochastic_rounding`` :raw-html:`<a id="stochastic_rounding" title="Permalink to this parameter" href="#stochastic_rounding">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool

   -  used only if ``use_quantized_grad=true``

   -  whether to use stochastic rounding in gradient quantization

   -  **Note**: works only with ``cpu`` and ``cuda`` device type

   -  *New in version 4.0.0*
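
   For readers unfamiliar with the term, the following sketch shows the general idea of stochastic rounding: a value is rounded up with probability equal to its fractional part, so the rounded values are unbiased on average. It illustrates the technique in general and is not LightGBM's internal code; ``stochastic_round`` is a hypothetical name.

   .. code-block:: python

      import math
      import random


      def stochastic_round(x, rng):
          """Round x down or up at random so that the expected result equals x."""
          lower = math.floor(x)
          # Round up with probability equal to the fractional part of x.
          return lower + (1 if rng.random() < x - lower else 0)


      rng = random.Random(0)
      samples = [stochastic_round(2.3, rng) for _ in range(10000)]
      mean = sum(samples) / len(samples)
      print(sorted(set(samples)), round(mean, 2))

   Every sample is either ``2`` or ``3``, while the mean over many samples stays close to ``2.3``; deterministic rounding would always give ``2``.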

IO Parameters
-------------

Dataset Parameters
~~~~~~~~~~~~~~~~~~

-  ``linear_tree`` :raw-html:`<a id="linear_tree" title="Permalink to this parameter" href="#linear_tree">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``linear_trees``

   -  fit piecewise linear gradient boosting tree

   -  tree splits are chosen in the usual way, but the model at each leaf is linear instead of constant

   -  the linear model at each leaf includes all the numerical features in that leaf's branch

   -  the first tree has constant leaf values

   -  categorical features are used for splits as normal but are not used in the linear models

   -  missing values should not be encoded as ``0``. Use ``np.nan`` for Python, ``NA`` for the CLI, and ``NA``, ``NA_real_``, or ``NA_integer_`` for R

   -  it is recommended to rescale data before training so that features have similar mean and standard deviation

   -  **Note**: works only with ``cpu`` device type and ``serial`` tree learner

   -  **Note**: ``regression_l1`` objective is not supported with linear tree boosting

   -  **Note**: setting ``linear_tree=true`` significantly increases the memory use of LightGBM

   -  **Note**: if you specify ``monotone_constraints``, constraints will be enforced when choosing the split points, but not when fitting the linear models on leaves

-  ``max_bin`` :raw-html:`<a id="max_bin" title="Permalink to this parameter" href="#max_bin">&#x1F517;&#xFE0E;</a>`, default = ``255``, type = int, aliases: ``max_bins``, constraints: ``max_bin > 1``

   -  maximum number of bins that feature values will be bucketed in

   -  a small number of bins may reduce training accuracy but may improve generalization (i.e. help with over-fitting)

   -  LightGBM will automatically compress memory according to ``max_bin``. For example, LightGBM will use ``uint8_t`` for feature value if ``max_bin=255``

-  ``max_bin_by_feature`` :raw-html:`<a id="max_bin_by_feature" title="Permalink to this parameter" href="#max_bin_by_feature">&#x1F517;&#xFE0E;</a>`, default = ``None``, type = multi-int

   -  max number of bins for each feature

   -  if not specified, will use ``max_bin`` for all features

-  ``min_data_in_bin`` :raw-html:`<a id="min_data_in_bin" title="Permalink to this parameter" href="#min_data_in_bin">&#x1F517;&#xFE0E;</a>`, default = ``3``, type = int, constraints: ``min_data_in_bin > 0``

   -  minimum number of data points inside one bin

   -  use this to avoid one-data-one-bin (potential over-fitting)

-  ``bin_construct_sample_cnt`` :raw-html:`<a id="bin_construct_sample_cnt" title="Permalink to this parameter" href="#bin_construct_sample_cnt">&#x1F517;&#xFE0E;</a>`, default = ``200000``, type = int, aliases: ``subsample_for_bin``, constraints: ``bin_construct_sample_cnt > 0``

   -  number of data points sampled to construct the discrete feature bins

   -  setting this to a larger value will give a better training result, but may increase data loading time

   -  set this to a larger value if the data is very sparse

   -  **Note**: do not set this to a small value; otherwise, you may encounter unexpected errors and poor accuracy

-  ``data_random_seed`` :raw-html:`<a id="data_random_seed" title="Permalink to this parameter" href="#data_random_seed">&#x1F517;&#xFE0E;</a>`, default = ``1``, type = int, aliases: ``data_seed``

   -  random seed for sampling data to construct histogram bins

-  ``is_enable_sparse`` :raw-html:`<a id="is_enable_sparse" title="Permalink to this parameter" href="#is_enable_sparse">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool, aliases: ``is_sparse``, ``enable_sparse``, ``sparse``

   -  used to enable/disable sparse optimization

-  ``enable_bundle`` :raw-html:`<a id="enable_bundle" title="Permalink to this parameter" href="#enable_bundle">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool, aliases: ``is_enable_bundle``, ``bundle``

   -  set this to ``false`` to disable Exclusive Feature Bundling (EFB), which is described in `LightGBM: A Highly Efficient Gradient Boosting Decision Tree <https://papers.nips.cc/paper_files/paper/2017/hash/6449f44a102fde848669bdd9eb6b76fa-Abstract.html>`__

   -  **Note**: disabling this may slow down training for sparse datasets

-  ``use_missing`` :raw-html:`<a id="use_missing" title="Permalink to this parameter" href="#use_missing">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool

   -  set this to ``false`` to disable the special handling of missing values

-  ``zero_as_missing`` :raw-html:`<a id="zero_as_missing" title="Permalink to this parameter" href="#zero_as_missing">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  set this to ``true`` to treat all zeros as missing values (including the unshown values in LibSVM / sparse matrices)

   -  set this to ``false`` to use ``na`` for representing missing values

-  ``feature_pre_filter`` :raw-html:`<a id="feature_pre_filter" title="Permalink to this parameter" href="#feature_pre_filter">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool

   -  set this to ``true`` (the default) to tell LightGBM to ignore the features that are unsplittable based on ``min_data_in_leaf``

   -  because the dataset object is initialized only once and cannot be changed afterwards, you may need to set this to ``false`` when searching over values of ``min_data_in_leaf``; otherwise, unless you reconstruct the dataset object, features are first filtered by the ``min_data_in_leaf`` value the dataset was constructed with

   -  **Note**: setting this to ``false`` may slow down the training

-  ``pre_partition`` :raw-html:`<a id="pre_partition" title="Permalink to this parameter" href="#pre_partition">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``is_pre_partition``

   -  used for distributed learning (excluding the ``feature_parallel`` mode)

   -  ``true`` if training data are pre-partitioned, and different machines use different partitions

-  ``two_round`` :raw-html:`<a id="two_round" title="Permalink to this parameter" href="#two_round">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``two_round_loading``, ``use_two_round_loading``

   -  set this to ``true`` if data file is too big to fit in memory

   -  by default, LightGBM maps the data file to memory and loads features from memory. This gives faster data loading, but may cause an out-of-memory error when the data file is very big

   -  **Note**: works only in case of loading data directly from text file

-  ``header`` :raw-html:`<a id="header" title="Permalink to this parameter" href="#header">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``has_header``

   -  set this to ``true`` if input data has header

   -  **Note**: works only in case of loading data directly from text file

-  ``label_column`` :raw-html:`<a id="label_column" title="Permalink to this parameter" href="#label_column">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = int or string, aliases: ``label``

   -  used to specify the label column

   -  use number for index, e.g. ``label=0`` means column\_0 is the label

   -  add a prefix ``name:`` for column name, e.g. ``label=name:is_click``

   -  if omitted, the first column in the training data is used as the label

   -  **Note**: works only in case of loading data directly from text file

-  ``weight_column`` :raw-html:`<a id="weight_column" title="Permalink to this parameter" href="#weight_column">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = int or string, aliases: ``weight``

   -  used to specify the weight column

   -  use number for index, e.g. ``weight=0`` means column\_0 is the weight

   -  add a prefix ``name:`` for column name, e.g. ``weight=name:weight``

   -  **Note**: works only in case of loading data directly from text file

   -  **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``, e.g. when label is column\_0, and weight is column\_1, the correct parameter is ``weight=0``

   -  **Note**: weights should be non-negative
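
   The index rule in the note above is easy to get wrong, so here is a small sketch of the translation from a column's position in the file to the integer to pass. The helper ``column_param_index`` is hypothetical, and its handling of a label that is not the first column is an assumption extrapolated from "doesn't count the label column".

   .. code-block:: python

      def column_param_index(file_column, label_column=0):
          """Translate a zero-based file column position into the index to pass as an int."""
          if file_column == label_column:
              raise ValueError("the label column cannot be used as a weight column")
          # Columns after the label shift down by one because the label is not counted.
          return file_column - 1 if file_column > label_column else file_column


      # Label in column_0 and weight in column_1, as in the note above.
      weight = column_param_index(1)
      print(f"weight={weight}")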

-  ``group_column`` :raw-html:`<a id="group_column" title="Permalink to this parameter" href="#group_column">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = int or string, aliases: ``group``, ``group_id``, ``query_column``, ``query``, ``query_id``

   -  used to specify the query/group id column

   -  use number for index, e.g. ``query=0`` means column\_0 is the query id

   -  add a prefix ``name:`` for column name, e.g. ``query=name:query_id``

   -  **Note**: works only in case of loading data directly from text file

   -  **Note**: data should be grouped by query\_id, for more information, see `Query Data <#query-data>`__

   -  **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``, e.g. when label is column\_0 and query\_id is column\_1, the correct parameter is ``query=0``

-  ``ignore_column`` :raw-html:`<a id="ignore_column" title="Permalink to this parameter" href="#ignore_column">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = multi-int or string, aliases: ``ignore_feature``, ``blacklist``

   -  used to specify columns to ignore in training

   -  use number for index, e.g. ``ignore_column=0,1,2`` means column\_0, column\_1 and column\_2 will be ignored

   -  add a prefix ``name:`` for column name, e.g. ``ignore_column=name:c1,c2,c3`` means c1, c2 and c3 will be ignored

   -  **Note**: works only in case of loading data directly from text file

   -  **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``

   -  **Note**: although the specified columns are completely ignored during training, they must still have a valid format so that LightGBM can load the file successfully

-  ``categorical_feature`` :raw-html:`<a id="categorical_feature" title="Permalink to this parameter" href="#categorical_feature">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = multi-int or string, aliases: ``cat_feature``, ``categorical_column``, ``cat_column``, ``categorical_features``

   -  used to specify categorical features

   -  use number for index, e.g. ``categorical_feature=0,1,2`` means column\_0, column\_1 and column\_2 are categorical features

   -  add a prefix ``name:`` for column name, e.g. ``categorical_feature=name:c1,c2,c3`` means c1, c2 and c3 are categorical features

   -  **Note**: all values will be cast to ``int32`` (integer codes will be extracted from pandas categoricals in the Python-package)

   -  **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``

   -  **Note**: all values should be less than ``Int32.MaxValue`` (2147483647)

   -  **Note**: using large values could be memory consuming. The tree decision rule works best when categorical features are represented by consecutive integers starting from zero

   -  **Note**: all negative values will be treated as **missing values**

   -  **Note**: the output cannot be monotonically constrained with respect to a categorical feature

   -  **Note**: floating point numbers in categorical features will be rounded towards 0
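
   The notes above can be combined into a small sketch of how a raw categorical value might be interpreted. The helper ``categorical_code`` is hypothetical; applying the truncation before the sign check, and treating ``NaN`` as missing, are assumptions rather than statements about LightGBM's implementation.

   .. code-block:: python

      import math

      INT32_MAX = 2147483647


      def categorical_code(value):
          """Return the integer category code, or None if the value counts as missing."""
          if value is None or math.isnan(value):
              return None  # assumption: NaN is handled as a missing value
          code = int(value)  # int() truncates towards zero, e.g. 3.9 -> 3
          if code < 0:
              return None  # negative values are treated as missing
          if code >= INT32_MAX:
              raise ValueError("categorical values should be less than Int32.MaxValue")
          return code


      print([categorical_code(v) for v in (3.9, 0.0, -2.0, float("nan"))])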

-  ``forcedbins_filename`` :raw-html:`<a id="forcedbins_filename" title="Permalink to this parameter" href="#forcedbins_filename">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string

   -  path to a ``.json`` file that specifies bin upper bounds for some or all features

   -  ``.json`` file should contain an array of objects, each containing the keys ``feature`` (integer feature index) and ``bin_upper_bound`` (array of thresholds for binning)

   -  see `this file <https://github.com/microsoft/LightGBM/blob/master/examples/regression/forced_bins.json>`__ as an example
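
   A minimal sketch of producing such a file's contents from Python. The feature indices and thresholds are made up, and ``check_forced_bins`` is a hypothetical helper that only checks the shape described above.

   .. code-block:: python

      import json

      # Hypothetical bin upper bounds for two features.
      forced_bins = [
          {"feature": 0, "bin_upper_bound": [0.25, 0.5, 0.75]},
          {"feature": 3, "bin_upper_bound": [10.0, 100.0]},
      ]


      def check_forced_bins(entries):
          """Check that each entry has an integer feature index and numeric thresholds."""
          for entry in entries:
              if not isinstance(entry.get("feature"), int):
                  raise ValueError("'feature' must be an integer feature index")
              bounds = entry.get("bin_upper_bound")
              if not isinstance(bounds, list) or not all(
                  isinstance(bound, (int, float)) for bound in bounds
              ):
                  raise ValueError("'bin_upper_bound' must be an array of numbers")
          return True


      check_forced_bins(forced_bins)
      text = json.dumps(forced_bins, indent=2)
      print(text)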

-  ``save_binary`` :raw-html:`<a id="save_binary" title="Permalink to this parameter" href="#save_binary">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``is_save_binary``, ``is_save_binary_file``

   -  if ``true``, LightGBM will save the dataset (including validation data) to a binary file. This speeds up data loading the next time

   -  **Note**: ``init_score`` is not saved in binary file

   -  **Note**: can be used only in CLI version; for language-specific packages you can use the corresponding function

-  ``precise_float_parser`` :raw-html:`<a id="precise_float_parser" title="Permalink to this parameter" href="#precise_float_parser">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  use precise floating point number parsing for text parser (e.g. CSV, TSV, LibSVM input)

   -  **Note**: setting this to ``true`` may lead to much slower text parsing

-  ``parser_config_file`` :raw-html:`<a id="parser_config_file" title="Permalink to this parameter" href="#parser_config_file">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string

   -  path to a ``.json`` file that specifies the initialization configuration of a customized parser

   -  see `lightgbm-transform <https://github.com/microsoft/lightgbm-transform>`__ for usage examples

   -  **Note**: ``lightgbm-transform`` is not maintained by LightGBM's maintainers. Bug reports or feature requests should go to `issues page <https://github.com/microsoft/lightgbm-transform/issues>`__

   -  *New in version 4.0.0*

Predict Parameters
~~~~~~~~~~~~~~~~~~

-  ``start_iteration_predict`` :raw-html:`<a id="start_iteration_predict" title="Permalink to this parameter" href="#start_iteration_predict">&#x1F517;&#xFE0E;</a>`, default = ``0``, type = int

   -  used only in ``prediction`` task

   -  used to specify from which iteration to start the prediction

   -  ``<= 0`` means from the first iteration

-  ``num_iteration_predict`` :raw-html:`<a id="num_iteration_predict" title="Permalink to this parameter" href="#num_iteration_predict">&#x1F517;&#xFE0E;</a>`, default = ``-1``, type = int

   -  used only in ``prediction`` task

   -  used to specify how many trained iterations will be used in prediction

   -  ``<= 0`` means no limit
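
   A sketch of how ``start_iteration_predict`` and ``num_iteration_predict`` select a window of trained iterations, based only on the descriptions above. The helper ``iterations_used`` is hypothetical, and clamping the window to the number of trained iterations is an assumption.

   .. code-block:: python

      def iterations_used(total_iterations, start_iteration_predict=0, num_iteration_predict=-1):
          """Return the range of iteration indices that would be used for prediction."""
          # ``<= 0`` means start from the first iteration.
          start = min(max(start_iteration_predict, 0), total_iterations)
          if num_iteration_predict <= 0:
              end = total_iterations  # ``<= 0`` means no limit
          else:
              end = min(start + num_iteration_predict, total_iterations)
          return range(start, end)


      print(list(iterations_used(10)))
      print(list(iterations_used(10, start_iteration_predict=2, num_iteration_predict=3)))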

-  ``predict_raw_score`` :raw-html:`<a id="predict_raw_score" title="Permalink to this parameter" href="#predict_raw_score">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``is_predict_raw_score``, ``predict_rawscore``, ``raw_score``

   -  used only in ``prediction`` task

   -  set this to ``true`` to predict only the raw scores

   -  set this to ``false`` to predict transformed scores

-  ``predict_leaf_index`` :raw-html:`<a id="predict_leaf_index" title="Permalink to this parameter" href="#predict_leaf_index">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``is_predict_leaf_index``, ``leaf_index``

   -  used only in ``prediction`` task

   -  set this to ``true`` to predict the leaf indices of all trees

1025
-  ``predict_contrib`` :raw-html:`<a id="predict_contrib" title="Permalink to this parameter" href="#predict_contrib">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``is_predict_contrib``, ``contrib``
1026

1027
   -  used only in ``prediction`` task
1028

1029
   -  set this to ``true`` to estimate `SHAP values <https://arxiv.org/abs/1706.06060>`__, which represent how each feature contributes to each prediction

   -  produces ``#features + 1`` values where the last value is the expected value of the model output over the training data

   -  **Note**: if you want further explanations of your model's predictions using SHAP values, such as SHAP interaction values, you can install the `shap package <https://github.com/shap>`__

   -  **Note**: unlike the shap package, with ``predict_contrib`` we return a matrix with an extra column, where the last column is the expected value

   -  **Note**: this feature is not implemented for linear trees

-  ``predict_disable_shape_check`` :raw-html:`<a id="predict_disable_shape_check" title="Permalink to this parameter" href="#predict_disable_shape_check">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  used only in ``prediction`` task

   -  control whether or not LightGBM raises an error when you try to predict on data with a different number of features than the training data

   -  if ``false`` (the default), a fatal error will be raised if the number of features in the dataset you predict on differs from the number seen during training

   -  if ``true``, LightGBM will attempt to predict on whatever data you provide. This is dangerous because you might get incorrect predictions, but you could use it in situations where it is difficult or expensive to generate some features and you are very confident that they were never chosen for splits in the model

   -  **Note**: be very careful setting this parameter to ``true``

-  ``pred_early_stop`` :raw-html:`<a id="pred_early_stop" title="Permalink to this parameter" href="#pred_early_stop">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  used only in ``prediction`` task

   -  used only in ``classification`` and ``ranking`` applications

   -  used only for predicting normal or raw scores

   -  if ``true``, will use early-stopping to speed up the prediction. May affect the accuracy

   -  **Note**: cannot be used with ``rf`` boosting type or custom objective function

-  ``pred_early_stop_freq`` :raw-html:`<a id="pred_early_stop_freq" title="Permalink to this parameter" href="#pred_early_stop_freq">&#x1F517;&#xFE0E;</a>`, default = ``10``, type = int

   -  used only in ``prediction`` task and if ``pred_early_stop=true``

   -  the frequency of checking early-stopping prediction

-  ``pred_early_stop_margin`` :raw-html:`<a id="pred_early_stop_margin" title="Permalink to this parameter" href="#pred_early_stop_margin">&#x1F517;&#xFE0E;</a>`, default = ``10.0``, type = double

   -  used only in ``prediction`` task and if ``pred_early_stop=true``

   -  the threshold of margin in early-stopping prediction

-  ``output_result`` :raw-html:`<a id="output_result" title="Permalink to this parameter" href="#output_result">&#x1F517;&#xFE0E;</a>`, default = ``LightGBM_predict_result.txt``, type = string, aliases: ``predict_result``, ``prediction_result``, ``predict_name``, ``prediction_name``, ``pred_name``, ``name_pred``

   -  used only in ``prediction`` task

   -  filename of prediction result

   -  **Note**: can be used only in CLI version
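
As a rough illustration of how ``start_iteration_predict`` and ``num_iteration_predict`` combine, the sketch below computes which trained iterations would take part in a prediction. The helper ``prediction_window`` is hypothetical (it is not a LightGBM function) and only mirrors the rules stated above; clamping to the number of trained iterations is an assumption of this sketch.

.. code-block:: python

   def prediction_window(total_iterations, start_iteration_predict=0, num_iteration_predict=-1):
       """Return (first, end): a half-open range of 0-based iteration indices."""
       # "<= 0 means from the first iteration"
       first = max(start_iteration_predict, 0)
       # assumption: a start beyond the trained model leaves nothing to use
       first = min(first, total_iterations)
       if num_iteration_predict <= 0:
           # "<= 0 means no limit"
           end = total_iterations
       else:
           end = min(first + num_iteration_predict, total_iterations)
       return first, end


   print(prediction_window(100))          # every trained iteration
   print(prediction_window(100, 10, 20))  # 20 iterations starting at index 10

With the defaults (``0`` and ``-1``) the window covers the whole model, which matches the behavior described for both parameters.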

Convert Parameters
~~~~~~~~~~~~~~~~~~

-  ``convert_model_language`` :raw-html:`<a id="convert_model_language" title="Permalink to this parameter" href="#convert_model_language">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string

   -  used only in ``convert_model`` task

   -  only ``cpp`` is currently supported; to convert a model to other languages, consider using the `m2cgen <https://github.com/BayesWitnesses/m2cgen>`__ utility

   -  if ``convert_model_language`` is set and ``task=train``, the model will also be converted

   -  **Note**: can be used only in CLI version

-  ``convert_model`` :raw-html:`<a id="convert_model" title="Permalink to this parameter" href="#convert_model">&#x1F517;&#xFE0E;</a>`, default = ``gbdt_prediction.cpp``, type = string, aliases: ``convert_model_file``

   -  used only in ``convert_model`` task

   -  output filename of converted model

   -  **Note**: can be used only in CLI version

Objective Parameters
--------------------

-  ``objective_seed`` :raw-html:`<a id="objective_seed" title="Permalink to this parameter" href="#objective_seed">&#x1F517;&#xFE0E;</a>`, default = ``5``, type = int

   -  used only in ``rank_xendcg`` objective

   -  random seed for objectives, if random process is needed

-  ``num_class`` :raw-html:`<a id="num_class" title="Permalink to this parameter" href="#num_class">&#x1F517;&#xFE0E;</a>`, default = ``1``, type = int, aliases: ``num_classes``, constraints: ``num_class > 0``

   -  used only in ``multi-class`` classification application

-  ``is_unbalance`` :raw-html:`<a id="is_unbalance" title="Permalink to this parameter" href="#is_unbalance">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``unbalance``, ``unbalanced_sets``

   -  used only in ``binary`` and ``multiclassova`` applications

   -  set this to ``true`` if training data are unbalanced

   -  **Note**: while enabling this should increase the overall performance metric of your model, it will also result in poor estimates of the individual class probabilities

   -  **Note**: this parameter cannot be used at the same time as ``scale_pos_weight``, choose only **one** of them

-  ``scale_pos_weight`` :raw-html:`<a id="scale_pos_weight" title="Permalink to this parameter" href="#scale_pos_weight">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, constraints: ``scale_pos_weight > 0.0``

   -  used only in ``binary`` and ``multiclassova`` applications

   -  weight of labels with positive class

   -  **Note**: while enabling this should increase the overall performance metric of your model, it will also result in poor estimates of the individual class probabilities

   -  **Note**: this parameter cannot be used at the same time as ``is_unbalance``, choose only **one** of them

-  ``sigmoid`` :raw-html:`<a id="sigmoid" title="Permalink to this parameter" href="#sigmoid">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, constraints: ``sigmoid > 0.0``

   -  used only in ``binary`` and ``multiclassova`` classification and in ``lambdarank`` applications

   -  parameter for the sigmoid function

-  ``boost_from_average`` :raw-html:`<a id="boost_from_average" title="Permalink to this parameter" href="#boost_from_average">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool

   -  used only in ``regression``, ``binary``, ``multiclassova`` and ``cross-entropy`` applications

   -  adjusts initial score to the mean of labels for faster convergence

-  ``reg_sqrt`` :raw-html:`<a id="reg_sqrt" title="Permalink to this parameter" href="#reg_sqrt">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  used only in ``regression`` application

   -  used to fit ``sqrt(label)`` instead of the original values; the prediction result is automatically converted back to ``prediction^2``

   -  might be useful in case of large-range labels

-  ``alpha`` :raw-html:`<a id="alpha" title="Permalink to this parameter" href="#alpha">&#x1F517;&#xFE0E;</a>`, default = ``0.9``, type = double, constraints: ``alpha > 0.0``

   -  used only in ``huber`` and ``quantile`` ``regression`` applications

   -  parameter for `Huber loss <https://en.wikipedia.org/wiki/Huber_loss>`__ and `Quantile regression <https://en.wikipedia.org/wiki/Quantile_regression>`__

-  ``fair_c`` :raw-html:`<a id="fair_c" title="Permalink to this parameter" href="#fair_c">&#x1F517;&#xFE0E;</a>`, default = ``1.0``, type = double, constraints: ``fair_c > 0.0``

   -  used only in ``fair`` ``regression`` application

   -  parameter for `Fair loss <https://www.kaggle.com/c/allstate-claims-severity/discussion/24520>`__

-  ``poisson_max_delta_step`` :raw-html:`<a id="poisson_max_delta_step" title="Permalink to this parameter" href="#poisson_max_delta_step">&#x1F517;&#xFE0E;</a>`, default = ``0.7``, type = double, constraints: ``poisson_max_delta_step > 0.0``

   -  used only in ``poisson`` ``regression`` application

   -  parameter for `Poisson regression <https://en.wikipedia.org/wiki/Poisson_regression>`__ to safeguard optimization

-  ``tweedie_variance_power`` :raw-html:`<a id="tweedie_variance_power" title="Permalink to this parameter" href="#tweedie_variance_power">&#x1F517;&#xFE0E;</a>`, default = ``1.5``, type = double, constraints: ``1.0 <= tweedie_variance_power < 2.0``

   -  used only in ``tweedie`` ``regression`` application

   -  used to control the variance of the tweedie distribution

   -  set this closer to ``2`` to shift towards a **Gamma** distribution

   -  set this closer to ``1`` to shift towards a **Poisson** distribution

-  ``lambdarank_truncation_level`` :raw-html:`<a id="lambdarank_truncation_level" title="Permalink to this parameter" href="#lambdarank_truncation_level">&#x1F517;&#xFE0E;</a>`, default = ``30``, type = int, constraints: ``lambdarank_truncation_level > 0``

   -  used only in ``lambdarank`` application

   -  controls the number of top results to focus on during training; refer to "truncation level" in Sec. 3 of the `LambdaMART paper <https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/MSR-TR-2010-82.pdf>`__

   -  this parameter is closely related to the desirable cutoff ``k`` in the metric **NDCG@k** that we aim at optimizing the ranker for. The optimal setting for this parameter is likely to be slightly higher than ``k`` (e.g., ``k + 3``) to include more pairs of documents to train on, but perhaps not too high to avoid deviating too much from the desired target metric **NDCG@k**

-  ``lambdarank_norm`` :raw-html:`<a id="lambdarank_norm" title="Permalink to this parameter" href="#lambdarank_norm">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool

   -  used only in ``lambdarank`` application

   -  set this to ``true`` to normalize the lambdas for different queries, and improve the performance for unbalanced data

   -  set this to ``false`` to enforce the original lambdarank algorithm

-  ``label_gain`` :raw-html:`<a id="label_gain" title="Permalink to this parameter" href="#label_gain">&#x1F517;&#xFE0E;</a>`, default = ``0,1,3,7,15,31,63,...,2^30-1``, type = multi-double

   -  used only in ``lambdarank`` application

   -  relevant gain for labels. For example, the gain of label ``2`` is ``3`` in case of default label gains

   -  separated by ``,``

-  ``lambdarank_position_bias_regularization`` :raw-html:`<a id="lambdarank_position_bias_regularization" title="Permalink to this parameter" href="#lambdarank_position_bias_regularization">&#x1F517;&#xFE0E;</a>`, default = ``0.0``, type = double, constraints: ``lambdarank_position_bias_regularization >= 0.0``

   -  used only in ``lambdarank`` application when positional information is provided and position bias is modeled

   -  larger values reduce the inferred position bias factors

   -  *New in version 4.1.0*
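
The default ``label_gain`` and the cutoff ``k`` in **NDCG@k** can be sketched in plain Python. The function names below are illustrative only, not LightGBM APIs, and the ``1 / log2(position + 1)`` discount (positions counted from 1) is the standard DCG definition, assumed here rather than taken from this page.

.. code-block:: python

   import math

   # default label gains: 2^i - 1 for i = 0..30, i.e. 0,1,3,7,...,2^30-1
   DEFAULT_LABEL_GAIN = [2 ** i - 1 for i in range(31)]


   def dcg_at_k(labels_in_ranked_order, k, label_gain=DEFAULT_LABEL_GAIN):
       """DCG over the top-k results, using integer relevance labels."""
       total = 0.0
       for position, label in enumerate(labels_in_ranked_order[:k], start=1):
           total += label_gain[label] / math.log2(position + 1)
       return total


   def ndcg_at_k(labels_in_ranked_order, k, label_gain=DEFAULT_LABEL_GAIN):
       """DCG divided by the DCG of the ideal (descending-label) ordering."""
       ideal = dcg_at_k(sorted(labels_in_ranked_order, reverse=True), k, label_gain)
       if ideal == 0:
           return 0.0
       return dcg_at_k(labels_in_ranked_order, k, label_gain) / ideal

Here ``DEFAULT_LABEL_GAIN[2]`` is ``3``, matching the example given for ``label_gain`` above.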

Metric Parameters
-----------------

-  ``metric`` :raw-html:`<a id="metric" title="Permalink to this parameter" href="#metric">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = multi-enum, aliases: ``metrics``, ``metric_types``

   -  metric(s) to be evaluated on the evaluation set(s)

      -  ``""`` (empty string or not specified) means that metric corresponding to specified ``objective`` will be used (this is possible only for pre-defined objective functions, otherwise no evaluation metric will be added)

      -  ``"None"`` (string, **not** a ``None`` value) means that no metric will be registered, aliases: ``na``, ``null``, ``custom``

      -  ``l1``, absolute loss, aliases: ``mean_absolute_error``, ``mae``, ``regression_l1``

      -  ``l2``, square loss, aliases: ``mean_squared_error``, ``mse``, ``regression_l2``, ``regression``

      -  ``rmse``, root mean square loss, aliases: ``root_mean_squared_error``, ``l2_root``

      -  ``quantile``, `Quantile regression <https://en.wikipedia.org/wiki/Quantile_regression>`__

      -  ``mape``, `MAPE loss <https://en.wikipedia.org/wiki/Mean_absolute_percentage_error>`__, aliases: ``mean_absolute_percentage_error``

      -  ``huber``, `Huber loss <https://en.wikipedia.org/wiki/Huber_loss>`__

      -  ``fair``, `Fair loss <https://www.kaggle.com/c/allstate-claims-severity/discussion/24520>`__

      -  ``poisson``, negative log-likelihood for `Poisson regression <https://en.wikipedia.org/wiki/Poisson_regression>`__

      -  ``gamma``, negative log-likelihood for **Gamma** regression

      -  ``gamma_deviance``, residual deviance for **Gamma** regression

      -  ``tweedie``, negative log-likelihood for **Tweedie** regression

      -  ``ndcg``, `NDCG <https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG>`__, aliases: ``lambdarank``, ``rank_xendcg``, ``xendcg``, ``xe_ndcg``, ``xe_ndcg_mart``, ``xendcg_mart``

      -  ``map``, `MAP <https://makarandtapaswi.wordpress.com/2012/07/02/intuition-behind-average-precision-and-map/>`__, aliases: ``mean_average_precision``

      -  ``auc``, `AUC <https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_the_curve>`__

      -  ``average_precision``, `average precision score <https://scikit-learn.org/stable/modules/generated/sklearn.metrics.average_precision_score.html>`__

      -  ``binary_logloss``, `log loss <https://en.wikipedia.org/wiki/Cross_entropy>`__, aliases: ``binary``

      -  ``binary_error``, for one sample: ``0`` for correct classification, ``1`` for incorrect classification

      -  ``auc_mu``, `AUC-mu <http://proceedings.mlr.press/v97/kleiman19a/kleiman19a.pdf>`__

      -  ``multi_logloss``, log loss for multi-class classification, aliases: ``multiclass``, ``softmax``, ``multiclassova``, ``multiclass_ova``, ``ova``, ``ovr``

      -  ``multi_error``, error rate for multi-class classification

      -  ``cross_entropy``, cross-entropy (with optional linear weights), aliases: ``xentropy``

      -  ``cross_entropy_lambda``, "intensity-weighted" cross-entropy, aliases: ``xentlambda``

      -  ``kullback_leibler``, `Kullback-Leibler divergence <https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence>`__, aliases: ``kldiv``

   -  multiple metrics are supported, separated by ``,``

-  ``metric_freq`` :raw-html:`<a id="metric_freq" title="Permalink to this parameter" href="#metric_freq">&#x1F517;&#xFE0E;</a>`, default = ``1``, type = int, aliases: ``output_freq``, constraints: ``metric_freq > 0``

   -  frequency for metric output

   -  **Note**: can be used only in CLI version

-  ``is_provide_training_metric`` :raw-html:`<a id="is_provide_training_metric" title="Permalink to this parameter" href="#is_provide_training_metric">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``training_metric``, ``is_training_metric``, ``train_metric``

   -  set this to ``true`` to output metric results on the training dataset

   -  **Note**: can be used only in CLI version

-  ``eval_at`` :raw-html:`<a id="eval_at" title="Permalink to this parameter" href="#eval_at">&#x1F517;&#xFE0E;</a>`, default = ``1,2,3,4,5``, type = multi-int, aliases: ``ndcg_eval_at``, ``ndcg_at``, ``map_eval_at``, ``map_at``

   -  used only with ``ndcg`` and ``map`` metrics

   -  `NDCG <https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG>`__ and `MAP <https://makarandtapaswi.wordpress.com/2012/07/02/intuition-behind-average-precision-and-map/>`__ evaluation positions, separated by ``,``

-  ``multi_error_top_k`` :raw-html:`<a id="multi_error_top_k" title="Permalink to this parameter" href="#multi_error_top_k">&#x1F517;&#xFE0E;</a>`, default = ``1``, type = int, constraints: ``multi_error_top_k > 0``

   -  used only with ``multi_error`` metric

   -  threshold for top-k multi-error metric

   -  the error on each sample is ``0`` if the true class is among the top ``multi_error_top_k`` predictions, and ``1`` otherwise

      -  more precisely, the error on a sample is ``0`` if there are at least ``num_classes - multi_error_top_k`` predictions strictly less than the prediction on the true class

   -  when ``multi_error_top_k=1`` this is equivalent to the usual multi-error metric

-  ``auc_mu_weights`` :raw-html:`<a id="auc_mu_weights" title="Permalink to this parameter" href="#auc_mu_weights">&#x1F517;&#xFE0E;</a>`, default = ``None``, type = multi-double

   -  used only with ``auc_mu`` metric

   -  list representing flattened matrix (in row-major order) giving loss weights for classification errors

   -  list should have ``n * n`` elements, where ``n`` is the number of classes

   -  the matrix co-ordinate ``[i, j]`` should correspond to the ``i * n + j``-th element of the list

   -  if not specified, will use equal weights for all classes
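
The rules for ``multi_error_top_k`` and the layout of ``auc_mu_weights`` can be written out as two small helpers. Both functions are hypothetical names used for illustration, not part of LightGBM; the first follows the "more precisely" rule stated above, and the second assumes the weights start out as a nested ``n x n`` list.

.. code-block:: python

   def top_k_error(scores, true_class, multi_error_top_k=1):
       """Return 0 if the true class counts as a top-k prediction, else 1."""
       num_classes = len(scores)
       # count predictions strictly less than the prediction on the true class
       strictly_less = sum(1 for s in scores if s < scores[true_class])
       return 0 if strictly_less >= num_classes - multi_error_top_k else 1


   def flatten_auc_mu_weights(matrix):
       """Flatten an n x n weight matrix in row-major order."""
       n = len(matrix)
       if any(len(row) != n for row in matrix):
           raise ValueError("auc_mu_weights must describe a square n x n matrix")
       # element [i][j] ends up at index i * n + j of the flat list
       return [w for row in matrix for w in row]

Note that under the strict comparison, a tie between the true class and another class counts as an error when ``multi_error_top_k=1``.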

Network Parameters
------------------

-  ``num_machines`` :raw-html:`<a id="num_machines" title="Permalink to this parameter" href="#num_machines">&#x1F517;&#xFE0E;</a>`, default = ``1``, type = int, aliases: ``num_machine``, constraints: ``num_machines > 0``

   -  the number of machines for distributed learning application

   -  this parameter must be set in both the **socket** and **MPI** versions

-  ``local_listen_port`` :raw-html:`<a id="local_listen_port" title="Permalink to this parameter" href="#local_listen_port">&#x1F517;&#xFE0E;</a>`, default = ``12400 (random for Dask-package)``, type = int, aliases: ``local_port``, ``port``, constraints: ``local_listen_port > 0``

   -  TCP listen port for local machines

   -  **Note**: don't forget to allow this port in firewall settings before training

-  ``time_out`` :raw-html:`<a id="time_out" title="Permalink to this parameter" href="#time_out">&#x1F517;&#xFE0E;</a>`, default = ``120``, type = int, constraints: ``time_out > 0``

   -  socket time-out in minutes

-  ``machine_list_filename`` :raw-html:`<a id="machine_list_filename" title="Permalink to this parameter" href="#machine_list_filename">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``machine_list_file``, ``machine_list``, ``mlist``

   -  path of file that lists machines for this distributed learning application

   -  each line contains one IP and one port for one machine. The format is ``ip port`` (space as a separator)

   -  **Note**: can be used only in CLI version

-  ``machines`` :raw-html:`<a id="machines" title="Permalink to this parameter" href="#machines">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``workers``, ``nodes``

   -  list of machines in the following format: ``ip1:port1,ip2:port2``
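
The two ways of listing machines described above differ only in their separators. The sketch below builds both from one list of ``(ip, port)`` pairs; the helper names and the IP addresses are made up for illustration.

.. code-block:: python

   def machines_param(workers):
       """Build the ``machines`` string: ip1:port1,ip2:port2."""
       return ",".join(f"{ip}:{port}" for ip, port in workers)


   def machine_list_lines(workers):
       """Build machine list file lines: one 'ip port' pair per line."""
       return [f"{ip} {port}" for ip, port in workers]


   # hypothetical two-machine cluster using the default listen port
   workers = [("192.168.0.1", 12400), ("192.168.0.2", 12400)]
   params = {
       "num_machines": len(workers),
       "local_listen_port": 12400,
       "machines": machines_param(workers),
   }
   print(params["machines"])

Deriving ``num_machines`` from the same list keeps it consistent with the ``machines`` string.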

GPU Parameters
--------------

-  ``gpu_platform_id`` :raw-html:`<a id="gpu_platform_id" title="Permalink to this parameter" href="#gpu_platform_id">&#x1F517;&#xFE0E;</a>`, default = ``-1``, type = int

   -  used only with ``gpu`` device type

   -  OpenCL platform ID. Usually each GPU vendor exposes one OpenCL platform

   -  ``-1`` means the system-wide default platform

   -  **Note**: refer to `GPU Targets <./GPU-Targets.rst#query-opencl-devices-in-your-system>`__ for more details

-  ``gpu_device_id`` :raw-html:`<a id="gpu_device_id" title="Permalink to this parameter" href="#gpu_device_id">&#x1F517;&#xFE0E;</a>`, default = ``-1``, type = int

   -  OpenCL device ID in the specified platform or CUDA device ID. Each GPU in the selected platform has a unique device ID

   -  ``-1`` means the default device in the selected platform

   -  **Note**: refer to `GPU Targets <./GPU-Targets.rst#query-opencl-devices-in-your-system>`__ for more details

-  ``gpu_use_dp`` :raw-html:`<a id="gpu_use_dp" title="Permalink to this parameter" href="#gpu_use_dp">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

   -  set this to ``true`` to use double precision math on GPU (by default single precision is used)

   -  **Note**: can be used only in OpenCL implementation (``device_type="gpu"``), in CUDA implementation only double precision is currently supported

-  ``num_gpu`` :raw-html:`<a id="num_gpu" title="Permalink to this parameter" href="#num_gpu">&#x1F517;&#xFE0E;</a>`, default = ``1``, type = int, constraints: ``num_gpu > 0``

   -  number of GPUs

   -  **Note**: can be used only in CUDA implementation (``device_type="cuda"``)

.. end params list

Others
------

Continued Training with Input Score
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

LightGBM supports continued training with initial scores.
It uses an additional file to store these initial scores, like the following:

::

    0.5
    -0.1
    0.9
    ...

This means the initial score of the first data row is ``0.5``, the second is ``-0.1``, and so on.
The initial score file corresponds to the data file line by line, with one score per line.

If the data file is named ``train.txt``, the initial score file should be named ``train.txt.init`` and placed in the same folder as the data file.
In this case, LightGBM will load the initial score file automatically if it exists.

If a binary data file exists for the raw data file ``train.txt``, for example one named ``train.txt.bin``, then the initial score file should be named ``train.txt.bin.init``.
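
A minimal sketch of producing such a file from Python, assuming the scores are already available as a list in data-row order. The helper ``write_init_scores`` is hypothetical, and ``train.txt`` is only a placeholder file name.

.. code-block:: python

   import os
   import tempfile


   def write_init_scores(data_path, scores):
       """Write one initial score per line to '<data_path>.init'."""
       init_path = data_path + ".init"
       with open(init_path, "w") as f:
           for score in scores:
               f.write(f"{score}\n")
       return init_path


   # demo in a temporary directory so nothing is left behind
   with tempfile.TemporaryDirectory() as tmp:
       path = write_init_scores(os.path.join(tmp, "train.txt"), [0.5, -0.1, 0.9])
       with open(path) as f:
           written = f.read()
   print(os.path.basename(path))

Because the suffix is appended to the full data file name, passing a path ending in ``train.txt.bin`` yields ``train.txt.bin.init``, as required for binary data files.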

Weight Data
~~~~~~~~~~~

LightGBM supports weighted training.
It uses an additional file to store weight data, like the following:

::

    1.0
    0.5
    0.8
    ...

This means the weight of the first data row is ``1.0``, the second is ``0.5``, and so on. Weights should be non-negative.

The weight file corresponds to the data file line by line, with one weight per line.

If the data file is named ``train.txt``, the weight file should be named ``train.txt.weight`` and placed in the same folder as the data file.
In this case, LightGBM will load the weight file automatically if it exists.

You can also include a weight column in your data file.
Please refer to the ``weight_column`` `parameter <#weight_column>`__ above.
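
Before writing a weight file, it can help to check the two requirements stated above: one weight per data row, and no negative values. The function below is an illustrative sketch, not a LightGBM API; rejecting NaN is an extra assumption of this sketch.

.. code-block:: python

   import math


   def weight_file_lines(weights, num_rows):
       """Validate weights and return the lines of a weight file."""
       if len(weights) != num_rows:
           raise ValueError("need exactly one weight per data row")
       for w in weights:
           # assumption: NaN is rejected along with negative values
           if math.isnan(w) or w < 0:
               raise ValueError(f"weights should be non-negative, got {w}")
       return [str(w) for w in weights]


   print("\n".join(weight_file_lines([1.0, 0.5, 0.8], num_rows=3)))

The returned lines can then be written to ``train.txt.weight`` next to the data file.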

Query Data
~~~~~~~~~~

Learning to rank requires query information for the training data.

LightGBM uses an additional file to store query data, like the following:

::

    27
    18
    67
    ...

For wrapper libraries such as the Python and R packages, this information can also be provided as an array-like via the Dataset parameter ``group``.

::

    [27, 18, 67, ...]

For example, if you have a 112-document dataset with ``group = [27, 18, 67]``, that means that you have 3 groups, where the first 27 records are in the first group, records 28-45 are in the second group, and records 46-112 are in the third group.
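
The record ranges in that example follow from a running sum over the group sizes. A short sketch, with ``group_boundaries`` as a hypothetical helper name:

.. code-block:: python

   from itertools import accumulate


   def group_boundaries(group):
       """Return 1-based inclusive (first, last) record numbers per group."""
       ends = list(accumulate(group))
       starts = [1] + [end + 1 for end in ends[:-1]]
       return list(zip(starts, ends))


   print(group_boundaries([27, 18, 67]))
   # [(1, 27), (28, 45), (46, 112)]

The last boundary also gives the total number of records, which should equal the number of rows in the data file.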

**Note**: data should be ordered by the query.

If the data file is named ``train.txt``, the query file should be named ``train.txt.query`` and placed in the same folder as the data file.
In this case, LightGBM will load the query file automatically if it exists.

You can also include a query/group ID column in your data file.
Please refer to the ``group_column`` `parameter <#group_column>`__ above.
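
If your data carries a per-row query ID rather than group sizes, the sizes can be derived once the rows are ordered by query. The sketch below is illustrative only (``group_sizes`` is not a LightGBM function) and raises an error when the rows of a query are not contiguous.

.. code-block:: python

   from itertools import groupby


   def group_sizes(query_ids):
       """Collapse per-row query IDs (ordered by query) into group sizes."""
       sizes = []
       seen = set()
       for qid, rows in groupby(query_ids):
           if qid in seen:
               raise ValueError(f"rows of query {qid!r} are not contiguous")
           seen.add(qid)
           sizes.append(sum(1 for _ in rows))
       return sizes


   print(group_sizes(["q1", "q1", "q2", "q3", "q3", "q3"]))
   # [2, 1, 3]

The resulting list has the same form as the query file contents and the ``group`` array shown above.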