Advanced Topics
===============

Missing Value Handling
----------------------

-  LightGBM handles missing values by default. Disable this behavior by setting ``use_missing=false``.
8

-  LightGBM uses NA (NaN) to represent missing values by default. Change it to use zero by setting ``zero_as_missing=true``.

-  When ``zero_as_missing=false`` (default), the unrecorded values in sparse matrices (and LibSVM files) are treated as zeros.
12

-  When ``zero_as_missing=true``, NA and zeros (including unrecorded values in sparse matrices and LibSVM files) are treated as missing.

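As a minimal sketch of the settings above, the parameters can be collected in a dictionary for LightGBM's training routine (the keys are the parameters described above; actually calling ``lightgbm.train`` with them is assumed and not shown here):

```python
# Sketch: parameter dictionaries controlling missing value handling.
# Passing one of these to lightgbm.train(params, dataset) is assumed.

# Default behavior: NA (NaN) is treated as missing.
params_default = {"use_missing": True, "zero_as_missing": False}

# Treat zeros (including unrecorded sparse entries) as missing instead.
params_zero_missing = {"use_missing": True, "zero_as_missing": True}

# Disable missing value handling entirely.
params_no_missing = {"use_missing": False}
```
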
Categorical Feature Support
---------------------------

-  LightGBM offers good accuracy with integer-encoded categorical features. LightGBM applies
   `Fisher (1958) <https://www.tandfonline.com/doi/abs/10.1080/01621459.1958.10501479>`_
   to find the optimal split over categories as
   `described here <./Features.rst#optimal-split-for-categorical-features>`_. This often performs better than one-hot encoding.

-  Use ``categorical_feature`` to specify the categorical features.
   Refer to the parameter ``categorical_feature`` in `Parameters <./Parameters.rst#categorical_feature>`__.

-  Categorical features must be encoded as non-negative integers (``int``) less than ``Int32.MaxValue`` (2147483647).
   It is best to use a contiguous range of integers starting from zero.

-  Use ``min_data_per_group`` and ``cat_smooth`` to deal with over-fitting (when ``#data`` is small or ``#category`` is large).
-  For a categorical feature with high cardinality (``#category`` is large), it often works best to
   treat the feature as numeric, either by simply ignoring the categorical interpretation of the integers or
   by embedding the categories in a low-dimensional numeric space.

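As a sketch of the encoding recommendation above (hypothetical data; any stable mapping works), a contiguous zero-based integer encoding can be produced like this, after which the encoded column would be listed in ``categorical_feature``:

```python
# Sketch: encode string categories as a contiguous range of
# non-negative integers starting from zero, in first-seen order.
colors = ["red", "green", "blue", "green", "red"]  # hypothetical feature

codes = {}      # category -> integer code
encoded = []    # integer-encoded column
for c in colors:
    if c not in codes:
        codes[c] = len(codes)
    encoded.append(codes[c])

print(encoded)  # [0, 1, 2, 1, 0]
```
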
LambdaRank
----------

-  The label should be of type ``int``, such that larger numbers correspond to higher relevance (e.g. 0:bad, 1:fair, 2:good, 3:perfect).

-  Use ``label_gain`` to set the gain (weight) of each ``int`` label.

-  Use ``lambdarank_truncation_level`` to truncate the max DCG.
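To illustrate ``label_gain``: by default LightGBM assigns gain ``2^label - 1`` to each relevance label, so the four example labels above map to gains 0, 1, 3, 7. A sketch of reproducing (or overriding) that default:

```python
# Sketch: LightGBM's default label_gain follows 2^label - 1,
# so relevance labels 0..3 map to gains 0, 1, 3, 7.
labels = [0, 1, 2, 3]
default_gain = [2 ** l - 1 for l in labels]
print(default_gain)  # [0, 1, 3, 7]

# Passing an equivalent explicit parameter is assumed to look like:
params = {"objective": "lambdarank", "label_gain": default_gain}
```
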
Cost Efficient Gradient Boosting
--------------------------------

`Cost Efficient Gradient Boosting <https://papers.nips.cc/paper/6753-cost-efficient-gradient-boosting.pdf>`_ (CEGB) makes it possible to penalise boosting based on the cost of obtaining feature values.
CEGB penalises learning in the following ways:

- Each time a tree is split, a penalty of ``cegb_penalty_split`` is applied.
- When a feature is used for the first time, ``cegb_penalty_feature_coupled`` is applied. This penalty can be different for each feature and should be specified as one ``double`` per feature.
- When a feature is used for the first time for a data row, ``cegb_penalty_feature_lazy`` is applied. Like ``cegb_penalty_feature_coupled``, this penalty is specified as one ``double`` per feature.

Each of the penalties above is scaled by ``cegb_tradeoff``.
Using this parameter, it is possible to change the overall strength of the CEGB penalties by changing only one parameter.
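As a sketch of how the penalties above combine (hypothetical numbers; this mirrors the description, not LightGBM's internal implementation):

```python
# Sketch: combining CEGB penalties for one candidate split,
# scaled by cegb_tradeoff. All values are hypothetical.
cegb_tradeoff = 0.5
cegb_penalty_split = 1.0
penalty_feature_coupled = [2.0, 0.5]  # one double per feature
penalty_feature_lazy = [0.1, 0.05]    # one double per feature

# Cost of a split that uses feature 0 for the first time, on a leaf
# covering 100 data rows that have not yet accessed feature 0:
n_new_rows = 100
cost = cegb_tradeoff * (
    cegb_penalty_split
    + penalty_feature_coupled[0]
    + penalty_feature_lazy[0] * n_new_rows
)
print(cost)  # 0.5 * (1.0 + 2.0 + 0.1 * 100) = 6.5
```
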

Parameters Tuning
-----------------

-  Refer to `Parameters Tuning <./Parameters-Tuning.rst>`__.

.. _Parallel Learning:

Distributed Learning
--------------------

-  Refer to `Distributed Learning Guide <./Parallel-Learning-Guide.rst>`__.

GPU Support
-----------

-  Refer to `GPU Tutorial <./GPU-Tutorial.rst>`__ and `GPU Targets <./GPU-Targets.rst>`__.

Recommendations for gcc Users (MinGW, \*nix)
--------------------------------------------

-  Refer to `gcc Tips <./gcc-Tips.rst>`__.