# Advanced Topics

## Missing Value Handling

* LightGBM handles missing values by default. To disable this, set ```use_missing=false```.
* LightGBM uses NA (NaN) to represent missing values by default. To use zero instead, set ```zero_as_missing=true```.
* When ```zero_as_missing=false``` (default), values not stored in sparse matrices (and LibSVM-format files) are treated as zeros.
* When ```zero_as_missing=true```, NA and zeros (including values not stored in sparse matrices and LibSVM-format files) are treated as missing.
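
The rules above can be summarized as a small decision function. This is only a sketch of the documented behavior, not LightGBM's internal code; the function name `is_treated_as_missing` and the use of `None` to stand for an entry that is not stored in a sparse matrix are assumptions made for illustration.

```python
import math


def is_treated_as_missing(value, use_missing=True, zero_as_missing=False):
    """Return True if `value` counts as missing under the rules listed above.

    `value` is a float, or None for an entry that is not stored in a
    sparse matrix (a hypothetical convention used only in this sketch).
    """
    # With missing value handling disabled, nothing is treated as missing.
    if not use_missing:
        return False
    # Entries that are not stored in a sparse matrix are read as zeros.
    if value is None:
        value = 0.0
    # NA (NaN) is missing whenever missing value handling is enabled.
    if math.isnan(value):
        return True
    # Zeros are missing only when zero_as_missing is enabled.
    return zero_as_missing and value == 0.0


print(is_treated_as_missing(float("nan")))              # NaN, default settings
print(is_treated_as_missing(0.0))                       # zero, default settings
print(is_treated_as_missing(None, zero_as_missing=True))  # unstored sparse entry
```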

## Categorical Feature Support

* LightGBM can offer good accuracy with native categorical features. Unlike one-hot encoding, LightGBM can find the optimal split of a categorical feature, which can yield much better accuracy than a one-hot encoded solution.
* Use `categorical_feature` to specify the categorical features. Refer to the parameter `categorical_feature` in [Parameters](./Parameters.md).
* Categorical values must first be converted to `int` type, and only non-negative numbers are supported. It is better to convert them into a contiguous range.
* Use `max_cat_group` and `cat_smooth_ratio` to deal with over-fitting (when #data is small or #category is large).
* For categorical features with high cardinality (#category is large), it is better to convert them to numerical features.
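
As a minimal sketch of the conversion step, the helper below maps arbitrary category labels to non-negative integer codes in a contiguous range `0..K-1`. The helper name `encode_categories` and the sample data are hypothetical and are not part of LightGBM.

```python
def encode_categories(values):
    """Map category labels to contiguous non-negative integer codes.

    Codes are assigned in order of first appearance, so K distinct
    labels receive the codes 0, 1, ..., K-1.
    """
    mapping = {}
    codes = []
    for value in values:
        if value not in mapping:
            # The next unused code equals the number of labels seen so far.
            mapping[value] = len(mapping)
        codes.append(mapping[value])
    return codes, mapping


colors = ["red", "green", "red", "blue", "green"]
codes, mapping = encode_categories(colors)
print(codes)    # [0, 1, 0, 2, 1]
print(mapping)  # {'red': 0, 'green': 1, 'blue': 2}
```

The encoded column can then be listed in `categorical_feature`. The same mapping should be reused when encoding validation or prediction data so that each code keeps its meaning.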

## LambdaRank

* The label should be of `int` type, and larger numbers represent higher relevance (e.g. 0:bad, 1:fair, 2:good, 3:perfect).
* Use `label_gain` to set the gain (weight) of each `int` label.
* Use `max_position` to set the NDCG optimization position.
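
To illustrate how integer labels, per-label gains, and a cutoff position interact, here is a sketch of the standard NDCG@k computation. It is not LightGBM's implementation. The function names are hypothetical, the `2^i - 1` gain table mirrors the commonly documented default for `label_gain`, and returning `0.0` when no relevant document exists is a convention chosen only for this sketch.

```python
import math


def default_label_gain(max_label):
    """Gain table where label i has gain 2**i - 1 (assumed default shape)."""
    return [2 ** i - 1 for i in range(max_label + 1)]


def dcg_at_k(labels, label_gain, k):
    """Discounted cumulative gain over the first k ranked labels."""
    return sum(
        label_gain[label] / math.log2(position + 2)
        for position, label in enumerate(labels[:k])
    )


def ndcg_at_k(ranked_labels, label_gain, k):
    """NDCG@k for labels listed in the order the model ranked them."""
    # The ideal ranking places the highest-gain labels first.
    ideal = sorted(ranked_labels, key=lambda label: label_gain[label], reverse=True)
    ideal_dcg = dcg_at_k(ideal, label_gain, k)
    if ideal_dcg == 0:
        return 0.0  # convention chosen for this sketch only
    return dcg_at_k(ranked_labels, label_gain, k) / ideal_dcg


gains = default_label_gain(3)  # [0, 1, 3, 7] for labels 0..3
print(gains)
print(ndcg_at_k([3, 2, 0, 1], gains, k=2))  # top two positions are already ideal
print(ndcg_at_k([0, 3], gains, k=2))        # best document ranked second
```

A larger gain for a label makes misplacing documents with that label more costly, and the cutoff `k` plays the role that the position setting plays in the text above: only the top `k` ranks contribute to the score.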

## Parameters Tuning

* Refer to [Parameters Tuning](./Parameters-tuning.md).

## GPU Support

* Refer to [GPU Tutorial](./GPU-Tutorial.md) and [GPU Targets](./GPU-Targets.rst).

## Parallel Learning

* Refer to [Parallel Learning Guide](https://github.com/Microsoft/LightGBM/wiki/Parallel-Learning-Guide).