- ``device_type`` :raw-html:`<a id="device_type" title="Permalink to this parameter" href="#device_type">🔗︎</a>`, default = ``cpu``, type = enum, options: ``cpu``, ``gpu``, aliases: ``device``
- device for tree learning; you can use GPU to achieve faster learning
...
...
@@ -174,7 +177,7 @@ Core Parameters
- **Note**: refer to `Installation Guide <./Installation-Guide.rst#build-gpu-version>`__ to build LightGBM with GPU support
- ``seed``, default = ``0``, type = int, aliases: ``random_seed``
- ``seed`` :raw-html:`<a id="seed" title="Permalink to this parameter" href="#seed">🔗︎</a>`, default = ``0``, type = int, aliases: ``random_seed``
- this seed is used to generate other seeds, e.g. ``data_random_seed``, ``feature_fraction_seed``
...
...
@@ -183,21 +186,21 @@ Core Parameters
Learning Control Parameters
---------------------------
- ``max_depth``, default = ``-1``, type = int
- ``max_depth`` :raw-html:`<a id="max_depth" title="Permalink to this parameter" href="#max_depth">🔗︎</a>`, default = ``-1``, type = int
- limit the max depth for tree model. This is used to deal with over-fitting when ``#data`` is small. Tree still grows leaf-wise
- ``feature_fraction`` :raw-html:`<a id="feature_fraction" title="Permalink to this parameter" href="#feature_fraction">🔗︎</a>`, default = ``1.0``, type = double, aliases: ``sub_feature``, ``colsample_bytree``, constraints: ``0.0 < feature_fraction <= 1.0``
- LightGBM will randomly select a subset of features on each iteration if ``feature_fraction`` is smaller than ``1.0``. For example, if you set it to ``0.8``, LightGBM will select 80% of features before training each tree
...
...
@@ -227,17 +230,17 @@ Learning Control Parameters
- can be used to deal with over-fitting
- ``feature_fraction_seed``, default = ``2``, type = int
- ``feature_fraction_seed`` :raw-html:`<a id="feature_fraction_seed" title="Permalink to this parameter" href="#feature_fraction_seed">🔗︎</a>`, default = ``2``, type = int
- ``skip_drop`` :raw-html:`<a id="skip_drop" title="Permalink to this parameter" href="#skip_drop">🔗︎</a>`, default = ``0.5``, type = double, constraints: ``0.0 <= skip_drop <= 1.0``
- used only in ``dart``
- probability of skipping drop
- ``xgboost_dart_mode``, default = ``false``, type = bool
- ``xgboost_dart_mode`` :raw-html:`<a id="xgboost_dart_mode" title="Permalink to this parameter" href="#xgboost_dart_mode">🔗︎</a>`, default = ``false``, type = bool
- used only in ``dart``
- set this to ``true``, if you want to use xgboost dart mode
- ``uniform_drop``, default = ``false``, type = bool
- ``uniform_drop`` :raw-html:`<a id="uniform_drop" title="Permalink to this parameter" href="#uniform_drop">🔗︎</a>`, default = ``false``, type = bool
- used only in ``dart``
- set this to ``true``, if you want to use uniform drop
- ``drop_seed``, default = ``4``, type = int
- ``drop_seed`` :raw-html:`<a id="drop_seed" title="Permalink to this parameter" href="#drop_seed">🔗︎</a>`, default = ``4``, type = int
- ``monotone_constraints`` :raw-html:`<a id="monotone_constraints" title="Permalink to this parameter" href="#monotone_constraints">🔗︎</a>`, default = ``None``, type = multi-int, aliases: ``mc``, ``monotone_constraint``
- used for constraints of monotonic features
...
...
@@ -347,13 +350,13 @@ Learning Control Parameters
- you need to specify all features in order. For example, ``mc=-1,0,1`` means decreasing for the 1st feature, no constraint for the 2nd feature and increasing for the 3rd feature
- ``bin_construct_sample_cnt`` :raw-html:`<a id="bin_construct_sample_cnt" title="Permalink to this parameter" href="#bin_construct_sample_cnt">🔗︎</a>`, default = ``200000``, type = int, aliases: ``subsample_for_bin``, constraints: ``bin_construct_sample_cnt > 0``
- number of data points sampled to construct histogram bins
...
...
@@ -394,23 +397,23 @@ IO Parameters
- set this to larger value if data is very sparse
- ``histogram_pool_size``, default = ``-1.0``, type = double
- ``histogram_pool_size`` :raw-html:`<a id="histogram_pool_size" title="Permalink to this parameter" href="#histogram_pool_size">🔗︎</a>`, default = ``-1.0``, type = double
- max cache size in MB for historical histogram
- ``< 0`` means no limit
- ``data_random_seed``, default = ``1``, type = int
- ``data_random_seed`` :raw-html:`<a id="data_random_seed" title="Permalink to this parameter" href="#data_random_seed">🔗︎</a>`, default = ``1``, type = int
- random seed for data partition in parallel learning (excluding the ``feature_parallel`` mode)
- ``output_model`` :raw-html:`<a id="output_model" title="Permalink to this parameter" href="#output_model">🔗︎</a>`, default = ``LightGBM_model.txt``, type = string, aliases: ``model_output``, ``model_out``
- filename of output model in training
- **Note**: can be used only in CLI version
- ``snapshot_freq``, default = ``-1``, type = int
- ``snapshot_freq`` :raw-html:`<a id="snapshot_freq" title="Permalink to this parameter" href="#snapshot_freq">🔗︎</a>`, default = ``-1``, type = int
- ``enable_bundle`` :raw-html:`<a id="enable_bundle" title="Permalink to this parameter" href="#enable_bundle">🔗︎</a>`, default = ``true``, type = bool, aliases: ``is_enable_bundle``, ``bundle``
- set this to ``false`` to disable Exclusive Feature Bundling (EFB), which is described in `LightGBM: A Highly Efficient Gradient Boosting Decision Tree <https://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision-tree>`__
- **Note**: disabling this may slow down training for sparse datasets
- ``sparse_threshold`` :raw-html:`<a id="sparse_threshold" title="Permalink to this parameter" href="#sparse_threshold">🔗︎</a>`, default = ``0.8``, type = double, constraints: ``0.0 < sparse_threshold <= 1.0``
- the threshold on the percentage of zero elements above which a feature is treated as sparse
- ``use_missing``, default = ``true``, type = bool
- ``use_missing`` :raw-html:`<a id="use_missing" title="Permalink to this parameter" href="#use_missing">🔗︎</a>`, default = ``true``, type = bool
- set this to ``false`` to disable the special handling of missing values
- ``zero_as_missing``, default = ``false``, type = bool
- ``zero_as_missing`` :raw-html:`<a id="zero_as_missing" title="Permalink to this parameter" href="#zero_as_missing">🔗︎</a>`, default = ``false``, type = bool
- set this to ``true`` to treat all zeros as missing values (including the unshown values in libsvm/sparse matrices)
- set this to ``false`` to use ``na`` for representing missing values
- ``two_round`` :raw-html:`<a id="two_round" title="Permalink to this parameter" href="#two_round">🔗︎</a>`, default = ``false``, type = bool, aliases: ``two_round_loading``, ``use_two_round_loading``
- set this to ``true`` if data file is too big to fit in memory
- by default, LightGBM will map the data file to memory and load features from memory. This provides faster data loading, but may cause an out-of-memory error when the data file is very big
- ``enable_load_from_binary_file`` :raw-html:`<a id="enable_load_from_binary_file" title="Permalink to this parameter" href="#enable_load_from_binary_file">🔗︎</a>`, default = ``true``, type = bool, aliases: ``load_from_binary_file``, ``binary_load``, ``load_binary``
- set this to ``true`` to enable autoloading from previously saved binary datasets
- set this to ``false`` to ignore binary datasets
- ``header``, default = ``false``, type = bool, aliases: ``has_header``
- ``header`` :raw-html:`<a id="header" title="Permalink to this parameter" href="#header">🔗︎</a>`, default = ``false``, type = bool, aliases: ``has_header``
- set this to ``true`` if input data has header
- ``label_column``, default = ``""``, type = int or string, aliases: ``label``
- ``label_column`` :raw-html:`<a id="label_column" title="Permalink to this parameter" href="#label_column">🔗︎</a>`, default = ``""``, type = int or string, aliases: ``label``
- used to specify the label column
...
...
@@ -518,7 +521,7 @@ IO Parameters
- add a prefix ``name:`` for column name, e.g. ``label=name:is_click``
- ``weight_column``, default = ``""``, type = int or string, aliases: ``weight``
- ``weight_column`` :raw-html:`<a id="weight_column" title="Permalink to this parameter" href="#weight_column">🔗︎</a>`, default = ``""``, type = int or string, aliases: ``weight``
- used to specify the weight column
...
...
@@ -528,7 +531,7 @@ IO Parameters
- **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``, e.g. when label is column\_0, and weight is column\_1, the correct parameter is ``weight=0``
- ``group_column``, default = ``""``, type = int or string, aliases: ``group``, ``group_id``, ``query_column``, ``query``, ``query_id``
- ``group_column`` :raw-html:`<a id="group_column" title="Permalink to this parameter" href="#group_column">🔗︎</a>`, default = ``""``, type = int or string, aliases: ``group``, ``group_id``, ``query_column``, ``query``, ``query_id``
- used to specify the query/group id column
...
...
@@ -540,7 +543,7 @@ IO Parameters
- **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``, e.g. when label is column\_0 and query\_id is column\_1, the correct parameter is ``query=0``
- ``ignore_column``, default = ``""``, type = multi-int or string, aliases: ``ignore_feature``, ``blacklist``
- ``ignore_column`` :raw-html:`<a id="ignore_column" title="Permalink to this parameter" href="#ignore_column">🔗︎</a>`, default = ``""``, type = multi-int or string, aliases: ``ignore_feature``, ``blacklist``
- used to specify columns to ignore during training
...
...
@@ -552,7 +555,7 @@ IO Parameters
- **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``
- ``categorical_feature``, default = ``""``, type = multi-int or string, aliases: ``cat_feature``, ``categorical_column``, ``cat_column``
- ``categorical_feature`` :raw-html:`<a id="categorical_feature" title="Permalink to this parameter" href="#categorical_feature">🔗︎</a>`, default = ``""``, type = multi-int or string, aliases: ``cat_feature``, ``categorical_column``, ``cat_column``
- used to specify categorical features
...
...
@@ -568,7 +571,7 @@ IO Parameters
- **Note**: the negative values will be treated as **missing values**
- ``predict_contrib`` :raw-html:`<a id="predict_contrib" title="Permalink to this parameter" href="#predict_contrib">🔗︎</a>`, default = ``false``, type = bool, aliases: ``is_predict_contrib``, ``contrib``
- used only in ``prediction`` task
...
...
@@ -590,7 +593,7 @@ IO Parameters
- produces ``#features + 1`` values where the last value is the expected value of the model output over the training data
- ``num_iteration_predict``, default = ``-1``, type = int
- ``num_iteration_predict`` :raw-html:`<a id="num_iteration_predict" title="Permalink to this parameter" href="#num_iteration_predict">🔗︎</a>`, default = ``-1``, type = int
- used only in ``prediction`` task
...
...
@@ -598,25 +601,25 @@ IO Parameters
- ``<= 0`` means no limit
- ``pred_early_stop``, default = ``false``, type = bool
- ``pred_early_stop`` :raw-html:`<a id="pred_early_stop" title="Permalink to this parameter" href="#pred_early_stop">🔗︎</a>`, default = ``false``, type = bool
- used only in ``prediction`` task
- if ``true``, will use early-stopping to speed up the prediction. May affect the accuracy
- ``pred_early_stop_freq``, default = ``10``, type = int
- ``pred_early_stop_freq`` :raw-html:`<a id="pred_early_stop_freq" title="Permalink to this parameter" href="#pred_early_stop_freq">🔗︎</a>`, default = ``10``, type = int
- used only in ``prediction`` task
- the frequency of checking early-stopping prediction
- ``pred_early_stop_margin``, default = ``10.0``, type = double
- ``pred_early_stop_margin`` :raw-html:`<a id="pred_early_stop_margin" title="Permalink to this parameter" href="#pred_early_stop_margin">🔗︎</a>`, default = ``10.0``, type = double
- used only in ``prediction`` task
- the threshold of margin in early-stopping prediction
- ``convert_model_language``, default = ``""``, type = string
- ``convert_model_language`` :raw-html:`<a id="convert_model_language" title="Permalink to this parameter" href="#convert_model_language">🔗︎</a>`, default = ``""``, type = string
- used only in ``convert_model`` task
...
...
@@ -626,7 +629,7 @@ IO Parameters
- **Note**: can be used only in CLI version
- ``convert_model``, default = ``gbdt_prediction.cpp``, type = string, aliases: ``convert_model_file``
- ``convert_model`` :raw-html:`<a id="convert_model" title="Permalink to this parameter" href="#convert_model">🔗︎</a>`, default = ``gbdt_prediction.cpp``, type = string, aliases: ``convert_model_file``
- ``sigmoid`` :raw-html:`<a id="sigmoid" title="Permalink to this parameter" href="#sigmoid">🔗︎</a>`, default = ``1.0``, type = double, constraints: ``sigmoid > 0.0``
- used only in ``binary`` and ``multiclassova`` classification and in ``lambdarank`` applications
- parameter for the sigmoid function
- ``boost_from_average``, default = ``true``, type = bool
- ``boost_from_average`` :raw-html:`<a id="boost_from_average" title="Permalink to this parameter" href="#boost_from_average">🔗︎</a>`, default = ``true``, type = bool
- used only in ``regression``, ``binary`` and ``cross-entropy`` applications
- adjusts initial score to the mean of labels for faster convergence
- ``reg_sqrt``, default = ``false``, type = bool
- ``reg_sqrt`` :raw-html:`<a id="reg_sqrt" title="Permalink to this parameter" href="#reg_sqrt">🔗︎</a>`, default = ``false``, type = bool
- ``alpha`` :raw-html:`<a id="alpha" title="Permalink to this parameter" href="#alpha">🔗︎</a>`, default = ``0.9``, type = double, constraints: ``alpha > 0.0``
- used only in ``huber`` and ``quantile`` ``regression`` applications
- parameter for `Huber loss <https://en.wikipedia.org/wiki/Huber_loss>`__ and `Quantile regression <https://en.wikipedia.org/wiki/Quantile_regression>`__
- ``max_position`` :raw-html:`<a id="max_position" title="Permalink to this parameter" href="#max_position">🔗︎</a>`, default = ``20``, type = int, constraints: ``max_position > 0``
- used only in ``lambdarank`` application
- optimizes `NDCG <https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG>`__ at this position
- ``label_gain``, default = ``0,1,3,7,15,31,63,...,2^30-1``, type = multi-double
- ``label_gain`` :raw-html:`<a id="label_gain" title="Permalink to this parameter" href="#label_gain">🔗︎</a>`, default = ``0,1,3,7,15,31,63,...,2^30-1``, type = multi-double
- ``machines`` :raw-html:`<a id="machines" title="Permalink to this parameter" href="#machines">🔗︎</a>`, default = ``""``, type = string, aliases: ``workers``, ``nodes``
- list of machines in the following format: ``ip1:port1,ip2:port2``
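The ``ip1:port1,ip2:port2`` format described for ``machines`` can be illustrated with a small parser. This is only a sketch of the string layout; ``parse_machines`` is a hypothetical helper written for this example and is not part of LightGBM, and the addresses and ports are placeholders.

```python
from typing import List, Tuple


def parse_machines(machines: str) -> List[Tuple[str, int]]:
    """Split a comma-separated ``ip:port`` list into (host, port) pairs.

    Hypothetical helper for illustration only; not a LightGBM API.
    """
    entries = []
    for item in machines.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma or empty entries
        # Split on the last colon so the port is always the final field.
        host, sep, port = item.rpartition(":")
        if not sep or not host:
            raise ValueError(f"expected 'ip:port', got {item!r}")
        entries.append((host, int(port)))
    return entries


# Placeholder addresses, chosen only for the example.
print(parse_machines("10.0.0.1:12400,10.0.0.2:12400"))
```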
GPU Parameters
--------------
- ``gpu_platform_id``, default = ``-1``, type = int
- ``gpu_platform_id`` :raw-html:`<a id="gpu_platform_id" title="Permalink to this parameter" href="#gpu_platform_id">🔗︎</a>`, default = ``-1``, type = int
- OpenCL platform ID. Usually each GPU vendor exposes one OpenCL platform
- ``-1`` means the system-wide default platform
- ``gpu_device_id``, default = ``-1``, type = int
- ``gpu_device_id`` :raw-html:`<a id="gpu_device_id" title="Permalink to this parameter" href="#gpu_device_id">🔗︎</a>`, default = ``-1``, type = int
- OpenCL device ID in the specified platform. Each GPU in the selected platform has a unique device ID
- ``-1`` means the default device in the selected platform
- ``gpu_use_dp``, default = ``false``, type = bool
- ``gpu_use_dp`` :raw-html:`<a id="gpu_use_dp" title="Permalink to this parameter" href="#gpu_use_dp">🔗︎</a>`, default = ``false``, type = bool
- set this to ``true`` to use double precision math on GPU (by default single precision is used)
...
...
@@ -855,9 +858,12 @@ LightGBM supports continued training with initial scores. It uses an additional
It means the initial score of the first data row is ``0.5``, second is ``-0.1``, and so on.
The initial score file corresponds to the data file line by line, with one score per line.
And if the name of data file is ``train.txt``, the initial score file should be named as ``train.txt.init`` and in the same folder as the data file.
In this case LightGBM will auto load initial score file if it exists.
Otherwise, you should specify the path to the custom named file with initial scores by the ``initscore_filename`` `parameter <#initscore_filename>`__.
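The naming convention above can be sketched in Python: the helper below writes one score per line to a file named after the data file with ``.init`` appended. ``write_init_scores`` is a hypothetical helper for this example (not a LightGBM function), and the data rows are placeholders.

```python
import tempfile
from pathlib import Path


def write_init_scores(data_path, scores):
    """Write one initial score per line next to the data file.

    Hypothetical helper: ``train.txt`` -> ``train.txt.init`` in the same folder.
    """
    init_path = Path(str(data_path) + ".init")
    init_path.write_text("".join(f"{score}\n" for score in scores))
    return init_path


with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "train.txt"
    # Placeholder data rows; only the row count matters for this sketch.
    data_file.write_text("1 0.3 0.7\n0 0.1 0.2\n")
    init_file = write_init_scores(data_file, [0.5, -0.1])
    print(init_file.name)
    print(init_file.read_text())
```

The number of scores should match the number of data rows, since the two files are matched line by line.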
Weight Data
~~~~~~~~~~~
...
...
@@ -872,10 +878,11 @@ LightGBM supports weighted training. It uses an additional file to store weight
It means the weight of the first data row is ``1.0``, second is ``0.5``, and so on.
The weight file corresponds to the data file line by line, with one weight per line.
And if the name of data file is ``train.txt``, the weight file should be named as ``train.txt.weight`` and placed in the same folder as the data file.
In this case LightGBM will load the weight file automatically if it exists.
Also, you can include weight column in your data file. Please refer to parameter ``weight`` in above.
Also, you can include a weight column in your data file. Please refer to the ``weight_column`` `parameter <#weight_column>`__ above.
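When the weight column is given as an integer, the note on ``weight_column`` says the index starts from ``0`` and does not count the label column. The sketch below shows one reading of that rule; ``column_param_index`` is a hypothetical helper, and its handling of a label column that comes after the target column is an assumption, since the documented example only covers a label in column 0.

```python
def column_param_index(file_column: int, label_column: int = 0) -> int:
    """Convert a 0-based column position in the data file to the integer
    expected by parameters that do not count the label column.

    Hypothetical helper; the case file_column < label_column is an assumption.
    """
    if file_column == label_column:
        raise ValueError("the label column itself has no parameter index")
    # Columns after the label shift down by one; columns before it are unchanged.
    return file_column - 1 if file_column > label_column else file_column


# Documented example: label is column_0, weight is column_1 -> weight=0
print(f"weight={column_param_index(1, label_column=0)}")
```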
Query Data
~~~~~~~~~~
...
...
@@ -897,6 +904,6 @@ It means first ``27`` lines samples belong to one query and next ``18`` lines be
If the name of data file is ``train.txt``, the query file should be named as ``train.txt.query`` and placed in the same folder as the data file.
In this case LightGBM will load the query file automatically if it exists.
Also, you can include query/group id column in your data file. Please refer to parameter ``group`` in above.
Also, you can include a query/group id column in your data file. Please refer to the ``group_column`` `parameter <#group_column>`__ above.
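The query file stores the number of consecutive rows per query (for example ``27`` then ``18``). If the data already has a per-row query id, the counts can be derived as sketched below. ``query_sizes`` is a hypothetical helper for this example, and it assumes rows of the same query are adjacent in the data file.

```python
from itertools import groupby


def query_sizes(query_ids):
    """Collapse per-row query ids into counts of consecutive rows per query.

    Hypothetical helper; assumes rows of each query are contiguous.
    """
    return [sum(1 for _ in group) for _, group in groupby(query_ids)]


# 27 rows of one query followed by 18 rows of another (ids are placeholders).
row_query_ids = [101] * 27 + [102] * 18
sizes = query_sizes(row_query_ids)
print(sizes)

# One count per line, as it would appear in ``train.txt.query``.
print("\n".join(str(size) for size in sizes))
```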
main_desc='- ``{0}``, default = ``{1}``, type = {2}{3}{4}{5}'.format(name,default,param_type,options_str,aliases_str,checks_str)
main_desc='- ``{0}`` :raw-html:`<a id="{0}" title="Permalink to this parameter" href="#{0}">🔗︎</a>`, default = ``{1}``, type = {2}{3}{4}{5}'.format(name,default,param_type,options_str,aliases_str,checks_str)