Commit e5eb8560 authored by Nikita Titov, committed by Guolin Ke

[python] [docs] fixed objective in sklearn wrapper; added missed objectives & metrics to docs (#1059)

* added missed aliases for task parameter

* fixed indents

* added missed aliases and options for tree_learner parameter

* added missed objectives to docs

* fixed typo in Poisson parameter and its description

* fixed model_format parameter description

* added missed metrics to docs

* fixed sklearn objective

* fixed set_params

* fixed docs

* added missed options to objectives

* added note about ignore_column (#1061)
parent 3d65d065
@@ -39,22 +39,22 @@ Core Parameters
   - path of config file
-- ``task``, default=\ ``train``, type=enum, options=\ ``train``, ``prediction``
+- ``task``, default=\ ``train``, type=enum, options=\ ``train``, ``predict``, ``convert_model``
-  - ``train`` for training
+  - ``train``, alias=\ ``training``, for training
-  - ``prediction`` for prediction.
+  - ``predict``, alias=\ ``prediction``, ``test``, for prediction.
-  - ``convert_model`` for converting model file into if-else format, see more information in `Convert model parameters <#convert-model-parameters>`__
+  - ``convert_model``, for converting model file into if-else format, see more information in `Convert model parameters <#convert-model-parameters>`__
 - ``application``, default=\ ``regression``, type=enum,
-  options=\ ``regression``, ``regression_l2``, ``regression_l1``, ``huber``, ``fair``, ``poisson``, ``quantile``, ``quantile_l2``,
-  ``binary``, ``lambdarank``, ``multiclass``,
+  options=\ ``regression``, ``regression_l1``, ``huber``, ``fair``, ``poisson``, ``quantile``, ``quantile_l2``,
+  ``binary``, ``multiclass``, ``multiclassova``, ``xentropy``, ``xentlambda``, ``lambdarank``,
   alias=\ ``objective``, ``app``
-  - ``regression``, regression application
+  - regression application
-  - ``regression_l2``, L2 loss, alias=\ ``mean_squared_error``, ``mse``
+  - ``regression_l2``, L2 loss, alias=\ ``regression``, ``mean_squared_error``, ``mse``
   - ``regression_l1``, L1 loss, alias=\ ``mean_absolute_error``, ``mae``
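Read as a diff, the hunk above renames the ``prediction`` task to ``predict`` and gives every task an alias. The resulting lookup can be sketched in a few lines of Python; this is a hypothetical helper for illustration only, since LightGBM's real alias handling lives in its C++ config parser:

```python
# Hypothetical sketch of the ``task`` aliases documented in the hunk above.
TASK_ALIASES = {
    "train": "train",
    "training": "train",        # alias added in this change
    "predict": "predict",
    "prediction": "predict",    # kept as an alias of the renamed task
    "test": "predict",          # alias added in this change
    "convert_model": "convert_model",
}


def resolve_task(value):
    """Map a user-supplied task string to its canonical name."""
    try:
        return TASK_ALIASES[value]
    except KeyError:
        raise ValueError("unknown task: %s" % value)
```

With this table, a config written before the rename (``task=prediction``) still resolves to the new canonical ``predict`` task.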
@@ -68,7 +68,21 @@ Core Parameters
   - ``quantile_l2``, like the ``quantile``, but L2 loss is used instead
-  - ``binary``, binary classification application
+  - ``binary``, binary `log loss`_ classification application
+  - multi-class classification application
+  - ``multiclass``, `softmax`_ objective function, ``num_class`` should be set as well
+  - ``multiclassova``, `One-vs-All`_ binary objective function, ``num_class`` should be set as well
+  - cross-entropy application
+  - ``xentropy``, objective function for cross-entropy (with optional linear weights), alias=\ ``cross_entropy``
+  - ``xentlambda``, alternative parameterization of cross-entropy, alias=\ ``cross_entropy_lambda``
+  - the label is anything in interval [0, 1]
   - ``lambdarank``, `lambdarank`_ application
@@ -76,8 +90,6 @@ Core Parameters
   - ``label_gain`` can be used to set the gain(weight) of ``int`` label
-  - ``multiclass``, multi-class classification application, ``num_class`` should be set as well
 - ``boosting``, default=\ ``gbdt``, type=enum,
   options=\ ``gbdt``, ``rf``, ``dart``, ``goss``,
   alias=\ ``boost``, ``boosting_type``
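The objective docs above state twice that ``multiclass`` and ``multiclassova`` require ``num_class``. A minimal sketch of that constraint as a parameter check (a hypothetical helper, not LightGBM's actual validation code):

```python
def check_objective(params):
    """Hypothetical validation of the objectives documented above:
    ``multiclass`` and ``multiclassova`` require ``num_class`` to be set.

    ``params`` is a plain dict of LightGBM-style parameters.
    """
    objective = params.get("objective", "regression")  # documented default
    if objective in ("multiclass", "multiclassova") and "num_class" not in params:
        raise ValueError("%s requires num_class to be set" % objective)
    return objective
```

A call like `check_objective({"objective": "multiclass"})` would fail fast, mirroring the error LightGBM itself raises later during training.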
@@ -120,13 +132,15 @@ Core Parameters
   - number of leaves in one tree
-- ``tree_learner``, default=\ ``serial``, type=enum, options=\ ``serial``, ``feature``, ``data``, alias=\ ``tree``
+- ``tree_learner``, default=\ ``serial``, type=enum, options=\ ``serial``, ``feature``, ``data``, ``voting``, alias=\ ``tree``
   - ``serial``, single machine tree learner
-  - ``feature``, feature parallel tree learner
+  - ``feature``, alias=\ ``feature_parallel``, feature parallel tree learner
-  - ``data``, data parallel tree learner
+  - ``data``, alias=\ ``data_parallel``, data parallel tree learner
+  - ``voting``, alias=\ ``voting_parallel``, voting parallel tree learner
   - refer to `Parallel Learning Guide <./Parallel-Learning-Guide.rst>`__ to get more details
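The ``tree_learner`` options and aliases added above can be sketched as a small normalizer (hypothetical code; the real resolution happens in LightGBM's config layer):

```python
# Aliases for ``tree_learner`` as documented in the hunk above.
TREE_LEARNER_ALIASES = {
    "feature_parallel": "feature",
    "data_parallel": "data",
    "voting_parallel": "voting",
}


def resolve_tree_learner(value):
    """Normalize a ``tree_learner`` value to its canonical option name,
    rejecting anything outside the documented option set."""
    canonical = TREE_LEARNER_ALIASES.get(value, value)
    if canonical not in ("serial", "feature", "data", "voting"):
        raise ValueError("unknown tree_learner: %s" % value)
    return canonical
```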
@@ -321,7 +335,7 @@ IO Parameters
   - file name of prediction result in ``prediction`` task
-- ``model_format``, default=\ ``text``, type=string
+- ``model_format``, default=\ ``text``, type=multi-enum, options=\ ``text``, ``proto``
   - format to save and load model
@@ -406,6 +420,8 @@ IO Parameters
   - add a prefix ``name:`` for column name, e.g. ``ignore_column=name:c1,c2,c3`` means c1, c2 and c3 will be ignored
+  - **Note**: works only in CLI-version
+  - **Note**: index starts from ``0``. And it doesn't count the label column
 - ``categorical_feature``, default=\ ``""``, type=string, alias=\ ``categorical_column``, ``cat_feature``, ``cat_column``
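The two notes added above pin down how ``ignore_column`` is interpreted: a ``name:`` prefix selects columns by header name, while plain values are 0-based indices that skip the label column. A small sketch of that interpretation (hypothetical helper, written only to illustrate the indexing rule):

```python
def columns_to_ignore(header, ignore_column, label_name="label"):
    """Resolve an ``ignore_column`` value against a CSV header.

    ``name:`` prefix -> match by column name; otherwise 0-based indices
    counted over the non-label columns, per the notes above.
    """
    if ignore_column.startswith("name:"):
        return ignore_column[len("name:"):].split(",")
    features = [c for c in header if c != label_name]  # label is not counted
    return [features[int(i)] for i in ignore_column.split(",")]
```

For a file with header ``label,c1,c2,c3``, index ``0`` therefore refers to ``c1``, not to ``label``.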
@@ -507,9 +523,9 @@ Objective Parameters
   - parameter to control the width of Gaussian function. Will be used in ``regression_l1`` and ``huber`` losses
-- ``poission_max_delta_step``, default=\ ``0.7``, type=double
+- ``poisson_max_delta_step``, default=\ ``0.7``, type=double
-  - parameter used to safeguard optimization
+  - parameter for `Poisson regression`_ to safeguard optimization
 - ``scale_pos_weight``, default=\ ``1.0``, type=double
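For context on what the renamed ``poisson_max_delta_step`` safeguards: in gradient-boosted Poisson regression (the same scheme XGBoost uses for its analogous ``max_delta_step``), the constant is commonly added to the raw score inside the hessian, which inflates the second-order term and caps the size of each leaf update. A sketch under that assumption; this is not lifted from LightGBM's source:

```python
import math


def poisson_grad_hess(score, label, max_delta_step=0.7):
    """Gradient/hessian of the Poisson negative log-likelihood on a raw
    score, with ``max_delta_step`` inflating the hessian as a safeguard
    (assumed to mirror the common XGBoost-style formulation)."""
    grad = math.exp(score) - label
    hess = math.exp(score + max_delta_step)
    return grad, hess
```

Because the hessian is multiplied by ``exp(max_delta_step) > 1``, the Newton step ``-grad / hess`` shrinks, keeping optimization stable when ``exp(score)`` is far from the label.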
@@ -579,13 +595,18 @@ Metric Parameters
   - ``binary_logloss``, `log loss`_
-  - ``binary_error``.
-    For one sample: ``0`` for correct classification, ``1`` for error classification
+  - ``binary_error``, for one sample: ``0`` for correct classification, ``1`` for error classification
   - ``multi_logloss``, log loss for multi-class classification
   - ``multi_error``, error rate for multi-class classification
+  - ``xentropy``, cross-entropy (with optional linear weights), alias=\ ``cross_entropy``
+  - ``xentlambda``, "intensity-weighted" cross-entropy, alias=\ ``cross_entropy_lambda``
+  - ``kldiv``, `Kullback-Leibler divergence`_, alias=\ ``kullback_leibler``
   - support multi metrics, separated by ``,``
 - ``metric_freq``, default=\ ``1``, type=int
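The docs above say multiple metrics are passed as one comma-separated string, and this change adds aliases for the three new metrics. A sketch combining both (hypothetical parsing helper; LightGBM does this internally):

```python
# Metric aliases introduced by this change, per the doc lines above.
METRIC_ALIASES = {
    "cross_entropy": "xentropy",
    "cross_entropy_lambda": "xentlambda",
    "kullback_leibler": "kldiv",
}


def parse_metrics(metric_value):
    """Split a multi-metric string like "l2,binary_logloss" on commas
    and fold the documented aliases into canonical metric names."""
    names = [m.strip() for m in metric_value.split(",") if m.strip()]
    return [METRIC_ALIASES.get(n, n) for n in names]
```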
@@ -749,3 +770,9 @@ You can specify query/group id in data file now. Please refer to parameter ``gr
 .. _AUC: https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_the_curve
 .. _log loss: https://www.kaggle.com/wiki/LogLoss
+.. _softmax: https://en.wikipedia.org/wiki/Softmax_function
+.. _One-vs-All: https://en.wikipedia.org/wiki/Multiclass_classification#One-vs.-rest
+.. _Kullback-Leibler divergence: https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence
@@ -163,7 +163,7 @@ class LGBMModel(_LGBMModelBase):
     objective : string, callable or None, optional (default=None)
         Specify the learning task and the corresponding learning objective or
         a custom objective function to be used (see note below).
-        default: 'binary' for LGBMClassifier, 'lambdarank' for LGBMRanker.
+        default: 'regression' for LGBMRegressor, 'binary' or 'multiclass' for LGBMClassifier, 'lambdarank' for LGBMRanker.
     min_split_gain : float, optional (default=0.)
         Minimum loss reduction required to make a further partition on a leaf node of the tree.
     min_child_weight : float, optional (default=1e-3)
@@ -264,7 +264,7 @@ class LGBMModel(_LGBMModelBase):
         self._best_score = None
         self._best_iteration = None
         self._other_params = {}
-        self._objective = None
+        self._objective = objective
         self._n_features = None
         self._classes = None
         self._n_classes = None
@@ -285,6 +285,8 @@ class LGBMModel(_LGBMModelBase):
     def set_params(self, **params):
         for key, value in params.items():
             setattr(self, key, value)
+            if hasattr(self, '_' + key):
+                setattr(self, '_' + key, value)
             self._other_params[key] = value
         return self
@@ -370,8 +372,6 @@ class LGBMModel(_LGBMModelBase):
         For multi-class task, the y_pred is group by class_id first, then group by row_id.
         If you want to get i-th row y_pred in j-th class, the access way is y_pred[j * num_data + i].
         """
-        if not hasattr(self, '_objective'):
-            self._objective = self.objective
         if self._objective is None:
             if isinstance(self, LGBMRegressor):
                 self._objective = "regression"
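The removed ``hasattr`` guard is unnecessary now that ``__init__`` stores the objective directly; what remains is the per-estimator default resolution. It can be sketched without the sklearn class hierarchy by keying on a plain string instead of ``isinstance`` checks (a hypothetical simplification of the logic above):

```python
def default_objective(estimator_kind, objective=None):
    """Sketch of the default-objective resolution above: an explicit
    objective wins; otherwise the default depends on the estimator kind
    ('regressor', 'classifier', or 'ranker')."""
    if objective is not None:
        return objective
    defaults = {
        "regressor": "regression",
        "classifier": "binary",     # may later switch to 'multiclass'
        "ranker": "lambdarank",
    }
    return defaults[estimator_kind]
```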
@@ -633,6 +633,7 @@ class LGBMClassifier(LGBMModel, _LGBMClassifierBase):
         self._n_classes = len(self._classes)
         if self._n_classes > 2:
             # Switch to using a multiclass objective in the underlying LGBM instance
-            self._objective = "multiclass"
+            if self._objective != "multiclassova" and not callable(self._objective):
+                self._objective = "multiclass"
             if eval_metric == 'logloss' or eval_metric == 'binary_logloss':
                 eval_metric = "multi_logloss"
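The classifier fix above stops the wrapper from clobbering a user-chosen ``multiclassova`` or custom callable objective when more than two classes are found. Extracted into a standalone function (a hypothetical sketch of that branch, not the wrapper itself):

```python
def resolve_classifier_objective(n_classes, objective, eval_metric=None):
    """Sketch of the multiclass switch above: with more than two classes,
    fall back to 'multiclass' unless the user chose 'multiclassova' or a
    custom callable; remap binary log loss to its multiclass variant."""
    if n_classes > 2:
        if objective != "multiclassova" and not callable(objective):
            objective = "multiclass"
        if eval_metric in ("logloss", "binary_logloss"):
            eval_metric = "multi_logloss"
    return objective, eval_metric
```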
@@ -39,7 +39,7 @@ Metric* Metric::CreateMetric(const std::string& type, const MetricConfig& config
     return new MultiErrorMetric(config);
   } else if (type == std::string("xentropy") || type == std::string("cross_entropy")) {
     return new CrossEntropyMetric(config);
-  } else if (type == std::string("xentlambda")) {
+  } else if (type == std::string("xentlambda") || type == std::string("cross_entropy_lambda")) {
     return new CrossEntropyLambdaMetric(config);
   } else if (type == std::string("kldiv") || type == std::string("kullback_leibler")) {
     return new KullbackLeiblerDivergence(config);
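The C++ hunk above wires the documented ``cross_entropy_lambda`` alias into the ``Metric::CreateMetric`` factory. The same dispatch can be mirrored in Python as a lookup from (aliased) metric name to the metric class it constructs, which makes the one-line change easy to verify at a glance:

```python
def create_metric_name(metric_type):
    """Python mirror of the ``Metric::CreateMetric`` dispatch above,
    returning the C++ class name each metric string resolves to
    (None for types this sketch does not cover)."""
    table = {
        "xentropy": "CrossEntropyMetric",
        "cross_entropy": "CrossEntropyMetric",
        "xentlambda": "CrossEntropyLambdaMetric",
        "cross_entropy_lambda": "CrossEntropyLambdaMetric",  # alias added here
        "kldiv": "KullbackLeiblerDivergence",
        "kullback_leibler": "KullbackLeiblerDivergence",
    }
    return table.get(metric_type)
```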