Unverified commit cd6d0583, authored by Nikita Titov, committed by GitHub

various improvements around metric param and early_stopping_rounds param description (#1589)

* bring consistency and clearness into early_stopping_rounds desc, metric desc and implementation

* hotfix

* hotfix

* used NDCG as default metric for lambdarank task

* fixed missed methods at ReadTheDocs and changed default eval_metric

* leaved only unique metrics

* fixed comment
parent c77153a1
@@ -49,9 +49,10 @@ CVBooster <- R6Class(
 #' type str represents feature names
 #' @param early_stopping_rounds int
 #' Activates early stopping.
-#' Requires at least one validation data and one metric
-#' If there's more than one, will check all of them except the training data
-#' Returns the model with (best_iter + early_stopping_rounds)
+#' CV score needs to improve at least every early_stopping_rounds round(s) to continue.
+#' Requires at least one metric.
+#' If there's more than one, will check all of them.
+#' Returns the model with (best_iter + early_stopping_rounds).
 #' If early stopping occurs, the model will have 'best_iter' field
 #' @param callbacks list of callback functions
 #' List of callback functions that are applied at each iteration.
...
@@ -23,14 +23,16 @@
 #' type str represents feature names
 #' @param early_stopping_rounds int
 #' Activates early stopping.
-#' Requires at least one validation data and one metric
-#' If there's more than one, will check all of them except the training data
-#' Returns the model with (best_iter + early_stopping_rounds)
+#' The model will train until the validation score stops improving.
+#' Validation score needs to improve at least every early_stopping_rounds round(s) to continue training.
+#' Requires at least one validation data and one metric.
+#' If there's more than one, will check all of them. But the training data is ignored anyway.
+#' Returns the model with (best_iter + early_stopping_rounds).
 #' If early stopping occurs, the model will have 'best_iter' field
 #' @param reset_data Boolean, setting it to TRUE (not the default value) will transform the booster model into a predictor model which frees up memory and the original datasets
 #' @param callbacks list of callback functions
 #' List of callback functions that are applied at each iteration.
-#' @param ... other parameters, see Parameters.rst for more informations
+#' @param ... other parameters, see Parameters.rst for more information
 #'
 #' @return a trained booster model \code{lgb.Booster}.
 #'
...
@@ -67,9 +67,10 @@ type str represents feature names}
 \item{early_stopping_rounds}{int
 Activates early stopping.
-Requires at least one validation data and one metric
-If there's more than one, will check all of them except the training data
-Returns the model with (best_iter + early_stopping_rounds)
+CV score needs to improve at least every early_stopping_rounds round(s) to continue.
+Requires at least one metric.
+If there's more than one, will check all of them.
+Returns the model with (best_iter + early_stopping_rounds).
 If early stopping occurs, the model will have 'best_iter' field}
 \item{callbacks}{list of callback functions
@@ -127,9 +128,11 @@ type str represents feature names}
 \item{early_stopping_rounds}{int
 Activates early stopping.
-Requires at least one validation data and one metric
-If there's more than one, will check all of them except the training data
-Returns the model with (best_iter + early_stopping_rounds)
+The model will train until the validation score stops improving.
+Validation score needs to improve at least every early_stopping_rounds round(s) to continue training.
+Requires at least one validation data and one metric.
+If there's more than one, will check all of them. But the training data is ignored anyway.
+Returns the model with (best_iter + early_stopping_rounds).
 If early stopping occurs, the model will have 'best_iter' field}
 \item{callbacks}{list of callback functions
...
@@ -731,7 +731,7 @@ Metric Parameters
 - ``metric`` :raw-html:`<a id="metric" title="Permalink to this parameter" href="#metric">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = multi-enum, aliases: ``metrics``, ``metric_types``
 
-  - metric(s) to be evaluated on the evaluation sets **in addition** to what is provided in the training arguments
+  - metric(s) to be evaluated on the evaluation set(s)
 
   - ``""`` (empty string or not specified) means that metric corresponding to specified ``objective`` will be used (this is possible only for pre-defined objective functions, otherwise no evaluation metric will be added)
@@ -759,7 +759,7 @@ Metric Parameters
   - ``tweedie``, negative log-likelihood for **Tweedie** regression
 
-  - ``ndcg``, `NDCG <https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG>`__
+  - ``ndcg``, `NDCG <https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG>`__, aliases: ``lambdarank``
 
   - ``map``, `MAP <https://makarandtapaswi.wordpress.com/2012/07/02/intuition-behind-average-precision-and-map/>`__, aliases: ``mean_average_precision``
...
@@ -26,18 +26,22 @@ Scikit-learn API
 .. autoclass:: lightgbm.LGBMModel
     :members:
+    :inherited-members:
     :show-inheritance:
 
 .. autoclass:: lightgbm.LGBMClassifier
     :members:
+    :inherited-members:
     :show-inheritance:
 
 .. autoclass:: lightgbm.LGBMRegressor
     :members:
+    :inherited-members:
    :show-inheritance:
 
 .. autoclass:: lightgbm.LGBMRanker
     :members:
+    :inherited-members:
     :show-inheritance:
...
...@@ -193,13 +193,13 @@ Early stopping requires at least one set in ``valid_sets``. If there is more tha ...@@ -193,13 +193,13 @@ Early stopping requires at least one set in ``valid_sets``. If there is more tha
bst.save_model('model.txt', num_iteration=bst.best_iteration) bst.save_model('model.txt', num_iteration=bst.best_iteration)
The model will train until the validation score stops improving. The model will train until the validation score stops improving.
Validation error needs to improve at least every ``early_stopping_rounds`` to continue training. Validation score needs to improve at least every ``early_stopping_rounds`` to continue training.
If early stopping occurs, the model will have an additional field: ``bst.best_iteration``. If early stopping occurs, the model will have an additional field: ``bst.best_iteration``.
Note that ``train()`` will return a model from the best iteration. Note that ``train()`` will return a model from the best iteration.
This works with both metrics to minimize (L2, log loss, etc.) and to maximize (NDCG, AUC). This works with both metrics to minimize (L2, log loss, etc.) and to maximize (NDCG, AUC, etc.).
Note that if you specify more than one evaluation metric, all of them except the training data will be used for early stopping. Note that if you specify more than one evaluation metric, all of them will be used for early stopping.
Prediction Prediction
---------- ----------
......
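The stopping rule spelled out above can be sketched as a plain function. This is an illustrative simplification (single metric, hypothetical `best_iteration` helper), not LightGBM's actual implementation:

```python
def best_iteration(scores, early_stopping_rounds, maximize=True):
    """Return the 1-based best iteration under the early-stopping rule:
    training stops once the score has not improved for
    `early_stopping_rounds` consecutive rounds past the best one."""
    best_iter, best_score = 0, None
    for i, score in enumerate(scores):
        improved = (best_score is None
                    or (score > best_score if maximize else score < best_score))
        if improved:
            best_iter, best_score = i, score
        elif i - best_iter >= early_stopping_rounds:
            break  # stopped early; the model from best_iter is kept
    return best_iter + 1

# AUC-like scores (higher is better): no improvement after round 3,
# so training stops and round 3 is reported as the best iteration
print(best_iteration([0.70, 0.75, 0.80, 0.79, 0.78, 0.77], 2))  # -> 3
```

The same rule covers metrics to minimize by flipping the comparison, which is why the docs can promise it works for both L2-style and AUC-style metrics.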
@@ -666,7 +666,7 @@ public:
   // alias = metrics, metric_types
   // default = ""
   // type = multi-enum
-  // desc = metric(s) to be evaluated on the evaluation sets **in addition** to what is provided in the training arguments
+  // desc = metric(s) to be evaluated on the evaluation set(s)
   // descl2 = ``""`` (empty string or not specified) means that metric corresponding to specified ``objective`` will be used (this is possible only for pre-defined objective functions, otherwise no evaluation metric will be added)
   // descl2 = ``"None"`` (string, **not** a ``None`` value) means that no metric will be registered, aliases: ``na``, ``null``, ``custom``
   // descl2 = ``l1``, absolute loss, aliases: ``mean_absolute_error``, ``mae``, ``regression_l1``
@@ -680,7 +680,7 @@ public:
   // descl2 = ``gamma``, negative log-likelihood for **Gamma** regression
   // descl2 = ``gamma_deviance``, residual deviance for **Gamma** regression
   // descl2 = ``tweedie``, negative log-likelihood for **Tweedie** regression
-  // descl2 = ``ndcg``, `NDCG <https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG>`__
+  // descl2 = ``ndcg``, `NDCG <https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG>`__, aliases: ``lambdarank``
   // descl2 = ``map``, `MAP <https://makarandtapaswi.wordpress.com/2012/07/02/intuition-behind-average-precision-and-map/>`__, aliases: ``mean_average_precision``
   // descl2 = ``auc``, `AUC <https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_the_curve>`__
   // descl2 = ``binary_logloss``, `log loss <https://en.wikipedia.org/wiki/Cross_entropy>`__, aliases: ``binary``
...
@@ -153,8 +153,11 @@ def early_stopping(stopping_rounds, verbose=True):
     Note
     ----
     Activates early stopping.
+    The model will train until the validation score stops improving.
+    Validation score needs to improve at least every ``early_stopping_rounds`` round(s)
+    to continue training.
     Requires at least one validation data and one metric.
-    If there's more than one, will check all of them except the training data.
+    If there's more than one, will check all of them. But the training data is ignored anyway.
 
     Parameters
     ----------
...
@@ -60,7 +60,10 @@ def train(params, train_set, num_boost_round=100,
         All negative values in categorical features will be treated as missing values.
     early_stopping_rounds: int or None, optional (default=None)
         Activates early stopping. The model will train until the validation score stops improving.
-        Requires at least one validation data and one metric. If there's more than one, will check all of them except the training data.
+        Validation score needs to improve at least every ``early_stopping_rounds`` round(s)
+        to continue training.
+        Requires at least one validation data and one metric.
+        If there's more than one, will check all of them. But the training data is ignored anyway.
         If early stopping occurs, the model will add ``best_iteration`` field.
     evals_result: dict or None, optional (default=None)
         This dictionary used to store all evaluation results of all the items in ``valid_sets``.
@@ -363,8 +366,10 @@ def cv(params, train_set, num_boost_round=100,
         All values in categorical features should be less than int32 max value (2147483647).
         All negative values in categorical features will be treated as missing values.
     early_stopping_rounds: int or None, optional (default=None)
-        Activates early stopping. CV error needs to decrease at least
-        every ``early_stopping_rounds`` round(s) to continue.
+        Activates early stopping.
+        CV score needs to improve at least every ``early_stopping_rounds`` round(s)
+        to continue.
+        Requires at least one metric. If there's more than one, will check all of them.
         Last entry in evaluation history is the one from best iteration.
     fpreproc : callable or None, optional (default=None)
         Preprocessing function that takes (dtrain, dtest, params)
...
...@@ -11,7 +11,7 @@ from .compat import (SKLEARN_INSTALLED, _LGBMClassifierBase, ...@@ -11,7 +11,7 @@ from .compat import (SKLEARN_INSTALLED, _LGBMClassifierBase,
LGBMNotFittedError, _LGBMLabelEncoder, _LGBMModelBase, LGBMNotFittedError, _LGBMLabelEncoder, _LGBMModelBase,
_LGBMRegressorBase, _LGBMCheckXY, _LGBMCheckArray, _LGBMCheckConsistentLength, _LGBMRegressorBase, _LGBMCheckXY, _LGBMCheckArray, _LGBMCheckConsistentLength,
_LGBMCheckClassificationTargets, _LGBMComputeSampleWeight, _LGBMCheckClassificationTargets, _LGBMComputeSampleWeight,
argc_, range_, DataFrame, LGBMDeprecationWarning) argc_, range_, string_type, DataFrame, LGBMDeprecationWarning)
from .engine import train from .engine import train
...@@ -160,7 +160,7 @@ class LGBMModel(_LGBMModelBase): ...@@ -160,7 +160,7 @@ class LGBMModel(_LGBMModelBase):
objective : string, callable or None, optional (default=None) objective : string, callable or None, optional (default=None)
Specify the learning task and the corresponding learning objective or Specify the learning task and the corresponding learning objective or
a custom objective function to be used (see note below). a custom objective function to be used (see note below).
default: 'regression' for LGBMRegressor, 'binary' or 'multiclass' for LGBMClassifier, 'lambdarank' for LGBMRanker. Default: 'regression' for LGBMRegressor, 'binary' or 'multiclass' for LGBMClassifier, 'lambdarank' for LGBMRanker.
class_weight : dict, 'balanced' or None, optional (default=None) class_weight : dict, 'balanced' or None, optional (default=None)
Weights associated with classes in the form ``{class_label: weight}``. Weights associated with classes in the form ``{class_label: weight}``.
Use this parameter only for multi-class classification task; Use this parameter only for multi-class classification task;
...@@ -316,7 +316,7 @@ class LGBMModel(_LGBMModelBase): ...@@ -316,7 +316,7 @@ class LGBMModel(_LGBMModelBase):
group : array-like or None, optional (default=None) group : array-like or None, optional (default=None)
Group data of training data. Group data of training data.
eval_set : list or None, optional (default=None) eval_set : list or None, optional (default=None)
A list of (X, y) tuple pairs to use as a validation sets for early-stopping. A list of (X, y) tuple pairs to use as a validation sets.
eval_names : list of strings or None, optional (default=None) eval_names : list of strings or None, optional (default=None)
Names of eval_set. Names of eval_set.
eval_sample_weight : list of arrays or None, optional (default=None) eval_sample_weight : list of arrays or None, optional (default=None)
@@ -329,13 +329,15 @@ class LGBMModel(_LGBMModelBase):
         Group data of eval data.
     eval_metric : string, list of strings, callable or None, optional (default=None)
         If string, it should be a built-in evaluation metric to use.
-        If callable, it should be a custom evaluation metric, see note for more details.
+        If callable, it should be a custom evaluation metric, see note below for more details.
         In either case, the ``metric`` from the model parameters will be evaluated and used as well.
+        Default: 'l2' for LGBMRegressor, 'logloss' for LGBMClassifier, 'ndcg' for LGBMRanker.
     early_stopping_rounds : int or None, optional (default=None)
         Activates early stopping. The model will train until the validation score stops improving.
-        If there's more than one, will check all of them except the training data.
-        Validation error needs to decrease at least every ``early_stopping_rounds`` round(s)
+        Validation score needs to improve at least every ``early_stopping_rounds`` round(s)
         to continue training.
+        Requires at least one validation data and one metric.
+        If there's more than one, will check all of them. But the training data is ignored anyway.
     verbose : bool, optional (default=True)
         If True and an evaluation set is used, writes the evaluation progress.
     feature_name : list of strings or 'auto', optional (default="auto")
@@ -417,7 +419,24 @@ class LGBMModel(_LGBMModelBase):
             feval = _eval_function_wrapper(eval_metric)
         else:
             feval = None
-            params['metric'] = eval_metric
+        # register default metric for consistency with callable eval_metric case
+        original_metric = self._objective if isinstance(self._objective, string_type) else None
+        if original_metric is None:
+            # try to deduce from class instance
+            if isinstance(self, LGBMRegressor):
+                original_metric = "l2"
+            elif isinstance(self, LGBMClassifier):
+                original_metric = "multi_logloss" if self._n_classes > 2 else "binary_logloss"
+            elif isinstance(self, LGBMRanker):
+                original_metric = "ndcg"
+        # overwrite default metric by explicitly set metric
+        for metric_alias in ['metric', 'metrics', 'metric_types']:
+            if metric_alias in params:
+                original_metric = params.pop(metric_alias)
+        # concatenate metric from params (or default if not provided in params) and eval_metric
+        original_metric = [original_metric] if isinstance(original_metric, (string_type, type(None))) else original_metric
+        eval_metric = [eval_metric] if isinstance(eval_metric, (string_type, type(None))) else eval_metric
+        params['metric'] = set(original_metric + eval_metric)
 
         if not isinstance(X, DataFrame):
             X, y = _LGBMCheckXY(X, y, accept_sparse=True, force_all_finite=False, ensure_min_samples=2)
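The metric-handling block above can be summarized in isolation. The sketch below (`resolve_metrics` is a hypothetical standalone helper written for illustration, not part of lightgbm's API) reproduces the same three steps: start from a deduced default metric, let any `metric` alias present in `params` override it, then take the set union with `eval_metric` so only unique metrics remain:

```python
def resolve_metrics(params, eval_metric, default_metric):
    """Merge the model's metric parameter with eval_metric, mirroring
    the logic added to LGBMModel.fit in this commit."""
    original = default_metric
    # an explicitly set metric alias overwrites the deduced default
    for alias in ('metric', 'metrics', 'metric_types'):
        if alias in params:
            original = params.pop(alias)
    # normalize both sources to lists, then keep only unique metrics
    original = [original] if isinstance(original, (str, type(None))) else list(original)
    eval_metric = [eval_metric] if isinstance(eval_metric, (str, type(None))) else list(eval_metric)
    return set(original + eval_metric)

# 'metrics' alias in params overrides the default, eval_metric is appended
print(sorted(resolve_metrics({'metrics': 'auc'}, 'binary_logloss', 'binary_logloss')))
# -> ['auc', 'binary_logloss']
```

Using a `set` is what implements the "leaved only unique metrics" commit above: passing the same metric through both `params` and `eval_metric` no longer evaluates it twice.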
@@ -627,7 +646,7 @@ class LGBMRegressor(LGBMModel, _LGBMRegressorBase):
     def fit(self, X, y,
             sample_weight=None, init_score=None,
             eval_set=None, eval_names=None, eval_sample_weight=None,
-            eval_init_score=None, eval_metric="l2", early_stopping_rounds=None,
+            eval_init_score=None, eval_metric=None, early_stopping_rounds=None,
             verbose=True, feature_name='auto', categorical_feature='auto', callbacks=None):
 
         super(LGBMRegressor, self).fit(X, y, sample_weight=sample_weight,
@@ -645,10 +664,6 @@ class LGBMRegressor(LGBMModel, _LGBMRegressorBase):
     _base_doc = LGBMModel.fit.__doc__
     fit.__doc__ = (_base_doc[:_base_doc.find('eval_class_weight :')]
                    + _base_doc[_base_doc.find('eval_init_score :'):])
-    _base_doc = fit.__doc__
-    fit.__doc__ = (_base_doc[:_base_doc.find('eval_metric :')]
-                   + 'eval_metric : string, list of strings, callable or None, optional (default="l2")\n'
-                   + _base_doc[_base_doc.find(' If string, it should be a built-in evaluation metric to use.'):])
 
 
 class LGBMClassifier(LGBMModel, _LGBMClassifierBase):
@@ -657,7 +672,7 @@ class LGBMClassifier(LGBMModel, _LGBMClassifierBase):
     def fit(self, X, y,
             sample_weight=None, init_score=None,
             eval_set=None, eval_names=None, eval_sample_weight=None,
-            eval_class_weight=None, eval_init_score=None, eval_metric="logloss",
+            eval_class_weight=None, eval_init_score=None, eval_metric=None,
             early_stopping_rounds=None, verbose=True,
             feature_name='auto', categorical_feature='auto', callbacks=None):
         _LGBMCheckClassificationTargets(y)
@@ -703,10 +718,7 @@ class LGBMClassifier(LGBMModel, _LGBMClassifierBase):
                            callbacks=callbacks)
         return self
 
-    _base_doc = LGBMModel.fit.__doc__
-    fit.__doc__ = (_base_doc[:_base_doc.find('eval_metric :')]
-                   + 'eval_metric : string, list of strings, callable or None, optional (default="logloss")\n'
-                   + _base_doc[_base_doc.find(' If string, it should be a built-in evaluation metric to use.'):])
+    fit.__doc__ = LGBMModel.fit.__doc__
 
     def predict(self, X, raw_score=False, num_iteration=None,
                 pred_leaf=False, pred_contrib=False, **kwargs):
@@ -718,6 +730,8 @@ class LGBMClassifier(LGBMModel, _LGBMClassifierBase):
             class_index = np.argmax(result, axis=1)
             return self._le.inverse_transform(class_index)
 
+    predict.__doc__ = LGBMModel.predict.__doc__
+
     def predict_proba(self, X, raw_score=False, num_iteration=None,
                       pred_leaf=False, pred_contrib=False, **kwargs):
         """Return the predicted probability for each class for each sample.
@@ -775,7 +789,7 @@ class LGBMRanker(LGBMModel):
     def fit(self, X, y,
             sample_weight=None, init_score=None, group=None,
             eval_set=None, eval_names=None, eval_sample_weight=None,
-            eval_init_score=None, eval_group=None, eval_metric='ndcg',
+            eval_init_score=None, eval_group=None, eval_metric=None,
             eval_at=[1], early_stopping_rounds=None, verbose=True,
             feature_name='auto', categorical_feature='auto', callbacks=None):
         # check group data
@@ -809,9 +823,8 @@ class LGBMRanker(LGBMModel):
     fit.__doc__ = (_base_doc[:_base_doc.find('eval_class_weight :')]
                    + _base_doc[_base_doc.find('eval_init_score :'):])
     _base_doc = fit.__doc__
-    fit.__doc__ = (_base_doc[:_base_doc.find('eval_metric :')]
-                   + 'eval_metric : string, list of strings, callable or None, optional (default="ndcg")\n'
-                   + _base_doc[_base_doc.find(' If string, it should be a built-in evaluation metric to use.'):_base_doc.find('early_stopping_rounds :')]
+    _before_early_stop, _early_stop, _after_early_stop = _base_doc.partition('early_stopping_rounds :')
+    fit.__doc__ = (_before_early_stop
                    + 'eval_at : list of int, optional (default=[1])\n'
-                   + ' The evaluation positions of the specified metric.\n'
-                   + _base_doc[_base_doc.find(' early_stopping_rounds :'):])
+                   + ' ' * 12 + 'The evaluation positions of the specified metric.\n'
+                   + ' ' * 8 + _early_stop + _after_early_stop)
...
@@ -29,7 +29,7 @@ Metric* Metric::CreateMetric(const std::string& type, const Config& config) {
     return new BinaryErrorMetric(config);
   } else if (type == std::string("auc")) {
     return new AUCMetric(config);
-  } else if (type == std::string("ndcg")) {
+  } else if (type == std::string("ndcg") || type == std::string("lambdarank")) {
     return new NDCGMetric(config);
   } else if (type == std::string("map") || type == std::string("mean_average_precision")) {
     return new MapMetric(config);
...
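For reference, the NDCG metric that now also answers to the ``lambdarank`` alias follows the linked definition. A minimal self-contained sketch (the `2^rel - 1` gain variant; function names are illustrative, not LightGBM's API):

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain: sum of (2^rel - 1) / log2(i + 2)
    over the first k positions of the given ranking."""
    return sum((2 ** rel - 1) / math.log2(i + 2)
               for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """NDCG = DCG of the ranking / DCG of the ideal (sorted) ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 1.0

# relevance labels in predicted order; a perfect ordering scores 1.0
print(round(ndcg_at_k([3, 2, 3, 0, 1, 2], 6), 4))  # -> 0.9488
```

Because NDCG is normalized to [0, 1] with higher being better, it is one of the metrics that early stopping treats as a score to maximize.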
@@ -40,7 +40,7 @@ ObjectiveFunction* ObjectiveFunction::CreateObjectiveFunction(const std::string&
     return new RegressionGammaLoss(config);
   } else if (type == std::string("tweedie")) {
     return new RegressionTweedieLoss(config);
-  } else if (type == std::string("none") || type == std::string("null") || type == std::string("custom")) {
+  } else if (type == std::string("none") || type == std::string("null") || type == std::string("custom") || type == std::string("na")) {
     return nullptr;
   }
   Log::Fatal("Unknown objective type name: %s", type.c_str());
@@ -79,7 +79,7 @@ ObjectiveFunction* ObjectiveFunction::CreateObjectiveFunction(const std::string&
     return new RegressionGammaLoss(strs);
   } else if (type == std::string("tweedie")) {
     return new RegressionTweedieLoss(strs);
-  } else if (type == std::string("none") || type == std::string("null") || type == std::string("custom")) {
+  } else if (type == std::string("none") || type == std::string("null") || type == std::string("custom") || type == std::string("na")) {
     return nullptr;
   }
   Log::Fatal("Unknown objective type name: %s", type.c_str());
...