[docs][python] update some docs related to custom objective (#4245)

Commit 1a367c65, authored May 02, 2021 by Nikita Titov, committed by GitHub May 02, 2021
3 changed files
--- a/python-package/lightgbm/basic.py
+++ b/python-package/lightgbm/basic.py
@@ -2600,14 +2600,17 @@ class Booster:

                preds : list or numpy 1-D array
                    The predicted values.
+                    Predicted values are returned before any transformation,
+                    e.g. they are raw margin instead of probability of positive class for binary task.
                train_data : Dataset
                    The training dataset.
                grad : list or numpy 1-D array
-                    The value of the first order derivative (gradient) for each sample point.
+                    The value of the first order derivative (gradient) of the loss
+                    with respect to the elements of preds for each sample point.
                hess : list or numpy 1-D array
-                    The value of the second order derivative (Hessian) for each sample point.
+                    The value of the second order derivative (Hessian) of the loss
+                    with respect to the elements of preds for each sample point.

-            For binary task, the preds is probability of positive class (or margin in case of specified ``fobj``).
            For multi-class task, the preds is group by class_id first, then group by row_id.
            If you want to get i-th row preds in j-th class, the access way is score[j * num_data + i]
            and you should group grad and hess in this way as well.
@@ -2656,7 +2659,8 @@ class Booster:

        .. note::

-            For binary task, the score is probability of positive class (or margin in case of custom objective).
+            Score is returned before any transformation,
+            e.g. it is raw margin instead of probability of positive class for binary task.
            For multi-class task, the score is group by class_id first, then group by row_id.
            If you want to get i-th row score in j-th class, the access way is score[j * num_data + i]
            and you should group grad and hess in this way as well.
@@ -2664,9 +2668,11 @@ class Booster:
        Parameters
        ----------
        grad : list or numpy 1-D array
-            The first order derivative (gradient).
+            The value of the first order derivative (gradient) of the loss
+            with respect to the elements of score for each sample point.
        hess : list or numpy 1-D array
-            The second order derivative (Hessian).
+            The value of the second order derivative (Hessian) of the loss
+            with respect to the elements of score for each sample point.

        Returns
        -------
@@ -2788,6 +2794,8 @@ class Booster:

                preds : list or numpy 1-D array
                    The predicted values.
+                    If ``fobj`` is specified, predicted values are returned before any transformation,
+                    e.g. they are raw margin instead of probability of positive class for binary task in this case.
                eval_data : Dataset
                    The evaluation dataset.
                eval_name : string
@@ -2797,7 +2805,6 @@ class Booster:
                is_higher_better : bool
                    Is eval result higher better, e.g. AUC is ``is_higher_better``.

-            For binary task, the preds is probability of positive class (or margin in case of specified ``fobj``).
            For multi-class task, the preds is group by class_id first, then group by row_id.
            If you want to get i-th row preds in j-th class, the access way is preds[j * num_data + i].

@@ -2835,6 +2842,8 @@ class Booster:

                preds : list or numpy 1-D array
                    The predicted values.
+                    If ``fobj`` is specified, predicted values are returned before any transformation,
+                    e.g. they are raw margin instead of probability of positive class for binary task in this case.
                train_data : Dataset
                    The training dataset.
                eval_name : string
@@ -2844,7 +2853,6 @@ class Booster:
                is_higher_better : bool
                    Is eval result higher better, e.g. AUC is ``is_higher_better``.

-            For binary task, the preds is probability of positive class (or margin in case of specified ``fobj``).
            For multi-class task, the preds is group by class_id first, then group by row_id.
            If you want to get i-th row preds in j-th class, the access way is preds[j * num_data + i].

@@ -2867,6 +2875,8 @@ class Booster:

                preds : list or numpy 1-D array
                    The predicted values.
+                    If ``fobj`` is specified, predicted values are returned before any transformation,
+                    e.g. they are raw margin instead of probability of positive class for binary task in this case.
                valid_data : Dataset
                    The validation dataset.
                eval_name : string
@@ -2876,7 +2886,6 @@ class Booster:
                is_higher_better : bool
                    Is eval result higher better, e.g. AUC is ``is_higher_better``.

-            For binary task, the preds is probability of positive class (or margin in case of specified ``fobj``).
            For multi-class task, the preds is group by class_id first, then group by row_id.
            If you want to get i-th row preds in j-th class, the access way is preds[j * num_data + i].


--- a/python-package/lightgbm/engine.py
+++ b/python-package/lightgbm/engine.py
@@ -39,14 +39,17 @@ def train(params, train_set, num_boost_round=100,

            preds : list or numpy 1-D array
                The predicted values.
+                Predicted values are returned before any transformation,
+                e.g. they are raw margin instead of probability of positive class for binary task.
            train_data : Dataset
                The training dataset.
            grad : list or numpy 1-D array
-                The value of the first order derivative (gradient) for each sample point.
+                The value of the first order derivative (gradient) of the loss
+                with respect to the elements of preds for each sample point.
            hess : list or numpy 1-D array
-                The value of the second order derivative (Hessian) for each sample point.
+                The value of the second order derivative (Hessian) of the loss
+                with respect to the elements of preds for each sample point.

-        For binary task, the preds is margin.
        For multi-class task, the preds is group by class_id first, then group by row_id.
        If you want to get i-th row preds in j-th class, the access way is score[j * num_data + i]
        and you should group grad and hess in this way as well.
@@ -58,6 +61,8 @@ def train(params, train_set, num_boost_round=100,

            preds : list or numpy 1-D array
                The predicted values.
+                If ``fobj`` is specified, predicted values are returned before any transformation,
+                e.g. they are raw margin instead of probability of positive class for binary task in this case.
            train_data : Dataset
                The training dataset.
            eval_name : string
@@ -67,7 +72,6 @@ def train(params, train_set, num_boost_round=100,
            is_higher_better : bool
                Is eval result higher better, e.g. AUC is ``is_higher_better``.

-        For binary task, the preds is probability of positive class (or margin in case of specified ``fobj``).
        For multi-class task, the preds is group by class_id first, then group by row_id.
        If you want to get i-th row preds in j-th class, the access way is preds[j * num_data + i].
        To ignore the default metric corresponding to the used objective,
@@ -428,14 +432,17 @@ def cv(params, train_set, num_boost_round=100,

            preds : list or numpy 1-D array
                The predicted values.
+                Predicted values are returned before any transformation,
+                e.g. they are raw margin instead of probability of positive class for binary task.
            train_data : Dataset
                The training dataset.
            grad : list or numpy 1-D array
-                The value of the first order derivative (gradient) for each sample point.
+                The value of the first order derivative (gradient) of the loss
+                with respect to the elements of preds for each sample point.
            hess : list or numpy 1-D array
-                The value of the second order derivative (Hessian) for each sample point.
+                The value of the second order derivative (Hessian) of the loss
+                with respect to the elements of preds for each sample point.

-        For binary task, the preds is margin.
        For multi-class task, the preds is group by class_id first, then group by row_id.
        If you want to get i-th row preds in j-th class, the access way is score[j * num_data + i]
        and you should group grad and hess in this way as well.
@@ -447,6 +454,8 @@ def cv(params, train_set, num_boost_round=100,

            preds : list or numpy 1-D array
                The predicted values.
+                If ``fobj`` is specified, predicted values are returned before any transformation,
+                e.g. they are raw margin instead of probability of positive class for binary task in this case.
            train_data : Dataset
                The training dataset.
            eval_name : string
@@ -456,7 +465,6 @@ def cv(params, train_set, num_boost_round=100,
            is_higher_better : bool
                Is eval result higher better, e.g. AUC is ``is_higher_better``.

-        For binary task, the preds is probability of positive class (or margin in case of specified ``fobj``).
        For multi-class task, the preds is group by class_id first, then group by row_id.
        If you want to get i-th row preds in j-th class, the access way is preds[j * num_data + i].
        To ignore the default metric corresponding to the used objective,
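
As the `feval` docstrings above note, once `fobj` is specified the evaluation function also receives untransformed predictions. A hedged sketch of a binary error metric that accounts for this is shown below; it assumes a custom objective is in use, so it applies the sigmoid itself. `FakeDataset` and `binary_error_from_margin` are hypothetical names used only for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binary_error_from_margin(preds, eval_data):
    """Custom metric with the feval signature.

    Returns (eval_name, eval_result, is_higher_better).
    Assumes preds are raw margins because a custom objective is used.
    """
    labels = eval_data.get_label()
    probs = [sigmoid(m) for m in preds]
    wrong = sum(1 for p, y in zip(probs, labels) if (p > 0.5) != (y == 1))
    return "binary_error", wrong / len(labels), False


class FakeDataset:
    """Hypothetical stand-in exposing only get_label(), for illustration."""

    def __init__(self, label):
        self._label = label

    def get_label(self):
        return self._label


name, value, higher_better = binary_error_from_margin(
    [2.0, -1.0, 0.5, -3.0], FakeDataset([1, 0, 0, 1])
)
```

With a built-in objective the same metric would skip the sigmoid, since `preds` would already be probabilities in that case.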

--- a/python-package/lightgbm/sklearn.py
+++ b/python-package/lightgbm/sklearn.py
@@ -32,6 +32,8 @@ class _ObjectiveFunctionWrapper:
                    The target values.
                y_pred : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
                    The predicted values.
+                    Predicted values are returned before any transformation,
+                    e.g. they are raw margin instead of probability of positive class for binary task.
                group : array-like
                    Group/query data.
                    Only used in the learning-to-rank task.
@@ -39,13 +41,14 @@ class _ObjectiveFunctionWrapper:
                    For example, if you have a 100-document dataset with ``group = [10, 20, 40, 10, 10, 10]``, that means that you have 6 groups,
                    where the first 10 records are in the first group, records 11-30 are in the second group, records 31-70 are in the third group, etc.
                grad : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
-                    The value of the first order derivative (gradient) for each sample point.
+                    The value of the first order derivative (gradient) of the loss
+                    with respect to the elements of y_pred for each sample point.
                hess : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
-                    The value of the second order derivative (Hessian) for each sample point.
+                    The value of the second order derivative (Hessian) of the loss
+                    with respect to the elements of y_pred for each sample point.

        .. note::

-            For binary task, the y_pred is margin.
            For multi-class task, the y_pred is group by class_id first, then group by row_id.
            If you want to get i-th row y_pred in j-th class, the access way is y_pred[j * num_data + i]
            and you should group grad and hess in this way as well.
@@ -65,9 +68,11 @@ class _ObjectiveFunctionWrapper:
        Returns
        -------
        grad : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
-            The value of the first order derivative (gradient) for each sample point.
+            The value of the first order derivative (gradient) of the loss
+            with respect to the elements of preds for each sample point.
        hess : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
-            The value of the second order derivative (Hessian) for each sample point.
+            The value of the second order derivative (Hessian) of the loss
+            with respect to the elements of preds for each sample point.
        """
        labels = dataset.get_label()
        argc = len(signature(self.func).parameters)
@@ -120,6 +125,8 @@ class _EvalFunctionWrapper:
                    The target values.
                y_pred : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
                    The predicted values.
+                    In case of custom ``objective``, predicted values are returned before any transformation,
+                    e.g. they are raw margin instead of probability of positive class for binary task in this case.
                weight : array-like of shape = [n_samples]
                    The weight of samples.
                group : array-like
@@ -137,7 +144,6 @@ class _EvalFunctionWrapper:

        .. note::

-            For binary task, the y_pred is probability of positive class (or margin in case of custom ``objective``).
            For multi-class task, the y_pred is group by class_id first, then group by row_id.
            If you want to get i-th row y_pred in j-th class, the access way is y_pred[j * num_data + i].
        """
@@ -272,6 +278,8 @@ _lgbmmodel_doc_custom_eval_note = """
            The target values.
        y_pred : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
            The predicted values.
+            In case of custom ``objective``, predicted values are returned before any transformation,
+            e.g. they are raw margin instead of probability of positive class for binary task in this case.
        weight : array-like of shape = [n_samples]
            The weight of samples.
        group : array-like
@@ -287,7 +295,6 @@ _lgbmmodel_doc_custom_eval_note = """
        is_higher_better : bool
            Is eval result higher better, e.g. AUC is ``is_higher_better``.

-    For binary task, the y_pred is probability of positive class (or margin in case of custom ``objective``).
    For multi-class task, the y_pred is group by class_id first, then group by row_id.
    If you want to get i-th row y_pred in j-th class, the access way is y_pred[j * num_data + i].
 """
@@ -434,6 +441,8 @@ class LGBMModel(_LGBMModelBase):
                The target values.
            y_pred : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
                The predicted values.
+                Predicted values are returned before any transformation,
+                e.g. they are raw margin instead of probability of positive class for binary task.
            group : array-like
                Group/query data.
                Only used in the learning-to-rank task.
@@ -441,11 +450,12 @@ class LGBMModel(_LGBMModelBase):
                For example, if you have a 100-document dataset with ``group = [10, 20, 40, 10, 10, 10]``, that means that you have 6 groups,
                where the first 10 records are in the first group, records 11-30 are in the second group, records 31-70 are in the third group, etc.
            grad : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
-                The value of the first order derivative (gradient) for each sample point.
+                The value of the first order derivative (gradient) of the loss
+                with respect to the elements of y_pred for each sample point.
            hess : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
-                The value of the second order derivative (Hessian) for each sample point.
+                The value of the second order derivative (Hessian) of the loss
+                with respect to the elements of y_pred for each sample point.

-        For binary task, the y_pred is margin.
        For multi-class task, the y_pred is group by class_id first, then group by row_id.
        If you want to get i-th row y_pred in j-th class, the access way is y_pred[j * num_data + i]
        and you should group grad and hess in this way as well.
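
The multi-class note above describes a flat layout in which the value for row `i` and class `j` sits at index `j * num_data + i`, and `grad` and `hess` must follow the same layout. The sketch below illustrates that indexing with a simple softmax cross-entropy objective in pure Python. The `num_class` argument is an assumption added for the example (the documented signature is `func(y_true, y_pred)`, so it would have to be bound separately, e.g. with `functools.partial`), and the diagonal Hessian `p * (1 - p)` is a common simplification rather than anything this commit prescribes.

```python
import math


def softmax_objective(y_true, y_pred, num_class):
    """Softmax cross-entropy objective on class-major flat predictions.

    y_pred[j * num_data + i] is the raw score of row i for class j;
    grad and hess are returned in the same layout.
    """
    num_data = len(y_true)
    grad = [0.0] * (num_class * num_data)
    hess = [0.0] * (num_class * num_data)
    for i in range(num_data):
        # Gather the scores of row i across all classes.
        row = [y_pred[j * num_data + i] for j in range(num_class)]
        top = max(row)  # subtract the max for numerical stability
        exps = [math.exp(v - top) for v in row]
        total = sum(exps)
        for j in range(num_class):
            p = exps[j] / total
            idx = j * num_data + i
            grad[idx] = p - (1.0 if y_true[i] == j else 0.0)
            hess[idx] = p * (1.0 - p)
    return grad, hess


# Two rows, three classes, all raw scores zero.
labels = [0, 2]
flat_scores = [0.0] * (3 * len(labels))
grad, hess = softmax_objective(labels, flat_scores, num_class=3)
```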