Commit 5e9b0209 authored by Nikita Titov, committed by GitHub

[python][docs] fix type hints for custom functions and remove vague `array-like` wording (#4816)

* Update sklearn.py

* Update engine.py

* Update sklearn.py

* Update engine.py

* Update basic.py

* Update engine.py
parent a1fdeb1f
@@ -112,6 +112,7 @@ _LIB = _load_lib()
 NUMERIC_TYPES = (int, float, bool)
+_ArrayLike = Union[List, np.ndarray, pd_Series]
 def _safe_call(ret: int) -> None:
@@ -705,7 +706,7 @@ class Sequence(abc.ABC):
 Returns
 -------
-result : numpy 1-D array, numpy 2-D array
+result : numpy 1-D array or numpy 2-D array
 1-D array if idx is int, 2-D array if idx is slice or list.
 """
 raise NotImplementedError("Sub-classes of lightgbm.Sequence must implement __getitem__()")
@@ -2264,7 +2265,7 @@ class Dataset:
 Returns
 -------
-feature_names : list
+feature_names : list of str
 The names of columns (features) in the Dataset.
 """
 if self.handle is None:
@@ -2994,16 +2995,16 @@ class Booster:
 Should accept two parameters: preds, train_data,
 and return (grad, hess).
-preds : list or numpy 1-D array
+preds : numpy 1-D array
 The predicted values.
 Predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task.
 train_data : Dataset
 The training dataset.
-grad : list or numpy 1-D array
+grad : list, numpy 1-D array or pandas Series
 The value of the first order derivative (gradient) of the loss
 with respect to the elements of preds for each sample point.
-hess : list or numpy 1-D array
+hess : list, numpy 1-D array or pandas Series
 The value of the second order derivative (Hessian) of the loss
 with respect to the elements of preds for each sample point.
@@ -3062,10 +3063,10 @@ class Booster:
 Parameters
 ----------
-grad : list or numpy 1-D array
+grad : list, numpy 1-D array or pandas Series
 The value of the first order derivative (gradient) of the loss
 with respect to the elements of score for each sample point.
-hess : list or numpy 1-D array
+hess : list, numpy 1-D array or pandas Series
 The value of the second order derivative (Hessian) of the loss
 with respect to the elements of score for each sample point.
@@ -3186,7 +3187,7 @@ class Booster:
 Should accept two parameters: preds, eval_data,
 and return (eval_name, eval_result, is_higher_better) or list of such tuples.
-preds : list or numpy 1-D array
+preds : numpy 1-D array
 The predicted values.
 If ``fobj`` is specified, predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task in this case.
@@ -3234,7 +3235,7 @@ class Booster:
 Should accept two parameters: preds, train_data,
 and return (eval_name, eval_result, is_higher_better) or list of such tuples.
-preds : list or numpy 1-D array
+preds : numpy 1-D array
 The predicted values.
 If ``fobj`` is specified, predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task in this case.
@@ -3267,7 +3268,7 @@ class Booster:
 Should accept two parameters: preds, valid_data,
 and return (eval_name, eval_result, is_higher_better) or list of such tuples.
-preds : list or numpy 1-D array
+preds : numpy 1-D array
 The predicted values.
 If ``fobj`` is specified, predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task in this case.
@@ -3656,7 +3657,7 @@ class Booster:
 Returns
 -------
-result : list
+result : list of str
 List with names of features.
 """
 num_feature = self.num_feature()
......
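For illustration, a custom evaluation function matching the docstrings above (``preds`` as a numpy 1-D array, returning ``(eval_name, eval_result, is_higher_better)``) might look like the sketch below. ``LabelOnlyDataset`` is a hypothetical stand-in for ``lightgbm.Dataset`` that exposes only ``get_label()``, so the snippet runs without LightGBM installed. The 0.5 threshold assumes no custom ``fobj`` is in use, so ``preds`` are probabilities rather than raw margins.

```python
import numpy as np


class LabelOnlyDataset:
    """Hypothetical stand-in for lightgbm.Dataset; only get_label() is mimicked."""

    def __init__(self, label):
        self._label = np.asarray(label, dtype=np.float64)

    def get_label(self):
        return self._label


def binary_error(preds: np.ndarray, eval_data):
    """Custom metric: fraction of misclassified samples.

    Assumes preds are probabilities of the positive class.
    """
    labels = eval_data.get_label()
    predicted_class = (preds > 0.5).astype(np.float64)
    error = float(np.mean(predicted_class != labels))
    # Tuple layout: (eval_name, eval_result, is_higher_better)
    return "binary_error", error, False


name, value, higher_better = binary_error(
    np.array([0.9, 0.2, 0.6, 0.4]),
    LabelOnlyDataset([1, 0, 0, 1]),
)
```

With a real ``Dataset``, a function of this shape is what the ``feval`` docstrings describe; the stand-in class is only there to keep the example self-contained.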
@@ -9,15 +9,15 @@ from typing import Any, Callable, Dict, List, Optional, Tuple, Union
 import numpy as np
 from . import callback
-from .basic import Booster, Dataset, LightGBMError, _ConfigAliases, _InnerPredictor, _log_warning
+from .basic import Booster, Dataset, LightGBMError, _ArrayLike, _ConfigAliases, _InnerPredictor, _log_warning
 from .compat import SKLEARN_INSTALLED, _LGBMGroupKFold, _LGBMStratifiedKFold
 _LGBM_CustomObjectiveFunction = Callable[
-[Union[List, np.ndarray], Dataset],
-Tuple[Union[List, np.ndarray], Union[List, np.ndarray]]
+[np.ndarray, Dataset],
+Tuple[_ArrayLike, _ArrayLike]
 ]
 _LGBM_CustomMetricFunction = Callable[
-[Union[List, np.ndarray], Dataset],
+[np.ndarray, Dataset],
 Tuple[str, float, bool]
 ]
@@ -59,16 +59,16 @@ def train(
 Should accept two parameters: preds, train_data,
 and return (grad, hess).
-preds : list or numpy 1-D array
+preds : numpy 1-D array
 The predicted values.
 Predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task.
 train_data : Dataset
 The training dataset.
-grad : list or numpy 1-D array
+grad : list, numpy 1-D array or pandas Series
 The value of the first order derivative (gradient) of the loss
 with respect to the elements of preds for each sample point.
-hess : list or numpy 1-D array
+hess : list, numpy 1-D array or pandas Series
 The value of the second order derivative (Hessian) of the loss
 with respect to the elements of preds for each sample point.
@@ -81,7 +81,7 @@ def train(
 Each evaluation function should accept two parameters: preds, train_data,
 and return (eval_name, eval_result, is_higher_better) or list of such tuples.
-preds : list or numpy 1-D array
+preds : numpy 1-D array
 The predicted values.
 If ``fobj`` is specified, predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task in this case.
@@ -469,16 +469,16 @@ def cv(params, train_set, num_boost_round=100,
 Should accept two parameters: preds, train_data,
 and return (grad, hess).
-preds : list or numpy 1-D array
+preds : numpy 1-D array
 The predicted values.
 Predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task.
 train_data : Dataset
 The training dataset.
-grad : list or numpy 1-D array
+grad : list, numpy 1-D array or pandas Series
 The value of the first order derivative (gradient) of the loss
 with respect to the elements of preds for each sample point.
-hess : list or numpy 1-D array
+hess : list, numpy 1-D array or pandas Series
 The value of the second order derivative (Hessian) of the loss
 with respect to the elements of preds for each sample point.
@@ -491,7 +491,7 @@ def cv(params, train_set, num_boost_round=100,
 Each evaluation function should accept two parameters: preds, train_data,
 and return (eval_name, eval_result, is_higher_better) or list of such tuples.
-preds : list or numpy 1-D array
+preds : numpy 1-D array
 The predicted values.
 If ``fobj`` is specified, predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task in this case.
......
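As a sketch of what the narrowed ``_LGBM_CustomObjectiveFunction`` signature implies, the function below takes ``preds`` as a numpy 1-D array of raw margins and returns ``(grad, hess)`` for binary log loss. The names ``LabelOnlyDataset`` and ``logistic_objective`` are hypothetical and not part of LightGBM; the stand-in class only imitates ``Dataset.get_label()`` so the example runs on its own.

```python
from typing import Tuple

import numpy as np


class LabelOnlyDataset:
    """Hypothetical stand-in for lightgbm.Dataset; only get_label() is mimicked."""

    def __init__(self, label):
        self._label = np.asarray(label, dtype=np.float64)

    def get_label(self):
        return self._label


def logistic_objective(preds: np.ndarray, train_data) -> Tuple[np.ndarray, np.ndarray]:
    """Binary log loss on raw margins.

    preds are raw scores (before the sigmoid), as the docstring states.
    """
    labels = train_data.get_label()
    prob = 1.0 / (1.0 + np.exp(-preds))  # sigmoid of the raw margin
    grad = prob - labels                 # first derivative w.r.t. preds
    hess = prob * (1.0 - prob)           # second derivative w.r.t. preds
    return grad, hess


grad, hess = logistic_objective(np.zeros(3), LabelOnlyDataset([1, 0, 1]))
```

Both return values here are numpy arrays; per the updated docstrings, lists and pandas Series are also documented as accepted for ``grad`` and ``hess``.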
@@ -6,15 +6,14 @@ from typing import Callable, Dict, List, Optional, Tuple, Union
 import numpy as np
-from .basic import Dataset, LightGBMError, _choose_param_value, _ConfigAliases, _log_warning
+from .basic import Dataset, LightGBMError, _ArrayLike, _choose_param_value, _ConfigAliases, _log_warning
 from .callback import log_evaluation, record_evaluation
 from .compat import (SKLEARN_INSTALLED, LGBMNotFittedError, _LGBMAssertAllFinite, _LGBMCheckArray,
 _LGBMCheckClassificationTargets, _LGBMCheckSampleWeight, _LGBMCheckXY, _LGBMClassifierBase,
 _LGBMComputeSampleWeight, _LGBMLabelEncoder, _LGBMModelBase, _LGBMRegressorBase, dt_DataTable,
-pd_DataFrame, pd_Series)
+pd_DataFrame)
 from .engine import train
-_ArrayLike = Union[List, np.ndarray, pd_Series]
 _EvalResultType = Tuple[str, float, bool]
 _LGBM_ScikitCustomObjectiveFunction = Union[
@@ -58,22 +57,22 @@ class _ObjectiveFunctionWrapper:
 Expects a callable with signature ``func(y_true, y_pred)`` or ``func(y_true, y_pred, group)``
 and returns (grad, hess):
-y_true : array-like of shape = [n_samples]
+y_true : numpy 1-D array of shape = [n_samples]
 The target values.
-y_pred : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+y_pred : numpy 1-D array of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The predicted values.
 Predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task.
-group : array-like
+group : numpy 1-D array
 Group/query data.
 Only used in the learning-to-rank task.
 sum(group) = n_samples.
 For example, if you have a 100-document dataset with ``group = [10, 20, 40, 10, 10, 10]``, that means that you have 6 groups,
 where the first 10 records are in the first group, records 11-30 are in the second group, records 31-70 are in the third group, etc.
-grad : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+grad : list, numpy 1-D array or pandas Series of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The value of the first order derivative (gradient) of the loss
 with respect to the elements of y_pred for each sample point.
-hess : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+hess : list, numpy 1-D array or pandas Series of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The value of the second order derivative (Hessian) of the loss
 with respect to the elements of y_pred for each sample point.
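The ``group`` description in this hunk can be checked with a few lines of arithmetic. The sketch below is illustrative only and uses the example values from the docstring, expanding the group sizes into 1-based, inclusive record ranges.

```python
from itertools import accumulate

# Group sizes from the docstring example: 6 groups over 100 documents.
group = [10, 20, 40, 10, 10, 10]
n_samples = sum(group)

# Running totals give the last record (1-based) of each group.
ends = list(accumulate(group))
# Each group starts right after the previous group's last record.
starts = [end - size for end, size in zip(ends, group)]
# 1-based inclusive (first_record, last_record) per group.
ranges = [(start + 1, end) for start, end in zip(starts, ends)]
```

The second and third entries of ``ranges`` correspond to the "records 11-30" and "records 31-70" groups named in the docstring.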
@@ -90,17 +89,17 @@ class _ObjectiveFunctionWrapper:
 Parameters
 ----------
-preds : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+preds : numpy 1-D array of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The predicted values.
 dataset : Dataset
 The training dataset.
 Returns
 -------
-grad : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+grad : list, numpy 1-D array or pandas Series of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The value of the first order derivative (gradient) of the loss
 with respect to the elements of preds for each sample point.
-hess : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+hess : list, numpy 1-D array or pandas Series of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The value of the second order derivative (Hessian) of the loss
 with respect to the elements of preds for each sample point.
 """
@@ -151,15 +150,15 @@ class _EvalFunctionWrapper:
 and returns (eval_name, eval_result, is_higher_better) or
 list of (eval_name, eval_result, is_higher_better):
-y_true : array-like of shape = [n_samples]
+y_true : numpy 1-D array of shape = [n_samples]
 The target values.
-y_pred : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+y_pred : numpy 1-D array of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The predicted values.
 In case of custom ``objective``, predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task in this case.
-weight : array-like of shape = [n_samples]
+weight : numpy 1-D array of shape = [n_samples]
 The weight of samples.
-group : array-like
+group : numpy 1-D array
 Group/query data.
 Only used in the learning-to-rank task.
 sum(group) = n_samples.
@@ -184,7 +183,7 @@ class _EvalFunctionWrapper:
 Parameters
 ----------
-preds : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+preds : numpy 1-D array of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The predicted values.
 dataset : Dataset
 The training dataset.
@@ -304,15 +303,15 @@ _lgbmmodel_doc_custom_eval_note = """
 and returns (eval_name, eval_result, is_higher_better) or
 list of (eval_name, eval_result, is_higher_better):
-y_true : array-like of shape = [n_samples]
+y_true : numpy 1-D array of shape = [n_samples]
 The target values.
-y_pred : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+y_pred : numpy 1-D array of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The predicted values.
 In case of custom ``objective``, predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task in this case.
-weight : array-like of shape = [n_samples]
+weight : numpy 1-D array of shape = [n_samples]
 The weight of samples.
-group : array-like
+group : numpy 1-D array
 Group/query data.
 Only used in the learning-to-rank task.
 sum(group) = n_samples.
@@ -481,22 +480,22 @@ class LGBMModel(_LGBMModelBase):
 ``objective(y_true, y_pred) -> grad, hess`` or
 ``objective(y_true, y_pred, group) -> grad, hess``:
-y_true : array-like of shape = [n_samples]
+y_true : numpy 1-D array of shape = [n_samples]
 The target values.
-y_pred : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+y_pred : numpy 1-D array of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The predicted values.
 Predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task.
-group : array-like
+group : numpy 1-D array
 Group/query data.
 Only used in the learning-to-rank task.
 sum(group) = n_samples.
 For example, if you have a 100-document dataset with ``group = [10, 20, 40, 10, 10, 10]``, that means that you have 6 groups,
 where the first 10 records are in the first group, records 11-30 are in the second group, records 31-70 are in the third group, etc.
-grad : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+grad : list, numpy 1-D array or pandas Series of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The value of the first order derivative (gradient) of the loss
 with respect to the elements of y_pred for each sample point.
-hess : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+hess : list, numpy 1-D array or pandas Series of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The value of the second order derivative (Hessian) of the loss
 with respect to the elements of y_pred for each sample point.
......
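For the scikit-learn style signature ``objective(y_true, y_pred) -> grad, hess`` documented above, a minimal sketch could look like the following. ``l2_objective`` is a hypothetical name; it implements the squared-error loss ``0.5 * (y_pred - y_true) ** 2`` and assumes both inputs arrive as numpy 1-D arrays, as the updated docstring states.

```python
import numpy as np


def l2_objective(y_true: np.ndarray, y_pred: np.ndarray):
    """Squared-error objective: loss = 0.5 * (y_pred - y_true) ** 2."""
    grad = y_pred - y_true        # first derivative w.r.t. y_pred
    hess = np.ones_like(y_pred)   # second derivative is constant 1
    return grad, hess


y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 1.0, 3.0])
grad, hess = l2_objective(y_true, y_pred)
```

Per the docstring, a callable of this shape would be supplied as the ``objective`` of an ``LGBMModel`` subclass, for example ``LGBMRegressor(objective=l2_objective)``; that call is not executed here so the snippet does not require LightGBM.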