Unverified commit 5e9b0209, authored by Nikita Titov, committed by GitHub

[python][docs] fix type hints for custom functions and remove vague `array-like` wording (#4816)

* Update sklearn.py

* Update engine.py

* Update sklearn.py

* Update engine.py

* Update basic.py

* Update engine.py
parent a1fdeb1f
@@ -112,6 +112,7 @@ _LIB = _load_lib()
 NUMERIC_TYPES = (int, float, bool)
+_ArrayLike = Union[List, np.ndarray, pd_Series]
 def _safe_call(ret: int) -> None:
@@ -705,7 +706,7 @@ class Sequence(abc.ABC):
 Returns
 -------
-result : numpy 1-D array, numpy 2-D array
+result : numpy 1-D array or numpy 2-D array
 1-D array if idx is int, 2-D array if idx is slice or list.
 """
 raise NotImplementedError("Sub-classes of lightgbm.Sequence must implement __getitem__()")
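The `__getitem__` contract fixed in this hunk (1-D row for an int index, 2-D block for a slice or list) can be sketched with a numpy-backed subclass. This is a minimal stand-in, not the library's own class: `NumpySequence` is hypothetical, and the abstract base here only mirrors the documented `lightgbm.Sequence` interface.

```python
import abc

import numpy as np

class Sequence(abc.ABC):
    """Hypothetical stand-in mirroring the lightgbm.Sequence contract above."""

    @abc.abstractmethod
    def __getitem__(self, idx):
        raise NotImplementedError("Sub-classes of lightgbm.Sequence must implement __getitem__()")

class NumpySequence(Sequence):
    def __init__(self, data):
        self._data = np.asarray(data)

    def __getitem__(self, idx):
        # numpy indexing already satisfies the documented contract:
        # a 1-D array for an int index, a 2-D array for a slice or list.
        return self._data[idx]

    def __len__(self):
        return len(self._data)

seq = NumpySequence([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```

`seq[0]` yields a single 1-D row, while `seq[0:2]` and `seq[[0, 2]]` yield 2-D blocks, which is exactly the distinction the reworded docstring line describes.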
@@ -2264,7 +2265,7 @@ class Dataset:
 Returns
 -------
-feature_names : list
+feature_names : list of str
 The names of columns (features) in the Dataset.
 """
 if self.handle is None:
@@ -2994,16 +2995,16 @@ class Booster:
 Should accept two parameters: preds, train_data,
 and return (grad, hess).
-preds : list or numpy 1-D array
+preds : numpy 1-D array
 The predicted values.
 Predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task.
 train_data : Dataset
 The training dataset.
-grad : list or numpy 1-D array
+grad : list, numpy 1-D array or pandas Series
 The value of the first order derivative (gradient) of the loss
 with respect to the elements of preds for each sample point.
-hess : list or numpy 1-D array
+hess : list, numpy 1-D array or pandas Series
 The value of the second order derivative (Hessian) of the loss
 with respect to the elements of preds for each sample point.
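The tightened contract — `preds` always arrives as a numpy 1-D array, while `grad`/`hess` may be returned as a list, array, or Series — can be sketched with a binary-logloss objective. This is a numpy-only illustration: `_FakeDataset` is a hypothetical stand-in for `lightgbm.Dataset` (exposing only `get_label()`), not part of the library.

```python
import numpy as np

class _FakeDataset:
    """Hypothetical stand-in for lightgbm.Dataset, exposing only get_label()."""
    def __init__(self, label):
        self._label = np.asarray(label, dtype=np.float64)

    def get_label(self):
        return self._label

def logloss_objective(preds, train_data):
    # Matches the documented signature: (preds, train_data) -> (grad, hess).
    # preds are raw margin scores (a numpy 1-D array, per the docstring above).
    y = train_data.get_label()
    p = 1.0 / (1.0 + np.exp(-preds))  # sigmoid of the raw margin
    grad = p - y                      # first derivative of binary logloss
    hess = p * (1.0 - p)              # second derivative of binary logloss
    return grad, hess

preds = np.array([-2.0, 0.0, 3.0])
data = _FakeDataset([0.0, 1.0, 1.0])
grad, hess = logloss_objective(preds, data)
```

Returning plain lists (e.g. `grad.tolist()`) would also satisfy the widened return type.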
@@ -3062,10 +3063,10 @@ class Booster:
 Parameters
 ----------
-grad : list or numpy 1-D array
+grad : list, numpy 1-D array or pandas Series
 The value of the first order derivative (gradient) of the loss
 with respect to the elements of score for each sample point.
-hess : list or numpy 1-D array
+hess : list, numpy 1-D array or pandas Series
 The value of the second order derivative (Hessian) of the loss
 with respect to the elements of score for each sample point.
@@ -3186,7 +3187,7 @@ class Booster:
 Should accept two parameters: preds, eval_data,
 and return (eval_name, eval_result, is_higher_better) or list of such tuples.
-preds : list or numpy 1-D array
+preds : numpy 1-D array
 The predicted values.
 If ``fobj`` is specified, predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task in this case.
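A custom metric under this contract takes the numpy 1-D `preds` plus a dataset and returns a `(eval_name, eval_result, is_higher_better)` tuple. The sketch below is numpy-only; `_FakeDataset` is a hypothetical stand-in for `lightgbm.Dataset`, and thresholding at 0 assumes `preds` are raw margins (the `fobj` case described above).

```python
import numpy as np

class _FakeDataset:
    """Hypothetical stand-in for lightgbm.Dataset, exposing only get_label()."""
    def __init__(self, label):
        self._label = np.asarray(label, dtype=np.float64)

    def get_label(self):
        return self._label

def binary_error(preds, eval_data):
    # Matches the documented contract:
    # (preds, eval_data) -> (eval_name, eval_result, is_higher_better).
    y = eval_data.get_label()
    # With a custom fobj, preds are raw margins, so threshold at 0.
    pred_label = (preds > 0).astype(np.float64)
    return 'binary_error', float(np.mean(pred_label != y)), False

name, value, higher_better = binary_error(
    np.array([-1.0, 2.0, 3.0]), _FakeDataset([0.0, 1.0, 0.0])
)
```

`is_higher_better` is `False` here because a lower error rate is better.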
@@ -3234,7 +3235,7 @@ class Booster:
 Should accept two parameters: preds, train_data,
 and return (eval_name, eval_result, is_higher_better) or list of such tuples.
-preds : list or numpy 1-D array
+preds : numpy 1-D array
 The predicted values.
 If ``fobj`` is specified, predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task in this case.
@@ -3267,7 +3268,7 @@ class Booster:
 Should accept two parameters: preds, valid_data,
 and return (eval_name, eval_result, is_higher_better) or list of such tuples.
-preds : list or numpy 1-D array
+preds : numpy 1-D array
 The predicted values.
 If ``fobj`` is specified, predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task in this case.
@@ -3656,7 +3657,7 @@ class Booster:
 Returns
 -------
-result : list
+result : list of str
 List with names of features.
 """
 num_feature = self.num_feature()
......
@@ -9,15 +9,15 @@ from typing import Any, Callable, Dict, List, Optional, Tuple, Union
 import numpy as np
 from . import callback
-from .basic import Booster, Dataset, LightGBMError, _ConfigAliases, _InnerPredictor, _log_warning
+from .basic import Booster, Dataset, LightGBMError, _ArrayLike, _ConfigAliases, _InnerPredictor, _log_warning
 from .compat import SKLEARN_INSTALLED, _LGBMGroupKFold, _LGBMStratifiedKFold
 _LGBM_CustomObjectiveFunction = Callable[
-    [Union[List, np.ndarray], Dataset],
-    Tuple[Union[List, np.ndarray], Union[List, np.ndarray]]
+    [np.ndarray, Dataset],
+    Tuple[_ArrayLike, _ArrayLike]
 ]
 _LGBM_CustomMetricFunction = Callable[
-    [Union[List, np.ndarray], Dataset],
+    [np.ndarray, Dataset],
     Tuple[str, float, bool]
 ]
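The revised aliases narrow the *input* (`preds` is always `np.ndarray`) while the *output* stays flexible (`_ArrayLike`). A small typing sketch shows a function conforming to the new objective alias; `Dataset` and `_ArrayLike` are hypothetical stand-ins here so the snippet runs without lightgbm or pandas installed (the real `_ArrayLike` also includes `pd_Series`).

```python
from typing import Callable, List, Tuple, Union

import numpy as np

# Hypothetical stand-ins for this sketch only:
Dataset = object                       # the real type is lightgbm.Dataset
_ArrayLike = Union[List, np.ndarray]   # the real alias also includes pd_Series

_LGBM_CustomObjectiveFunction = Callable[
    [np.ndarray, Dataset],
    Tuple[_ArrayLike, _ArrayLike]
]

def zero_objective(preds: np.ndarray, train_data: Dataset) -> Tuple[_ArrayLike, _ArrayLike]:
    # The input is always a numpy 1-D array; the returned grad/hess may be a
    # plain list or an ndarray, matching the widened return annotation.
    return [0.0] * len(preds), np.zeros(len(preds))

grad, hess = zero_objective(np.array([1.0, 2.0]), Dataset())
```

A static checker such as mypy would accept `zero_objective` wherever `_LGBM_CustomObjectiveFunction` is expected, which is the point of the asymmetric annotation.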
@@ -59,16 +59,16 @@ def train(
 Should accept two parameters: preds, train_data,
 and return (grad, hess).
-preds : list or numpy 1-D array
+preds : numpy 1-D array
 The predicted values.
 Predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task.
 train_data : Dataset
 The training dataset.
-grad : list or numpy 1-D array
+grad : list, numpy 1-D array or pandas Series
 The value of the first order derivative (gradient) of the loss
 with respect to the elements of preds for each sample point.
-hess : list or numpy 1-D array
+hess : list, numpy 1-D array or pandas Series
 The value of the second order derivative (Hessian) of the loss
 with respect to the elements of preds for each sample point.
@@ -81,7 +81,7 @@ def train(
 Each evaluation function should accept two parameters: preds, train_data,
 and return (eval_name, eval_result, is_higher_better) or list of such tuples.
-preds : list or numpy 1-D array
+preds : numpy 1-D array
 The predicted values.
 If ``fobj`` is specified, predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task in this case.
@@ -469,16 +469,16 @@ def cv(params, train_set, num_boost_round=100,
 Should accept two parameters: preds, train_data,
 and return (grad, hess).
-preds : list or numpy 1-D array
+preds : numpy 1-D array
 The predicted values.
 Predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task.
 train_data : Dataset
 The training dataset.
-grad : list or numpy 1-D array
+grad : list, numpy 1-D array or pandas Series
 The value of the first order derivative (gradient) of the loss
 with respect to the elements of preds for each sample point.
-hess : list or numpy 1-D array
+hess : list, numpy 1-D array or pandas Series
 The value of the second order derivative (Hessian) of the loss
 with respect to the elements of preds for each sample point.
@@ -491,7 +491,7 @@ def cv(params, train_set, num_boost_round=100,
 Each evaluation function should accept two parameters: preds, train_data,
 and return (eval_name, eval_result, is_higher_better) or list of such tuples.
-preds : list or numpy 1-D array
+preds : numpy 1-D array
 The predicted values.
 If ``fobj`` is specified, predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task in this case.
......
@@ -6,15 +6,14 @@ from typing import Callable, Dict, List, Optional, Tuple, Union
 import numpy as np
-from .basic import Dataset, LightGBMError, _choose_param_value, _ConfigAliases, _log_warning
+from .basic import Dataset, LightGBMError, _ArrayLike, _choose_param_value, _ConfigAliases, _log_warning
 from .callback import log_evaluation, record_evaluation
 from .compat import (SKLEARN_INSTALLED, LGBMNotFittedError, _LGBMAssertAllFinite, _LGBMCheckArray,
                      _LGBMCheckClassificationTargets, _LGBMCheckSampleWeight, _LGBMCheckXY, _LGBMClassifierBase,
                      _LGBMComputeSampleWeight, _LGBMLabelEncoder, _LGBMModelBase, _LGBMRegressorBase, dt_DataTable,
-                     pd_DataFrame, pd_Series)
+                     pd_DataFrame)
 from .engine import train
-_ArrayLike = Union[List, np.ndarray, pd_Series]
 _EvalResultType = Tuple[str, float, bool]
 _LGBM_ScikitCustomObjectiveFunction = Union[
@@ -58,22 +57,22 @@ class _ObjectiveFunctionWrapper:
 Expects a callable with signature ``func(y_true, y_pred)`` or ``func(y_true, y_pred, group)``
 and returns (grad, hess):
-y_true : array-like of shape = [n_samples]
+y_true : numpy 1-D array of shape = [n_samples]
 The target values.
-y_pred : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+y_pred : numpy 1-D array of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The predicted values.
 Predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task.
-group : array-like
+group : numpy 1-D array
 Group/query data.
 Only used in the learning-to-rank task.
 sum(group) = n_samples.
 For example, if you have a 100-document dataset with ``group = [10, 20, 40, 10, 10, 10]``, that means that you have 6 groups,
 where the first 10 records are in the first group, records 11-30 are in the second group, records 31-70 are in the third group, etc.
-grad : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+grad : list, numpy 1-D array or pandas Series of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The value of the first order derivative (gradient) of the loss
 with respect to the elements of y_pred for each sample point.
-hess : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+hess : list, numpy 1-D array or pandas Series of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The value of the second order derivative (Hessian) of the loss
 with respect to the elements of y_pred for each sample point.
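The `group` semantics spelled out in this hunk — group sizes summing to `n_samples`, with the docstring's 100-document example — can be verified with a short numpy sketch: a cumulative sum of the sizes recovers the row boundaries of each query group.

```python
import numpy as np

# The docstring's example: a 100-document dataset split into 6 query groups.
group = np.array([10, 20, 40, 10, 10, 10])

# Prepend 0 to the running total so that the rows of query i are
# boundaries[i]:boundaries[i + 1] (half-open, 0-based).
boundaries = np.concatenate(([0], np.cumsum(group)))

# 0-based slice 10:30 covers records 11-30, the second group in the text above.
second_group = (int(boundaries[1]), int(boundaries[2]))
```

This boundary array is how a rank-aware objective would slice `y_pred` into per-query segments before computing pairwise gradients.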
@@ -90,17 +89,17 @@ class _ObjectiveFunctionWrapper:
 Parameters
 ----------
-preds : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+preds : numpy 1-D array of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The predicted values.
 dataset : Dataset
 The training dataset.
 Returns
 -------
-grad : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+grad : list, numpy 1-D array or pandas Series of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The value of the first order derivative (gradient) of the loss
 with respect to the elements of preds for each sample point.
-hess : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+hess : list, numpy 1-D array or pandas Series of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The value of the second order derivative (Hessian) of the loss
 with respect to the elements of preds for each sample point.
 """
@@ -151,15 +150,15 @@ class _EvalFunctionWrapper:
 and returns (eval_name, eval_result, is_higher_better) or
 list of (eval_name, eval_result, is_higher_better):
-y_true : array-like of shape = [n_samples]
+y_true : numpy 1-D array of shape = [n_samples]
 The target values.
-y_pred : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+y_pred : numpy 1-D array of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The predicted values.
 In case of custom ``objective``, predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task in this case.
-weight : array-like of shape = [n_samples]
+weight : numpy 1-D array of shape = [n_samples]
 The weight of samples.
-group : array-like
+group : numpy 1-D array
 Group/query data.
 Only used in the learning-to-rank task.
 sum(group) = n_samples.
@@ -184,7 +183,7 @@ class _EvalFunctionWrapper:
 Parameters
 ----------
-preds : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+preds : numpy 1-D array of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The predicted values.
 dataset : Dataset
 The training dataset.
@@ -304,15 +303,15 @@ _lgbmmodel_doc_custom_eval_note = """
 and returns (eval_name, eval_result, is_higher_better) or
 list of (eval_name, eval_result, is_higher_better):
-y_true : array-like of shape = [n_samples]
+y_true : numpy 1-D array of shape = [n_samples]
 The target values.
-y_pred : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+y_pred : numpy 1-D array of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The predicted values.
 In case of custom ``objective``, predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task in this case.
-weight : array-like of shape = [n_samples]
+weight : numpy 1-D array of shape = [n_samples]
 The weight of samples.
-group : array-like
+group : numpy 1-D array
 Group/query data.
 Only used in the learning-to-rank task.
 sum(group) = n_samples.
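A scikit-learn-style metric under this contract receives `y_true`, `y_pred`, and optionally the per-sample `weight` array, all as numpy 1-D arrays after this change. A minimal sketch, assuming a weighted mean-absolute-error metric (`weighted_mae` is an illustrative name, not a LightGBM built-in):

```python
import numpy as np

def weighted_mae(y_true, y_pred, weight):
    # Sketch of a metric using the weight argument; returns the documented
    # (eval_name, eval_result, is_higher_better) tuple.
    err = float(np.average(np.abs(y_true - y_pred), weights=weight))
    return 'weighted_mae', err, False

name, value, higher_better = weighted_mae(
    np.array([0.0, 0.0]), np.array([1.0, 3.0]), np.array([1.0, 3.0])
)
```

`np.average` with `weights` computes `sum(w * |err|) / sum(w)`, so heavily weighted samples dominate the score.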
@@ -481,22 +480,22 @@ class LGBMModel(_LGBMModelBase):
 ``objective(y_true, y_pred) -> grad, hess`` or
 ``objective(y_true, y_pred, group) -> grad, hess``:
-y_true : array-like of shape = [n_samples]
+y_true : numpy 1-D array of shape = [n_samples]
 The target values.
-y_pred : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+y_pred : numpy 1-D array of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The predicted values.
 Predicted values are returned before any transformation,
 e.g. they are raw margin instead of probability of positive class for binary task.
-group : array-like
+group : numpy 1-D array
 Group/query data.
 Only used in the learning-to-rank task.
 sum(group) = n_samples.
 For example, if you have a 100-document dataset with ``group = [10, 20, 40, 10, 10, 10]``, that means that you have 6 groups,
 where the first 10 records are in the first group, records 11-30 are in the second group, records 31-70 are in the third group, etc.
-grad : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+grad : list, numpy 1-D array or pandas Series of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The value of the first order derivative (gradient) of the loss
 with respect to the elements of y_pred for each sample point.
-hess : array-like of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
+hess : list, numpy 1-D array or pandas Series of shape = [n_samples] or shape = [n_samples * n_classes] (for multi-class task)
 The value of the second order derivative (Hessian) of the loss
 with respect to the elements of y_pred for each sample point.
......