Commit bd7274ba authored by wxchan, committed by Guolin Ke

add callbacks to sklearn interface (#150)

parent 8c6933ec
@@ -55,9 +55,9 @@ The methods of each Class is in alphabetical order.
Categorical features,
type int represents index,
type str represents feature names (need to specify feature_name as well)
params: dict, optional
params : dict, optional
Other parameters
free_raw_data: Bool
free_raw_data : Bool
True if you need to free raw data after constructing the inner dataset
@@ -78,7 +78,7 @@ The methods of each Class is in alphabetical order.
Group/query size for dataset
silent : boolean, optional
Whether to print messages during construction
params: dict, optional
params : dict, optional
Other parameters
@@ -400,7 +400,7 @@ The methods of each Class is in alphabetical order.
----------
filename : str
Filename to save
num_iteration: int
num_iteration : int
Number of iterations to save. < 0 means save all
@@ -497,14 +497,15 @@ The methods of each Class is in alphabetical order.
or the boosting stage found by using `early_stopping_rounds` is also printed.
Example: with verbose_eval=4 and at least one item in evals,
an evaluation metric is printed every 4 (instead of 1) boosting stages.
learning_rates: list or function
learning_rates : list or function
List of learning rates for each boosting round
or a customized function that calculates the learning rate
from the current round number (e.g. to produce learning rate decay)
- list l: learning_rate = l[current_round]
- function f: learning_rate = f(current_round)
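The two accepted forms of `learning_rates` can be sketched in plain Python. This is an illustrative sketch of the documented semantics only, not LightGBM's internal code; `resolve_learning_rate` is a hypothetical helper name:

```python
# Hypothetical helper illustrating how a list vs. a function form of
# `learning_rates` maps to a per-round learning rate. Not library code.

def resolve_learning_rate(learning_rates, current_round):
    if callable(learning_rates):
        # function f: learning_rate = f(current_round)
        return learning_rates(current_round)
    # list l: learning_rate = l[current_round]
    return learning_rates[current_round]

# A list gives an explicit rate per boosting round:
rates = [0.1, 0.05, 0.025]
# A function can express decay, e.g. halving the base rate each round:
decay = lambda current_round: 0.1 * (0.5 ** current_round)
```

With `decay`, round 0 trains at 0.1, round 1 at 0.05, round 2 at 0.025, and so on.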
callbacks : list of callback functions
List of callback functions that are applied at end of each iteration.
List of callback functions that are applied at each iteration.
See Callbacks in Python-API.md for more information.
Returns
-------
@@ -643,13 +644,13 @@ The methods of each Class is in alphabetical order.
y_true: array_like of shape [n_samples]
The target values
y_pred: array_like of shape [n_samples] or shape[n_samples* n_class]
y_pred: array_like of shape [n_samples] or shape[n_samples * n_class]
The predicted values
group: array_like
group/query data, used for ranking task
grad: array_like of shape [n_samples] or shape[n_samples* n_class]
grad: array_like of shape [n_samples] or shape[n_samples * n_class]
The value of the gradient for each sample point.
hess: array_like of shape [n_samples] or shape[n_samples* n_class]
hess: array_like of shape [n_samples] or shape[n_samples * n_class]
The value of the second derivative for each sample point
for multi-class task, y_pred is grouped by class_id first, then by row_id
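A minimal concrete example of this signature is the least-squares objective (mirroring `objective_ls` from the test suite in this commit). Plain lists stand in for array_like here; real code would typically use numpy arrays:

```python
# Least-squares objective in the documented (y_true, y_pred) form:
# loss = 0.5 * (y_pred - y_true)**2, so grad = y_pred - y_true and hess = 1
# for every sample point.

def objective_ls(y_true, y_pred):
    grad = [p - t for t, p in zip(y_true, y_pred)]
    hess = [1.0] * len(y_true)
    return grad, hess
```

For example, `objective_ls([1.0, 2.0], [1.5, 1.5])` returns `([0.5, -0.5], [1.0, 1.0])`.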
@@ -703,7 +704,7 @@ The methods of each Class is in alphabetical order.
Array of normalized feature importances
####fit(X, y, sample_weight=None, init_score=None, group=None, eval_set=None, eval_sample_weight=None, eval_init_score=None, eval_group=None, eval_metric=None, early_stopping_rounds=None, verbose=True, feature_name=None, categorical_feature=None)
####fit(X, y, sample_weight=None, init_score=None, group=None, eval_set=None, eval_sample_weight=None, eval_init_score=None, eval_group=None, eval_metric=None, early_stopping_rounds=None, verbose=True, feature_name=None, categorical_feature=None, callbacks=None)
Fit the gradient boosting model.
@@ -721,12 +722,12 @@ The methods of each Class is in alphabetical order.
group data of training data
eval_set : list, optional
A list of (X, y) tuple pairs to use as a validation set for early-stopping
eval_sample_weight : List or Dict of array
weight of eval data
eval_init_score : List or Dict of array
init score of eval data
eval_group : List or Dict of array
group data of eval data
eval_sample_weight : list or dict of array
weight of eval data; if you use dict, the index should start from 0
eval_init_score : list or dict of array
init score of eval data; if you use dict, the index should start from 0
eval_group : list or dict of array
group data of eval data; if you use dict, the index should start from 0
eval_metric : str, list of str, callable, optional
If a str, should be a built-in evaluation metric to use.
If callable, a custom evaluation metric, see note for more details.
@@ -739,7 +740,10 @@ The methods of each Class is in alphabetical order.
categorical_feature : list of str or int
Categorical features,
type int represents index,
type str represents feature names (need to specify feature_name as well)
type str represents feature names (need to specify feature_name as well).
callbacks : list of callback functions
List of callback functions that are applied at each iteration.
See Callbacks in Python-API.md for more information.
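In spirit, a callback is just a callable invoked once per boosting iteration with the current training state. A standalone sketch follows; the dict-based `env` and the plain loop are stand-ins for LightGBM's internals, not its actual API:

```python
# Standalone sketch: callbacks are callables run once per iteration.
# The dict `env` is a stand-in for the state object LightGBM passes.

def make_iteration_logger(log):
    def _callback(env):
        log.append(env["iteration"])  # record the iteration we were called at
    return _callback

def run_iterations(num_boost_round, callbacks):
    for i in range(num_boost_round):
        # ... one boosting round would happen here ...
        for cb in callbacks:
            cb({"iteration": i})

seen = []
run_iterations(3, [make_iteration_logger(seen)])
# seen is now [0, 1, 2]
```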
Note
----
@@ -807,7 +811,7 @@ The methods of each Class is in alphabetical order.
###LGBMRanker
####fit(X, y, sample_weight=None, init_score=None, group=None, eval_set=None, eval_sample_weight=None, eval_init_score=None, eval_group=None, eval_metric='ndcg', eval_at=1, early_stopping_rounds=None, verbose=True, feature_name=None, categorical_feature=None)
####fit(X, y, sample_weight=None, init_score=None, group=None, eval_set=None, eval_sample_weight=None, eval_init_score=None, eval_group=None, eval_metric='ndcg', eval_at=1, early_stopping_rounds=None, verbose=True, feature_name=None, categorical_feature=None, callbacks=None)
Most arguments are the same as in Common Methods, except:
@@ -74,7 +74,8 @@ def train(params, train_set, num_boost_round=100,
- list l: learning_rate = l[current_round]
- function f: learning_rate = f(current_round)
callbacks : list of callback functions
List of callback functions that are applied at end of each iteration.
List of callback functions that are applied at each iteration.
See Callbacks in Python-API.md for more information.
Returns
-------
@@ -319,7 +320,8 @@ def cv(params, train_set, num_boost_round=10, nfold=5, stratified=False,
seed : int
Seed used to generate the folds (passed to numpy.random.seed).
callbacks : list of callback functions
List of callback functions that are applied at end of each iteration.
List of callback functions that are applied at each iteration.
See Callbacks in Python-API.md for more information.
Returns
-------
@@ -35,7 +35,7 @@ def _objective_function_wrapper(func):
Expects a callable with signature ``func(y_true, y_pred)`` or ``func(y_true, y_pred, group)``:
y_true: array_like of shape [n_samples]
The target values
y_pred: array_like of shape [n_samples] or shape[n_samples* n_class] (for multi-class)
y_pred: array_like of shape [n_samples] or shape[n_samples * n_class] (for multi-class)
The predicted values
group: array_like
group/query data, used for ranking task
@@ -46,7 +46,7 @@ def _objective_function_wrapper(func):
The new objective function as expected by ``lightgbm.engine.train``.
The signature is ``new_func(preds, dataset)``:
preds: array_like, shape [n_samples] or shape[n_samples* n_class]
preds: array_like, shape [n_samples] or shape[n_samples * n_class]
The predicted values
dataset: ``dataset``
The training set from which the labels will be extracted using
@@ -97,7 +97,7 @@ def _eval_function_wrapper(func):
y_true: array_like of shape [n_samples]
The target values
y_pred: array_like of shape [n_samples] or shape[n_samples* n_class] (for multi-class)
y_pred: array_like of shape [n_samples] or shape[n_samples * n_class] (for multi-class)
The predicted values
weight: array_like of shape [n_samples]
The weight of samples
@@ -110,7 +110,7 @@ def _eval_function_wrapper(func):
The new eval function as expected by ``lightgbm.engine.train``.
The signature is ``new_func(preds, dataset)``:
preds: array_like, shape [n_samples] or shape[n_samples* n_class]
preds: array_like, shape [n_samples] or shape[n_samples * n_class]
The predicted values
dataset: ``dataset``
The training set from which the labels will be extracted using
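For comparison, a custom eval metric in the wrapped (y_true, y_pred) form might look like the following sketch, assuming the sklearn-style convention that the metric returns an (eval_name, eval_result, is_higher_better) tuple; plain lists stand in for array_like inputs:

```python
# Mean absolute error as a custom eval metric; lower is better, hence the
# final False in the returned tuple.

def mae(y_true, y_pred):
    err = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    return 'mae', err, False
```

For example, `mae([1.0, 2.0], [1.5, 1.5])` returns `('mae', 0.5, False)`.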
@@ -209,13 +209,13 @@ class LGBMModel(LGBMModelBase):
y_true: array_like of shape [n_samples]
The target values
y_pred: array_like of shape [n_samples] or shape[n_samples* n_class]
y_pred: array_like of shape [n_samples] or shape[n_samples * n_class]
The predicted values
group: array_like
group/query data, used for ranking task
grad: array_like of shape [n_samples] or shape[n_samples* n_class]
grad: array_like of shape [n_samples] or shape[n_samples * n_class]
The value of the gradient for each sample point.
hess: array_like of shape [n_samples] or shape[n_samples* n_class]
hess: array_like of shape [n_samples] or shape[n_samples * n_class]
The value of the second derivative for each sample point
for multi-class task, y_pred is grouped by class_id first, then by row_id
@@ -276,7 +276,8 @@ class LGBMModel(LGBMModelBase):
eval_init_score=None, eval_group=None,
eval_metric=None,
early_stopping_rounds=None, verbose=True,
feature_name=None, categorical_feature=None):
feature_name=None, categorical_feature=None,
callbacks=None):
"""
Fit the gradient boosting model
@@ -312,6 +313,9 @@ class LGBMModel(LGBMModelBase):
Categorical features,
type int represents index,
type str represents feature names (need to specify feature_name as well)
callbacks : list of callback functions
List of callback functions that are applied at each iteration.
See Callbacks in Python-API.md for more information.
Note
----
@@ -398,7 +402,8 @@ class LGBMModel(LGBMModelBase):
early_stopping_rounds=early_stopping_rounds,
evals_result=evals_result, fobj=self.fobj, feval=feval,
verbose_eval=verbose, feature_name=feature_name,
categorical_feature=categorical_feature)
categorical_feature=categorical_feature,
callbacks=callbacks)
if evals_result:
for val in evals_result.items():
@@ -525,7 +530,8 @@ class LGBMClassifier(LGBMModel, LGBMClassifierBase):
eval_init_score=None,
eval_metric="binary_logloss",
early_stopping_rounds=None, verbose=True,
feature_name=None, categorical_feature=None):
feature_name=None, categorical_feature=None,
callbacks=None):
self._le = LGBMLabelEncoder().fit(y)
y = self._le.transform(y)
@@ -547,7 +553,8 @@ class LGBMClassifier(LGBMModel, LGBMClassifierBase):
eval_metric=eval_metric,
early_stopping_rounds=early_stopping_rounds,
verbose=verbose, feature_name=feature_name,
categorical_feature=categorical_feature)
categorical_feature=categorical_feature,
callbacks=callbacks)
return self
def predict(self, data, raw_score=False, num_iteration=0):
@@ -616,7 +623,8 @@ class LGBMRanker(LGBMModel):
eval_init_score=None, eval_group=None,
eval_metric='ndcg', eval_at=1,
early_stopping_rounds=None, verbose=True,
feature_name=None, categorical_feature=None):
feature_name=None, categorical_feature=None,
callbacks=None):
"""
Most arguments are the same as in common methods, except the following:
@@ -633,10 +641,9 @@ class LGBMRanker(LGBMModel):
raise ValueError("Eval_group cannot be None when eval_set is not None")
elif len(eval_group) != len(eval_set):
raise ValueError("Length of eval_group should equal to eval_set")
else:
for inner_group in eval_group:
if inner_group is None:
raise ValueError("Should set group for all eval dataset for ranking task")
elif (isinstance(eval_group, dict) and any(i not in eval_group or eval_group[i] is None for i in range(len(eval_group)))) \
or (isinstance(eval_group, list) and any(group is None for group in eval_group)):
raise ValueError("Should set group for all eval dataset for ranking task; if you use dict, the index should start from 0")
if eval_at is not None:
self.eval_at = eval_at
@@ -647,5 +654,6 @@ class LGBMRanker(LGBMModel):
eval_metric=eval_metric,
early_stopping_rounds=early_stopping_rounds,
verbose=verbose, feature_name=feature_name,
categorical_feature=categorical_feature)
categorical_feature=categorical_feature,
callbacks=callbacks)
return self
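The group-validation rule in LGBMRanker.fit (every eval dataset needs group data, and dict keys must run from 0) can be isolated into a small predicate for illustration. `eval_group_is_valid` is a hypothetical name, not library code:

```python
# Hypothetical predicate mirroring the eval_group check in LGBMRanker.fit:
# a dict must provide a non-None group for every index 0..n-1; a list must
# contain no None entries.

def eval_group_is_valid(eval_group, num_eval_sets):
    if isinstance(eval_group, dict):
        return all(i in eval_group and eval_group[i] is not None
                   for i in range(num_eval_sets))
    return all(group is not None for group in eval_group)
```

For instance, `{0: q0, 1: q1}` passes for two eval sets, while `{1: q1}` fails because index 0 is missing.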
@@ -43,7 +43,14 @@ class TestSklearn(unittest.TestCase):
X_train, y_train = load_svmlight_file('../../examples/lambdarank/rank.train')
X_test, y_test = load_svmlight_file('../../examples/lambdarank/rank.test')
q_train = np.loadtxt('../../examples/lambdarank/rank.train.query')
lgb_model = lgb.LGBMRanker().fit(X_train, y_train, group=q_train, eval_at=[1])
q_test = np.loadtxt('../../examples/lambdarank/rank.test.query')
lgb_model = lgb.LGBMRanker().fit(X_train, y_train,
group=q_train,
eval_set=[(X_test, y_test)],
eval_group=[q_test],
eval_at=[1],
verbose=False,
callbacks=[lgb.reset_parameter(learning_rate=lambda x: 0.95 ** x * 0.1)])
def test_regression_with_custom_objective(self):
def objective_ls(y_true, y_pred):