tianlh / LightGBM-DCU · Commits

Commit bd7274ba
Authored Dec 31, 2016 by wxchan; committed by Guolin Ke, Dec 31, 2016

add callbacks to sklearn interface (#150)

Parent: 8c6933ec
Showing 4 changed files with 59 additions and 38 deletions.
docs/Python-API.md                         +22 -18
python-package/lightgbm/engine.py           +4  -2
python-package/lightgbm/sklearn.py         +25 -17
tests/python_package_test/test_sklearn.py   +8  -1
docs/Python-API.md
@@ -55,9 +55,9 @@ The methods of each Class is in alphabetical order.
     Categorical features,
     type int represents index,
     type str represents feature names (need to specify feature_name as well)
-params: dict, optional
+params : dict, optional
     Other parameters
-free_raw_data: Bool
+free_raw_data : Bool
     True if need to free raw data after construct inner dataset
@@ -78,7 +78,7 @@ The methods of each Class is in alphabetical order.
     Group/query size for dataset
 silent : boolean, optional
     Whether print messages during construction
-params: dict, optional
+params : dict, optional
     Other parameters
@@ -400,7 +400,7 @@ The methods of each Class is in alphabetical order.
 ----------
 filename : str
     Filename to save
-num_iteration: int
+num_iteration : int
     Number of iteration that want to save. < 0 means save all
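For reference, the num_iteration behavior documented above can be exercised on any trained Booster. A minimal sketch with synthetic data (the file name and parameters are illustrative, not from this commit):

    import lightgbm as lgb
    import numpy as np

    X, y = np.random.rand(100, 5), np.random.rand(100)
    bst = lgb.train({'objective': 'regression'}, lgb.Dataset(X, y), num_boost_round=10)
    # num_iteration < 0 saves all iterations, per the docstring above
    bst.save_model('model.txt', num_iteration=-1)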
@@ -497,14 +497,15 @@ The methods of each Class is in alphabetical order.
     or the boosting stage found by using `early_stopping_rounds` is also printed.
     Example: with verbose_eval=4 and at least one item in evals,
     an evaluation metric is printed every 4 (instead of 1) boosting stages.
-learning_rates: list or function
+learning_rates : list or function
     List of learning rate for each boosting round
     or a customized function that calculates learning_rate
     in terms of current number of round (e.g. yields learning rate decay)
     - list l: learning_rate = l[current_round]
     - function f: learning_rate = f(current_round)
 callbacks : list of callback functions
-    List of callback functions that are applied at end of each iteration.
+    List of callback functions that are applied at each iteration.
+    See Callbacks in Python-API.md for more information.
 Returns
 -------
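To make the two learning_rates forms concrete, a hedged sketch against lightgbm.train (synthetic data, arbitrary values):

    import lightgbm as lgb
    import numpy as np

    X, y = np.random.rand(200, 10), np.random.rand(200)

    # function form: learning_rate = f(current_round), e.g. exponential decay
    lgb.train({'objective': 'regression'}, lgb.Dataset(X, y), num_boost_round=20,
              learning_rates=lambda cur_round: 0.1 * (0.95 ** cur_round))

    # list form: learning_rate = l[current_round], one entry per boosting round
    lgb.train({'objective': 'regression'}, lgb.Dataset(X, y), num_boost_round=20,
              learning_rates=[0.1] * 10 + [0.05] * 10)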
@@ -643,13 +644,13 @@ The methods of each Class is in alphabetical order.
 y_true: array_like of shape [n_samples]
     The target values
-y_pred: array_like of shape [n_samples] or shape[n_samples* n_class]
+y_pred: array_like of shape [n_samples] or shape[n_samples * n_class]
     The predicted values
 group: array_like
     group/query data, used for ranking task
-grad: array_like of shape [n_samples] or shape[n_samples* n_class]
+grad: array_like of shape [n_samples] or shape[n_samples * n_class]
     The value of the gradient for each sample point.
-hess: array_like of shape [n_samples] or shape[n_samples* n_class]
+hess: array_like of shape [n_samples] or shape[n_samples * n_class]
     The value of the second derivative for each sample point
 for multi-class task, the y_pred is group by class_id first, then group by row_id
@@ -703,7 +704,7 @@ The methods of each Class is in alphabetical order.
     Array of normailized feature importances
-####fit(X, y, sample_weight=None, init_score=None, group=None, eval_set=None, eval_sample_weight=None, eval_init_score=None, eval_group=None, eval_metric=None, early_stopping_rounds=None, verbose=True, feature_name=None, categorical_feature=None)
+####fit(X, y, sample_weight=None, init_score=None, group=None, eval_set=None, eval_sample_weight=None, eval_init_score=None, eval_group=None, eval_metric=None, early_stopping_rounds=None, verbose=True, feature_name=None, categorical_feature=None, callbacks=None)
 Fit the gradient boosting model.
@@ -721,12 +722,12 @@ The methods of each Class is in alphabetical order.
     group data of training data
 eval_set : list, optional
     A list of (X, y) tuple pairs to use as a validation set for early-stopping
-eval_sample_weight : List or Dict of array
-    weight of eval data
-eval_init_score : List or Dict of array
-    init score of eval data
-eval_group : List or Dict of array
-    group data of eval data
+eval_sample_weight : list or dict of array
+    weight of eval data; if you use dict, the index should start from 0
+eval_init_score : list or dict of array
+    init score of eval data; if you use dict, the index should start from 0
+eval_group : list or dict of array
+    group data of eval data; if you use dict, the index should start from 0
 eval_metric : str, list of str, callable, optional
     If a str, should be a built-in evaluation metric to use.
     If callable, a custom evaluation metric, see note for more details.
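To illustrate the list-or-dict convention just documented, a hedged sketch with synthetic data (LGBMRegressor and its defaults are assumed from the surrounding API):

    import lightgbm as lgb
    import numpy as np

    X, y = np.random.rand(100, 5), np.random.rand(100)
    valid = [(np.random.rand(30, 5), np.random.rand(30)),
             (np.random.rand(30, 5), np.random.rand(30))]

    # list form: one weight array per eval set, in eval_set order
    lgb.LGBMRegressor().fit(X, y, eval_set=valid,
                            eval_sample_weight=[np.ones(30), np.ones(30)],
                            verbose=False)

    # dict form: keys are eval_set indices and must start from 0
    lgb.LGBMRegressor().fit(X, y, eval_set=valid,
                            eval_sample_weight={0: np.ones(30), 1: np.ones(30)},
                            verbose=False)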
@@ -739,7 +740,10 @@ The methods of each Class is in alphabetical order.
 categorical_feature : list of str or int
     Categorical features,
     type int represents index,
-    type str represents feature names (need to specify feature_name as well)
+    type str represents feature names (need to specify feature_name as well).
+callbacks : list of callback functions
+    List of callback functions that are applied at each iteration.
+    See Callbacks in Python-API.md for more information.
 Note
 ----
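With callbacks now accepted by fit, the sklearn interface can reuse the callbacks already available to lightgbm.train. A minimal sketch modeled on the new test in this commit (lgb.reset_parameter decays the learning rate each round; data is synthetic):

    import lightgbm as lgb
    import numpy as np

    X, y = np.random.rand(100, 5), np.random.randint(0, 2, 100)
    clf = lgb.LGBMClassifier()
    clf.fit(X, y,
            callbacks=[lgb.reset_parameter(learning_rate=lambda i: 0.1 * (0.95 ** i))])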
@@ -807,7 +811,7 @@ The methods of each Class is in alphabetical order.
 ###LGBMRanker
-####fit(X, y, sample_weight=None, init_score=None, group=None, eval_set=None, eval_sample_weight=None, eval_init_score=None, eval_group=None, eval_metric='ndcg', eval_at=1, early_stopping_rounds=None, verbose=True, feature_name=None, categorical_feature=None)
+####fit(X, y, sample_weight=None, init_score=None, group=None, eval_set=None, eval_sample_weight=None, eval_init_score=None, eval_group=None, eval_metric='ndcg', eval_at=1, early_stopping_rounds=None, verbose=True, feature_name=None, categorical_feature=None, callbacks=None)
 Most arguments are same as Common Methods except:
python-package/lightgbm/engine.py
@@ -74,7 +74,8 @@ def train(params, train_set, num_boost_round=100,
         - list l: learning_rate = l[current_round]
         - function f: learning_rate = f(current_round)
     callbacks : list of callback functions
-        List of callback functions that are applied at end of each iteration.
+        List of callback functions that are applied at each iteration.
+        See Callbacks in Python-API.md for more information.
     Returns
     -------
@@ -319,7 +320,8 @@ def cv(params, train_set, num_boost_round=10, nfold=5, stratified=False,
     seed : int
         Seed used to generate the folds (passed to numpy.random.seed).
     callbacks : list of callback functions
-        List of callback functions that are applied at end of each iteration.
+        List of callback functions that are applied at each iteration.
+        See Callbacks in Python-API.md for more information.
     Returns
     -------
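A callback here is any callable invoked once per iteration; at this point in the library's history it received a CallbackEnv namedtuple with fields such as iteration and evaluation_result_list. A hedged sketch of a hand-rolled callback, assuming that protocol:

    import lightgbm as lgb
    import numpy as np

    def log_every_5(env):
        # env is the CallbackEnv passed by train(); iteration is 0-based
        if env.iteration % 5 == 0:
            print('iter', env.iteration, env.evaluation_result_list)

    X, y = np.random.rand(100, 5), np.random.rand(100)
    train_set = lgb.Dataset(X, y)
    lgb.train({'objective': 'regression'}, train_set, num_boost_round=20,
              valid_sets=[train_set], callbacks=[log_every_5])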
python-package/lightgbm/sklearn.py
@@ -35,7 +35,7 @@ def _objective_function_wrapper(func):
     Expects a callable with signature ``func(y_true, y_pred)`` or ``func(y_true, y_pred, group):
         y_true: array_like of shape [n_samples]
             The target values
-        y_pred: array_like of shape [n_samples] or shape[n_samples* n_class] (for multi-class)
+        y_pred: array_like of shape [n_samples] or shape[n_samples * n_class] (for multi-class)
             The predicted values
         group: array_like
             group/query data, used for ranking task
@@ -46,7 +46,7 @@ def _objective_function_wrapper(func):
         The new objective function as expected by ``lightgbm.engine.train``.
         The signature is ``new_func(preds, dataset)``:
-            preds: array_like, shape [n_samples] or shape[n_samples* n_class]
+            preds: array_like, shape [n_samples] or shape[n_samples * n_class]
                 The predicted values
             dataset: ``dataset``
                 The training set from which the labels will be extracted using
@@ -97,7 +97,7 @@ def _eval_function_wrapper(func):
         y_true: array_like of shape [n_samples]
             The target values
-        y_pred: array_like of shape [n_samples] or shape[n_samples* n_class] (for multi-class)
+        y_pred: array_like of shape [n_samples] or shape[n_samples * n_class] (for multi-class)
             The predicted values
         weight: array_like of shape [n_samples]
             The weight of samples
@@ -110,7 +110,7 @@ def _eval_function_wrapper(func):
         The new eval function as expected by ``lightgbm.engine.train``.
         The signature is ``new_func(preds, dataset)``:
-            preds: array_like, shape [n_samples] or shape[n_samples* n_class]
+            preds: array_like, shape [n_samples] or shape[n_samples * n_class]
                 The predicted values
             dataset: ``dataset``
                 The training set from which the labels will be extracted using
@@ -209,13 +209,13 @@ class LGBMModel(LGBMModelBase):
         y_true: array_like of shape [n_samples]
             The target values
-        y_pred: array_like of shape [n_samples] or shape[n_samples* n_class]
+        y_pred: array_like of shape [n_samples] or shape[n_samples * n_class]
             The predicted values
         group: array_like
             group/query data, used for ranking task
-        grad: array_like of shape [n_samples] or shape[n_samples* n_class]
+        grad: array_like of shape [n_samples] or shape[n_samples * n_class]
             The value of the gradient for each sample point.
-        hess: array_like of shape [n_samples] or shape[n_samples* n_class]
+        hess: array_like of shape [n_samples] or shape[n_samples * n_class]
             The value of the second derivative for each sample point
         for multi-class task, the y_pred is group by class_id first, then group by row_id
@@ -276,7 +276,8 @@ class LGBMModel(LGBMModelBase):
             eval_init_score=None, eval_group=None,
             eval_metric=None, early_stopping_rounds=None,
-            verbose=True, feature_name=None, categorical_feature=None):
+            verbose=True, feature_name=None, categorical_feature=None,
+            callbacks=None):
         """
         Fit the gradient boosting model
@@ -312,6 +313,9 @@ class LGBMModel(LGBMModelBase):
             Categorical features,
             type int represents index,
             type str represents feature names (need to specify feature_name as well)
+        callbacks : list of callback functions
+            List of callback functions that are applied at each iteration.
+            See Callbacks in Python-API.md for more information.
         Note
         ----
@@ -398,7 +402,8 @@ class LGBMModel(LGBMModelBase):
                              early_stopping_rounds=early_stopping_rounds,
                              evals_result=evals_result, fobj=self.fobj, feval=feval,
                              verbose_eval=verbose, feature_name=feature_name,
-                             categorical_feature=categorical_feature)
+                             categorical_feature=categorical_feature,
+                             callbacks=callbacks)
         if evals_result:
             for val in evals_result.items():
@@ -525,7 +530,8 @@ class LGBMClassifier(LGBMModel, LGBMClassifierBase):
             eval_init_score=None, eval_metric="binary_logloss",
             early_stopping_rounds=None, verbose=True,
-            feature_name=None, categorical_feature=None):
+            feature_name=None, categorical_feature=None,
+            callbacks=None):
         self._le = LGBMLabelEncoder().fit(y)
         y = self._le.transform(y)
@@ -547,7 +553,8 @@ class LGBMClassifier(LGBMModel, LGBMClassifierBase):
                                      eval_metric=eval_metric,
                                      early_stopping_rounds=early_stopping_rounds,
                                      verbose=verbose, feature_name=feature_name,
-                                     categorical_feature=categorical_feature)
+                                     categorical_feature=categorical_feature,
+                                     callbacks=callbacks)
         return self

     def predict(self, data, raw_score=False, num_iteration=0):
@@ -616,7 +623,8 @@ class LGBMRanker(LGBMModel):
             eval_init_score=None, eval_group=None,
             eval_metric='ndcg', eval_at=1,
             early_stopping_rounds=None, verbose=True,
-            feature_name=None, categorical_feature=None):
+            feature_name=None, categorical_feature=None,
+            callbacks=None):
         """
         Most arguments like common methods except following:
@@ -633,10 +641,9 @@ class LGBMRanker(LGBMModel):
                 raise ValueError("Eval_group cannot be None when eval_set is not None")
             elif len(eval_group) != len(eval_set):
                 raise ValueError("Length of eval_group should equal to eval_set")
-            else:
-                for inner_group in eval_group:
-                    if inner_group is None:
-                        raise ValueError("Should set group for all eval dataset for ranking task")
+            elif (isinstance(eval_group, dict) and any(i not in eval_group or eval_group[i] is None for i in range(len(eval_group)))) \
+                    or (isinstance(eval_group, list) and any(group is None for group in eval_group)):
+                raise ValueError("Should set group for all eval dataset for ranking task; if you use dict, the index should start from 0")
         if eval_at is not None:
             self.eval_at = eval_at
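The rewritten condition accepts eval_group as either a list parallel to eval_set or a dict indexed from 0, and rejects any missing or None group. A hedged sketch that mirrors the check (the helper name groups_ok is hypothetical, not part of the library):

    import numpy as np

    def groups_ok(eval_group):
        # hypothetical helper mirroring (the negation of) the validation above
        if isinstance(eval_group, dict):
            return all(i in eval_group and eval_group[i] is not None
                       for i in range(len(eval_group)))
        return all(g is not None for g in eval_group)

    assert groups_ok([np.array([3, 3]), np.array([2, 4])])        # list form
    assert groups_ok({0: np.array([3, 3]), 1: np.array([2, 4])})  # dict form
    assert not groups_ok({1: np.array([2, 4])})                   # dict must index from 0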
@@ -647,5 +654,6 @@ class LGBMRanker(LGBMModel):
                                      eval_metric=eval_metric,
                                      early_stopping_rounds=early_stopping_rounds,
                                      verbose=verbose, feature_name=feature_name,
-                                     categorical_feature=categorical_feature)
+                                     categorical_feature=categorical_feature,
+                                     callbacks=callbacks)
         return self
tests/python_package_test/test_sklearn.py
@@ -43,7 +43,14 @@ class TestSklearn(unittest.TestCase):
         X_train, y_train = load_svmlight_file('../../examples/lambdarank/rank.train')
         X_test, y_test = load_svmlight_file('../../examples/lambdarank/rank.test')
         q_train = np.loadtxt('../../examples/lambdarank/rank.train.query')
-        lgb_model = lgb.LGBMRanker().fit(X_train, y_train, group=q_train, eval_at=[1])
+        q_test = np.loadtxt('../../examples/lambdarank/rank.test.query')
+        lgb_model = lgb.LGBMRanker().fit(X_train, y_train,
+                                         group=q_train,
+                                         eval_set=[(X_test, y_test)],
+                                         eval_group=[q_test],
+                                         eval_at=[1],
+                                         verbose=False,
+                                         callbacks=[lgb.reset_parameter(learning_rate=lambda x: 0.95 ** x * 0.1)])

     def test_regression_with_custom_objective(self):
         def objective_ls(y_true, y_pred):