tianlh / LightGBM-DCU

Unverified commit 69c1c330, authored Dec 05, 2019 by Nikita Titov, committed by GitHub on Dec 05, 2019

[python][R-package] warn users about untransformed values in case of custom obj (#2611)

Parent: 61292080
Showing 6 changed files with 42 additions and 9 deletions (+42 −9).
R-package/demo/cross_validation.R                  +8  −1
R-package/demo/early_stopping.R                    +4  −4
R-package/tests/testthat/test_custom_objective.R   +6  −1
examples/python-guide/advanced_example.py          +10 −0
python-package/lightgbm/sklearn.py                 +9  −2
tests/python_package_test/test_sklearn.py          +5  −1
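Taken together, the change set documents one pitfall: with a custom objective, LightGBM returns raw margin scores rather than transformed probabilities, so downstream evaluation must apply the link function itself. A minimal illustrative sketch of the pitfall (the values are hypothetical, not from the commit):

```python
import numpy as np

# Illustrative raw margins, e.g. what predict() returns under a custom objective.
raw_margin = np.array([-2.3, 0.4, 1.7])

# Pitfall: thresholding raw margins at 0.5 treats them as probabilities.
wrong = raw_margin > 0.5                    # [False, False, True]

# Correct: apply the sigmoid first, then threshold at 0.5.
prob = 1.0 / (1.0 + np.exp(-raw_margin))
right = prob > 0.5                          # [False, True, True]; same as raw_margin > 0.0
```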
R-package/demo/cross_validation.R

```diff
@@ -47,9 +47,16 @@ logregobj <- function(preds, dtrain) {
   hess <- preds * (1.0 - preds)
   return(list(grad = grad, hess = hess))
 }
 
+# User-defined evaluation function returns a pair (metric_name, result, higher_better)
+# NOTE: when you do customized loss function, the default prediction value is margin
+# This may make built-in evalution metric calculate wrong results
+# For example, we are doing logistic loss, the prediction is score before logistic transformation
+# Keep this in mind when you use the customization, and maybe you need write customized evaluation function
 evalerror <- function(preds, dtrain) {
   labels <- getinfo(dtrain, "label")
-  err <- as.numeric(sum(labels != (preds > 0.0))) / length(labels)
+  preds <- 1.0 / (1.0 + exp(-preds))
+  err <- as.numeric(sum(labels != (preds > 0.5))) / length(labels)
   return(list(name = "error", value = err, higher_better = FALSE))
 }
```
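Note that the old and new error computations in this demo are mathematically equivalent, since the sigmoid is monotonic and maps a margin of 0.0 to a probability of 0.5; the rewrite makes the margin-to-probability step explicit instead of relying on that identity. A quick numpy check with illustrative values:

```python
import numpy as np

preds = np.array([-1.5, -0.2, 0.0, 0.3, 2.1])  # illustrative raw margins
labels = np.array([0, 0, 1, 1, 1])

old_err = labels != (preds > 0.0)                          # removed line: threshold the margin
new_err = labels != (1.0 / (1.0 + np.exp(-preds)) > 0.5)   # added lines: sigmoid, then 0.5
assert np.array_equal(old_err, new_err)                    # identical because sigmoid(0.0) == 0.5
```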
R-package/demo/early_stopping.R

```diff
@@ -28,12 +28,12 @@ logregobj <- function(preds, dtrain) {
   return(list(grad = grad, hess = hess))
 }
 
-# User defined evaluation function, return a pair metric_name, result, higher_better
+# User-defined evaluation function returns a pair (metric_name, result, higher_better)
 # NOTE: when you do customized loss function, the default prediction value is margin
-# This may make buildin evalution metric not function properly
+# This may make built-in evalution metric calculate wrong results
 # For example, we are doing logistic loss, the prediction is score before logistic transformation
-# The buildin evaluation error assumes input is after logistic transformation
-# Take this in mind when you use the customization, and maybe you need write customized evaluation function
+# The built-in evaluation error assumes input is after logistic transformation
+# Keep this in mind when you use the customization, and maybe you need write customized evaluation function
 evalerror <- function(preds, dtrain) {
   labels <- getinfo(dtrain, "label")
   err <- as.numeric(sum(labels != (preds > 0.5))) / length(labels)
```
R-package/tests/testthat/test_custom_objective.R

```diff
@@ -14,9 +14,14 @@ logregobj <- function(preds, dtrain) {
   return(list(grad = grad, hess = hess))
 }
+# User-defined evaluation function returns a pair (metric_name, result, higher_better)
+# NOTE: when you do customized loss function, the default prediction value is margin
+# This may make built-in evalution metric calculate wrong results
+# Keep this in mind when you use the customization, and maybe you need write customized evaluation function
 evalerror <- function(preds, dtrain) {
   labels <- getinfo(dtrain, "label")
-  err <- as.numeric(sum(labels != (preds > 0.0))) / length(labels)
+  preds <- 1.0 / (1.0 + exp(-preds))
+  err <- as.numeric(sum(labels != (preds > 0.5))) / length(labels)
   return(list(
     name = "error"
     , value = err
```
examples/python-guide/advanced_example.py

```diff
@@ -147,8 +147,13 @@ def loglikelihood(preds, train_data):
 # self-defined eval metric
 # f(preds: array, train_data: Dataset) -> name: string, eval_result: float, is_higher_better: bool
 # binary error
+# NOTE: when you do customized loss function, the default prediction value is margin
+# This may make built-in evalution metric calculate wrong results
+# For example, we are doing log likelihood loss, the prediction is score before logistic transformation
+# Keep this in mind when you use the customization
 def binary_error(preds, train_data):
     labels = train_data.get_label()
+    preds = 1. / (1. + np.exp(-preds))
     return 'error', np.mean(labels != (preds > 0.5)), False
@@ -166,8 +171,13 @@ print('Finished 40 - 50 rounds with self-defined objective function and eval met
 # another self-defined eval metric
 # f(preds: array, train_data: Dataset) -> name: string, eval_result: float, is_higher_better: bool
 # accuracy
+# NOTE: when you do customized loss function, the default prediction value is margin
+# This may make built-in evalution metric calculate wrong results
+# For example, we are doing log likelihood loss, the prediction is score before logistic transformation
+# Keep this in mind when you use the customization
 def accuracy(preds, train_data):
     labels = train_data.get_label()
+    preds = 1. / (1. + np.exp(-preds))
     return 'accuracy', np.mean(labels == (preds > 0.5)), True
```
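For context, these functions are plugged into training through lgb.train's fobj/feval parameters, as elsewhere in advanced_example.py. A minimal self-contained sketch of the wiring, with synthetic data standing in for the files the real script loads:

```python
import numpy as np
import lightgbm as lgb

# Synthetic stand-in data; the real example script loads datasets from disk.
rng = np.random.RandomState(42)
X = rng.rand(500, 10)
y = (X[:, 0] > 0.5).astype(int)
lgb_train = lgb.Dataset(X, y)

def loglikelihood(preds, train_data):
    # Custom objective: gradient and hessian of log loss w.r.t. the raw margin.
    labels = train_data.get_label()
    preds = 1. / (1. + np.exp(-preds))
    grad = preds - labels
    hess = preds * (1. - preds)
    return grad, hess

def binary_error(preds, train_data):
    # preds arrive as raw margins because of the custom objective above.
    labels = train_data.get_label()
    preds = 1. / (1. + np.exp(-preds))
    return 'error', np.mean(labels != (preds > 0.5)), False

gbm = lgb.train({'verbose': -1}, lgb_train, num_boost_round=10,
                fobj=loglikelihood, feval=binary_error, valid_sets=[lgb_train])
```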
python-package/lightgbm/sklearn.py

```diff
@@ -2,6 +2,8 @@
 """Scikit-learn wrapper interface for LightGBM."""
 from __future__ import absolute_import
 
+import warnings
+
 import numpy as np
 
 from .basic import Dataset, LightGBMError, _ConfigAliases
@@ -812,7 +814,7 @@ class LGBMClassifier(LGBMModel, _LGBMClassifierBase):
         """Docstring is inherited from the LGBMModel."""
         result = self.predict_proba(X, raw_score, num_iteration,
                                     pred_leaf, pred_contrib, **kwargs)
-        if raw_score or pred_leaf or pred_contrib:
+        if callable(self._objective) or raw_score or pred_leaf or pred_contrib:
             return result
         else:
             class_index = np.argmax(result, axis=1)
@@ -861,7 +863,12 @@ class LGBMClassifier(LGBMModel, _LGBMClassifierBase):
         """
         result = super(LGBMClassifier, self).predict(X, raw_score, num_iteration,
                                                      pred_leaf, pred_contrib, **kwargs)
-        if self._n_classes > 2 or raw_score or pred_leaf or pred_contrib:
+        if callable(self._objective) and not (raw_score or pred_leaf or pred_contrib):
+            warnings.warn("Cannot compute class probabilities or labels "
+                          "due to the usage of customized objective function.\n"
+                          "Returning raw scores instead.")
+            return result
+        elif self._n_classes > 2 or raw_score or pred_leaf or pred_contrib:
             return result
         else:
             return np.vstack((1. - result, result)).transpose()
```
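The net effect on the scikit-learn wrapper is that predict() and predict_proba() on an LGBMClassifier fitted with a callable objective now warn and return raw scores instead of silently presenting margins as probabilities. A hedged sketch of what a caller would observe (the data and the logregobj helper are illustrative, mirroring the test below):

```python
import warnings
import numpy as np
import lightgbm as lgb

def logregobj(y_true, y_pred):
    # Custom sklearn-API objective: (labels, raw margins) -> (grad, hess).
    p = 1.0 / (1.0 + np.exp(-y_pred))
    return p - y_true, p * (1.0 - p)

X = np.random.rand(200, 5)
y = (X[:, 0] > 0.5).astype(int)
clf = lgb.LGBMClassifier(n_estimators=20, objective=logregobj).fit(X, y)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    scores = clf.predict_proba(X)  # raw margins after this change, not probabilities

assert any("Returning raw scores" in str(w.message) for w in caught)
proba = 1.0 / (1.0 + np.exp(-scores))  # the caller applies the sigmoid explicitly
```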
tests/python_package_test/test_sklearn.py

```diff
@@ -131,7 +131,11 @@ class TestSklearn(unittest.TestCase):
         X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)
         gbm = lgb.LGBMClassifier(n_estimators=50, silent=True, objective=logregobj)
         gbm.fit(X_train, y_train, eval_set=[(X_test, y_test)], early_stopping_rounds=5, verbose=False)
-        ret = binary_error(y_test, gbm.predict(X_test))
+        # prediction result is actually not transformed (is raw) due to custom objective
+        y_pred_raw = gbm.predict_proba(X_test)
+        self.assertFalse(np.all(y_pred_raw >= 0))
+        y_pred = 1.0 / (1.0 + np.exp(-y_pred_raw))
+        ret = binary_error(y_test, y_pred)
         self.assertLess(ret, 0.05)
 
     def test_dart(self):
```