Unverified commit 505a145f authored by CharlesAuguste, committed by GitHub

Pr3 monotone constraints splits penalization (#2939)



* Add the monotone penalty parameter to the config.

* Pass tree in the necessary functions so it can be used in ComputeBestSplitForFeature.

* Add monotone penalty.

* Added link to the original report.

* Add tests.

* Fix GPU.

* Revert "Pass tree in the necessary functions so it can be used in ComputeBestSplitForFeature."

This reverts commit 37757e8e8f3a2c82a604f4af9a926da616660d2e.

* Revert "Fix GPU."

This reverts commit e49eeee41c883f3c97fd5cdbd53c9288094bffb6.

* Added a shared pointer to the tree so the constraints can use it too.

* Moved check on monotone penalty to config.cpp.

* Python linting.

* Use AssertTrue instead of assert_.

* Fix penalization in test.

* Make GPU deterministic in tests.

* Rename tree to tree_ in monotone constraints.

* Replaced epsilon by kEpsilon.

* Typo.

* Make tree pointer const.

* Update src/treelearner/monotone_constraints.hpp
Co-Authored-By: Guolin Ke <guolin.ke@outlook.com>

* Update src/treelearner/monotone_constraints.hpp
Co-Authored-By: Guolin Ke <guolin.ke@outlook.com>

* Added alias for the penalty.

* Remove useless comment.

* Save CI time.

* Refactor test_monotone_penalty_max.

* Update include/LightGBM/config.h
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Fix doc to be in line with previous config change commit.
Co-authored-by: Charles Auguste <auguste@dubquantdev801.ire.susq.com>
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
parent 91ce04b6
...@@ -470,6 +470,14 @@ Learning Control Parameters
- ``intermediate``, a `more advanced method <https://github.com/microsoft/LightGBM/files/3457826/PR-monotone-constraints-report.pdf>`__, which may slow the library very slightly. However, this method is much less constraining than the basic method and should significantly improve the results
- ``monotone_penalty`` :raw-html:`<a id="monotone_penalty" title="Permalink to this parameter" href="#monotone_penalty">&#x1F517;&#xFE0E;</a>`, default = ``0.0``, type = double, aliases: ``monotone_splits_penalty``, ``ms_penalty``, ``mc_penalty``, constraints: ``monotone_penalty >= 0.0``
- used only if ``monotone_constraints`` is set
- `monotone penalty <https://github.com/microsoft/LightGBM/files/3457826/PR-monotone-constraints-report.pdf>`__: a penalization parameter X forbids any monotone splits on the first X (rounded down) level(s) of the tree. The penalty applied to monotone splits at a given depth is a continuous, increasing function of the penalization parameter
- if ``0.0`` (the default), no penalization is applied
- ``feature_contri`` :raw-html:`<a id="feature_contri" title="Permalink to this parameter" href="#feature_contri">&#x1F517;&#xFE0E;</a>`, default = ``None``, type = multi-double, aliases: ``feature_contrib``, ``fc``, ``fp``, ``feature_penalty``
- used to control feature's split gain, will use ``gain[i] = max(0, feature_contri[i]) * gain[i]`` to replace the split gain of i-th feature
......
...@@ -447,6 +447,13 @@ struct Config {
// descl2 = ``intermediate``, a `more advanced method <https://github.com/microsoft/LightGBM/files/3457826/PR-monotone-constraints-report.pdf>`__, which may slow the library very slightly. However, this method is much less constraining than the basic method and should significantly improve the results
std::string monotone_constraints_method = "basic";
// alias = monotone_splits_penalty, ms_penalty, mc_penalty
// check = >=0.0
// desc = used only if ``monotone_constraints`` is set
// desc = `monotone penalty <https://github.com/microsoft/LightGBM/files/3457826/PR-monotone-constraints-report.pdf>`__: a penalization parameter X forbids any monotone splits on the first X (rounded down) level(s) of the tree. The penalty applied to monotone splits at a given depth is a continuous, increasing function of the penalization parameter
// desc = if ``0.0`` (the default), no penalization is applied
double monotone_penalty = 0.0;
// type = multi-double
// alias = feature_contrib, fc, fp, feature_penalty
// default = None
......
...@@ -328,6 +328,9 @@ void Config::CheckParamConflict() {
Log::Warning("Cannot use \"intermediate\" monotone constraints with feature fraction different from 1, auto set monotone constraints to \"basic\" method.");
monotone_constraints_method = "basic";
}
if (max_depth > 0 && monotone_penalty >= max_depth) {
Log::Warning("Monotone penalty greater than tree depth. Monotone features won't be used.");
}
}

std::string Config::ToString() const {
......
...@@ -87,6 +87,9 @@ const std::unordered_map<std::string, std::string>& Config::alias_table() {
{"monotone_constraint", "monotone_constraints"},
{"monotone_constraining_method", "monotone_constraints_method"},
{"mc_method", "monotone_constraints_method"},
{"monotone_splits_penalty", "monotone_penalty"},
{"ms_penalty", "monotone_penalty"},
{"mc_penalty", "monotone_penalty"},
{"feature_contrib", "feature_contri"},
{"fc", "feature_contri"},
{"fp", "feature_contri"},
...@@ -218,6 +221,7 @@ const std::unordered_set<std::string>& Config::parameter_set() {
"top_k",
"monotone_constraints",
"monotone_constraints_method",
"monotone_penalty",
"feature_contri",
"forcedsplits_filename",
"refit_decay_rate",
...@@ -419,6 +423,9 @@ void Config::GetMembersFromString(const std::unordered_map<std::string, std::str
GetString(params, "monotone_constraints_method", &monotone_constraints_method);
GetDouble(params, "monotone_penalty", &monotone_penalty);
CHECK_GE(monotone_penalty, 0.0);
if (GetString(params, "feature_contri", &tmp_str)) {
feature_contri = Common::StringToArray<double>(tmp_str, ',');
}
...@@ -639,6 +646,7 @@ std::string Config::SaveMembersToString() const {
str_buf << "[top_k: " << top_k << "]\n";
str_buf << "[monotone_constraints: " << Common::Join(Common::ArrayCast<int8_t, int>(monotone_constraints), ",") << "]\n";
str_buf << "[monotone_constraints_method: " << monotone_constraints_method << "]\n";
str_buf << "[monotone_penalty: " << monotone_penalty << "]\n";
str_buf << "[feature_contri: " << Common::Join(feature_contri, ",") << "]\n";
str_buf << "[forcedsplits_filename: " << forcedsplits_filename << "]\n";
str_buf << "[refit_decay_rate: " << refit_decay_rate << "]\n";
......
...@@ -62,6 +62,24 @@ class LeafConstraintsBase {
const std::vector<SplitInfo>& best_split_per_leaf) = 0;
inline static LeafConstraintsBase* Create(const Config* config, int num_leaves);

double ComputeMonotoneSplitGainPenalty(int leaf_index, double penalization) {
  int depth = tree_->leaf_depth(leaf_index);
  if (penalization >= depth + 1.) {
    return kEpsilon;
  }
  if (penalization <= 1.) {
    return 1. - penalization / pow(2., depth) + kEpsilon;
  }
  return 1. - pow(2, penalization - 1. - depth) + kEpsilon;
}

void ShareTreePointer(const Tree* tree) {
  tree_ = tree;
}

private:
const Tree* tree_;
};

class BasicLeafConstraints : public LeafConstraintsBase {
......
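For reference, the gain multiplier computed by `ComputeMonotoneSplitGainPenalty` above can be sketched as a standalone Python function (a re-implementation for illustration only; `K_EPSILON` stands in for LightGBM's `kEpsilon` constant, whose exact value is an assumption here):

```python
K_EPSILON = 1e-15  # stand-in for LightGBM's kEpsilon; exact value assumed


def monotone_split_gain_penalty(depth, penalization):
    """Multiplier applied to the gain of a monotone split at a leaf of the given depth."""
    if penalization >= depth + 1.0:
        # Split sits on one of the first floor(penalization) levels: gain is wiped out.
        return K_EPSILON
    if penalization <= 1.0:
        return 1.0 - penalization / 2.0 ** depth + K_EPSILON
    return 1.0 - 2.0 ** (penalization - 1.0 - depth) + K_EPSILON


# With penalization = 2.0, depths 0 and 1 are forbidden; deeper splits are damped.
print([round(monotone_split_gain_penalty(d, 2.0), 6) for d in range(5)])
# → [0.0, 0.0, 0.5, 0.75, 0.875]
```

This matches the documented behavior: the multiplier is (near) zero on the first floor(X) levels, then increases continuously toward 1 with depth.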
...@@ -165,6 +165,8 @@ Tree* SerialTreeLearner::Train(const score_t* gradients, const score_t *hessians
auto tree = std::unique_ptr<Tree>(new Tree(config_->num_leaves));
auto tree_prt = tree.get();
constraints_->ShareTreePointer(tree_prt);
// root leaf
int left_leaf = 0;
int cur_depth = 1;
...@@ -692,6 +694,11 @@ void SerialTreeLearner::ComputeBestSplitForFeature(
cegb_->DetlaGain(feature_index, real_fidx, leaf_splits->leaf_index(),
num_data, new_split);
}
if (new_split.monotone_type != 0) {
  double penalty = constraints_->ComputeMonotoneSplitGainPenalty(
      leaf_splits->leaf_index(), config_->monotone_penalty);
  new_split.gain *= penalty;
}
if (new_split > *best_split) {
  *best_split = new_split;
}
......
...@@ -1036,7 +1036,7 @@ class TestEngine(unittest.TestCase):
        categorical_features = []
        if x3_to_category:
            categorical_features = [2]
        trainset = lgb.Dataset(x, label=y, categorical_feature=categorical_features, free_raw_data=False)
        return trainset

    def test_monotone_constraints(self):
...@@ -1071,8 +1071,8 @@ class TestEngine(unittest.TestCase):
            return True

        for test_with_categorical_variable in [True, False]:
            trainset = self.generate_trainset_for_monotone_constraints_tests(test_with_categorical_variable)
            for monotone_constraints_method in ["basic", "intermediate"]:
                trainset = self.generate_trainset_for_monotone_constraints_tests(test_with_categorical_variable)
                params = {
                    'min_data': 20,
                    'num_leaves': 20,
...@@ -1083,6 +1083,76 @@ class TestEngine(unittest.TestCase):
                constrained_model = lgb.train(params, trainset)
                self.assertTrue(is_correctly_constrained(constrained_model, test_with_categorical_variable))
    def test_monotone_penalty(self):
        def are_first_splits_non_monotone(tree, n, monotone_constraints):
            if n <= 0:
                return True
            if "leaf_value" in tree:
                return True
            if monotone_constraints[tree["split_feature"]] != 0:
                return False
            return (are_first_splits_non_monotone(tree["left_child"], n - 1, monotone_constraints)
                    and are_first_splits_non_monotone(tree["right_child"], n - 1, monotone_constraints))

        def are_there_monotone_splits(tree, monotone_constraints):
            if "leaf_value" in tree:
                return False
            if monotone_constraints[tree["split_feature"]] != 0:
                return True
            return (are_there_monotone_splits(tree["left_child"], monotone_constraints)
                    or are_there_monotone_splits(tree["right_child"], monotone_constraints))

        max_depth = 5
        monotone_constraints = [1, -1, 0]
        penalization_parameter = 2.0
        trainset = self.generate_trainset_for_monotone_constraints_tests(x3_to_category=False)
        for monotone_constraints_method in ["basic", "intermediate"]:
            params = {
                'max_depth': max_depth,
                'monotone_constraints': monotone_constraints,
                'monotone_penalty': penalization_parameter,
                "monotone_constraints_method": monotone_constraints_method,
            }
            constrained_model = lgb.train(params, trainset, 10)
            dumped_model = constrained_model.dump_model()["tree_info"]
            for tree in dumped_model:
                self.assertTrue(are_first_splits_non_monotone(tree["tree_structure"], int(penalization_parameter),
                                                              monotone_constraints))
                self.assertTrue(are_there_monotone_splits(tree["tree_structure"], monotone_constraints))

    # test if a penalty as high as the depth indeed prohibits all monotone splits
    def test_monotone_penalty_max(self):
        max_depth = 5
        monotone_constraints = [1, -1, 0]
        penalization_parameter = max_depth
        trainset_constrained_model = self.generate_trainset_for_monotone_constraints_tests(x3_to_category=False)
        x = trainset_constrained_model.data
        y = trainset_constrained_model.label
        x3_negatively_correlated_with_y = x[:, 2]
        trainset_unconstrained_model = lgb.Dataset(x3_negatively_correlated_with_y.reshape(-1, 1), label=y)
        params_constrained_model = {
            'monotone_constraints': monotone_constraints,
            'monotone_penalty': penalization_parameter,
            "max_depth": max_depth,
            "gpu_use_dp": True,
        }
        params_unconstrained_model = {
            "max_depth": max_depth,
            "gpu_use_dp": True,
        }
        unconstrained_model = lgb.train(params_unconstrained_model, trainset_unconstrained_model, 10)
        unconstrained_model_predictions = unconstrained_model.\
            predict(x3_negatively_correlated_with_y.reshape(-1, 1))
        for monotone_constraints_method in ["basic", "intermediate"]:
            params_constrained_model["monotone_constraints_method"] = monotone_constraints_method
            # The penalization is so high that the first 2 features should not be used here
            constrained_model = lgb.train(params_constrained_model, trainset_constrained_model, 10)
            # Check that a very high penalization is the same as not using the features at all
            np.testing.assert_array_equal(constrained_model.predict(x), unconstrained_model_predictions)
    def test_max_bin_by_feature(self):
        col1 = np.arange(0, 100)[:, np.newaxis]
        col2 = np.zeros((100, 1))
......