Unverified Commit fffd066c authored by Yifei Liu, committed by GitHub

Decouple Boosting Types (fixes #3128) (#4827)



* add parameter data_sample_strategy

* abstract GOSS as a sample strategy (GOSS1), together with the original GOSS (normal Bagging has not been abstracted yet, so do NOT use it for now)

* abstract Bagging as a subclass (BAGGING), but original Bagging members in GBDT are still kept

* fix some variables

* remove GOSS (as boosting type) and Bagging logic in GBDT

* rename GOSS1 to GOSS (as sample strategy)

* add warning about using GOSS as boosting_type

* fix a small `;` bug

* remove CHECK when "gradients != nullptr"

* rename DataSampleStrategy to avoid confusion

* remove and add some comments, following convention

* fix bug in GBDT::ResetConfig (ObjectiveFunction inconsistency bet…

* add std::ignore to avoid compiler warnings (and potential failures)

* update Makevars and vcxproj

* handle constant hessian

move resize of gradient vectors out of sample strategy

* mark override for IsHessianChange

* fix lint errors

* rerun parameter_generator.py

* update config_auto.cpp

* delete redundant blank line

* update num_data_ when train_data_ is updated

set gradients and hessians when GOSS

* check bagging_freq is not zero

* reset config_ value

merge ResetBaggingConfig and ResetGOSS

* remove useless check

* add tests in test_engine.py

* remove whitespace in blank line

* remove arguments verbose_eval and evals_result

* Update tests/python_package_test/test_engine.py

reduce num_boost_round
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update src/boosting/sample_strategy.cpp

modify warning about setting goss as `boosting_type`
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

replace load_boston() with make_regression()

remove value checks of mean_squared_error in test_sample_strategy_with_boosting()

* Update tests/python_package_test/test_engine.py

add value checks of mean_squared_error in test_sample_strategy_with_boosting()

* Modify warning about using goss as boosting type

* Update tests/python_package_test/test_engine.py

add random_state=42 for make_regression()

reduce the threshold of mean_square_error

* Update src/boosting/sample_strategy.cpp
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* remove goss from boosting types in documentation

* Update src/boosting/bagging.hpp
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update src/boosting/bagging.hpp
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update src/boosting/goss.hpp
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update src/boosting/goss.hpp
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* rename GOSS with GOSSStrategy

* update doc

* address comments

* fix table in doc

* Update include/LightGBM/config.h
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* update documentation

* update test case

* revert useless change in test_engine.py

* add tests for evaluation results in test_sample_strategy_with_boosting

* include <string>

* change to assert_allclose in test_goss_boosting_and_strategy_equivalent

* more tolerance in result checking, due to minor difference in results of gpu versions

* change == to np.testing.assert_allclose

* fix test case

* set gpu_use_dp to true

* change --report to --report-level for rstcheck

* use gpu_use_dp=true in test_goss_boosting_and_strategy_equivalent

* revert unexpected changes of non-ascii characters

* revert unexpected changes of non-ascii characters

* remove useless changes

* allocate gradients_pointer_ and hessians_pointer_ when necessary

* add spaces

* remove redundant virtual

* include <LightGBM/utils/log.h> for USE_CUDA

* check for  in test_goss_boosting_and_strategy_equivalent

* check for identity in test_sample_strategy_with_boosting

* remove cuda option in test_sample_strategy_with_boosting

* Update tests/python_package_test/test_engine.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update tests/python_package_test/test_engine.py
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* ResetGradientBuffers after ResetSampleConfig

* ResetGradientBuffers after ResetSampleConfig

* ResetGradientBuffers after bagging

* remove useless code

* check objective_function_ instead of gradients

* enable rf with goss

simplify params in test cases

* remove useless changes

* allow rf with feature subsampling alone

* change position of ResetGradientBuffers

* check for dask

* add parameter types for data_sample_strategy
Co-authored-by: default avatarGuangda Liu <v-guangdaliu@microsoft.com>
Co-authored-by: default avatarYu Shi <shiyu_k1994@qq.com>
Co-authored-by: default avatarGuangdaLiu <90019144+GuangdaLiu@users.noreply.github.com>
Co-authored-by: default avatarJames Lamb <jaylamb20@gmail.com>
Co-authored-by: default avatarNikita Titov <nekit94-08@mail.ru>
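The refactor above moves GOSS (Gradient-based One-Side Sampling) out of the boosting types and behind the new `data_sample_strategy` parameter, so the sampling scheme can be chosen independently of `boosting_type`. As a rough illustration of what the GOSS strategy computes, here is a minimal NumPy sketch (hypothetical helper name and simplified logic, not LightGBM's actual C++ implementation):

```python
import numpy as np

def goss_sample(gradients, top_rate=0.2, other_rate=0.1, rng=None):
    """Sketch of Gradient-based One-Side Sampling (GOSS).

    Keep the top_rate fraction of rows with the largest |gradient|,
    randomly keep other_rate of the remaining rows, and up-weight the
    sampled small-gradient rows so gradient sums stay roughly unbiased.
    """
    rng = np.random.default_rng(rng)
    n = len(gradients)
    n_top = int(n * top_rate)
    n_other = int(n * other_rate)
    order = np.argsort(-np.abs(gradients))   # indices, descending |gradient|
    top_idx = order[:n_top]                  # always kept
    other_idx = rng.choice(order[n_top:], size=n_other, replace=False)
    used = np.concatenate([top_idx, other_idx])
    weights = np.ones(n)
    # Compensation factor that keeps the weighted gradient sums
    # approximately unbiased estimates of the full-data sums.
    weights[other_idx] = (1.0 - top_rate) / other_rate
    return used, weights
```

The key invariant is the `(1 - top_rate) / other_rate` weight on the sampled small-gradient rows; LightGBM's `top_rate` and `other_rate` parameters play the same roles in the real strategy.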
parent a2ae6b95
@@ -114,8 +114,9 @@ def make_ranking(n_samples=100, n_features=20, n_informative=5, gmax=2,
 @lru_cache(maxsize=None)
-def make_synthetic_regression(n_samples=100):
-    return sklearn.datasets.make_regression(n_samples, n_features=4, n_informative=2, random_state=42)
+def make_synthetic_regression(n_samples=100, n_features=4, n_informative=2, random_state=42):
+    return sklearn.datasets.make_regression(n_samples=n_samples, n_features=n_features,
+                                            n_informative=n_informative, random_state=random_state)

 def dummy_obj(preds, train_data):
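The hunk above parametrizes the cached test-data helper. A dependency-free sketch of why the `lru_cache` decorator matters here (NumPy stands in for `sklearn.datasets.make_regression`, which the real helper uses):

```python
from functools import lru_cache

import numpy as np

@lru_cache(maxsize=None)
def make_synthetic_regression(n_samples=100, n_features=4, random_state=42):
    """Generate (and cache) a small random regression problem."""
    # lru_cache keys on the argument tuple, so each distinct parameter
    # combination is generated exactly once per test session; repeated
    # calls with the same arguments return the very same objects.
    rng = np.random.default_rng(random_state)
    X = rng.standard_normal((n_samples, n_features))
    y = X @ rng.standard_normal(n_features)
    return X, y
```

Because all parameters are hashable defaults, tests that call the helper with identical arguments share one dataset instead of regenerating it, which is why the new keyword parameters must flow through to the cached signature rather than being hard-coded inside the body.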
@@ -253,6 +253,7 @@
     <ClInclude Include="..\include\LightGBM\network.h" />
     <ClInclude Include="..\include\LightGBM\objective_function.h" />
     <ClInclude Include="..\include\LightGBM\prediction_early_stop.h" />
+    <ClInclude Include="..\include\LightGBM\sample_strategy.h" />
     <ClInclude Include="..\include\LightGBM\tree.h" />
     <ClInclude Include="..\include\LightGBM\tree_learner.h" />
     <ClInclude Include="..\include\LightGBM\utils\yamc\alternate_shared_mutex.hpp" />
@@ -311,6 +312,7 @@
     <ClCompile Include="..\src\boosting\gbdt_model_text.cpp" />
     <ClCompile Include="..\src\boosting\gbdt_prediction.cpp" />
     <ClCompile Include="..\src\boosting\prediction_early_stop.cpp" />
+    <ClCompile Include="..\src\boosting\sample_strategy.cpp" />
     <ClCompile Include="..\src\c_api.cpp" />
     <ClCompile Include="..\src\io\bin.cpp" />
     <ClCompile Include="..\src\io\config.cpp" />
@@ -129,6 +129,9 @@
     <ClInclude Include="..\include\LightGBM\prediction_early_stop.h">
       <Filter>include\LightGBM</Filter>
     </ClInclude>
+    <ClInclude Include="..\include\LightGBM\sample_strategy.h">
+      <Filter>include\LightGBM</Filter>
+    </ClInclude>
     <ClInclude Include="..\include\LightGBM\tree.h">
       <Filter>include\LightGBM</Filter>
     </ClInclude>
@@ -311,6 +314,9 @@
     <ClCompile Include="..\src\boosting\gbdt_model_text.cpp">
       <Filter>src\boosting</Filter>
     </ClCompile>
+    <ClCompile Include="..\src\boosting\sample_strategy.cpp">
+      <Filter>src\boosting</Filter>
+    </ClCompile>
     <ClCompile Include="..\src\io\file_io.cpp">
       <Filter>src\io</Filter>
     </ClCompile>