Unverified Commit fcfd4132 authored by Belinda Trotta's avatar Belinda Trotta Committed by GitHub
Browse files

Trees with linear models at leaves (#3299)

* Add Eigen library.

* Working for simple test.

* Apply changes to config params.

* Handle nan data.

* Update docs.

* Add test.

* Only load raw data if boosting=gbdt_linear

* Remove unneeded code.

* Minor updates.

* Update to work with sk-learn interface.

* Update to work with chunked datasets.

* Throw error if we try to create a Booster with an already-constructed dataset having incompatible parameters.

* Save raw data in binary dataset file.

* Update docs and fix parameter checking.

* Fix dataset loading.

* Add test for regularization.

* Fix bugs when saving and loading tree.

* Add test for load/save linear model.

* Remove unneeded code.

* Fix case where not enough leaf data for linear model.

* Simplify code.

* Speed up code.

* Speed up code.

* Simplify code.

* Speed up code.

* Fix bugs.

* Working version.

* Store feature data column-wise (not fully working yet).

* Fix bugs.

* Speed up.

* Speed up.

* Remove unneeded code.

* Small speedup.

* Speed up.

* Minor updates.

* Remove unneeded code.

* Fix bug.

* Fix bug.

* Speed up.

* Speed up.

* Simplify code.

* Remove unneeded code.

* Fix bug, add more tests.

* Fix bug and add test.

* Only store numerical features

* Fix bug and speed up using templates.

* Speed up prediction.

* Fix bug with regularisation

* Visual studio files.

* Working version

* Only check nans if necessary

* Store coeff matrix as an array.

* Align cache lines

* Align cache lines

* Preallocation coefficient calculation matrices

* Small speedups

* Small speedup

* Reverse cache alignment changes

* Change to dynamic schedule

* Update docs.

* Refactor so that linear tree learner is not a separate class.

* Add refit capability.

* Speed up

* Small speedups.

* Speed up add prediction to score.

* Fix bug

* Fix bug and speed up.

* Speed up dataload.

* Speed up dataload

* Use vectors instead of pointers

* Fix bug

* Add OMP exception handling.

* Change return type of LGBM_BoosterGetLinear to bool

* Change return type of LGBM_BoosterGetLinear back to int, only parameter type needed to change

* Remove unused internal_parent_ property of tree

* Remove unused parameter to CreateTreeLearner

* Remove reference to LinearTreeLearner

* Minor style issues

* Remove unneeded check

* Reverse temporary testing change

* Fix Visual Studio project files

* Restore LightGBM.vcxproj.filters

* Speed up

* Speed up

* Simplify code

* Update docs

* Simplify code

* Initialise storage space for max num threads

* Move Eigen to include directory and delete unused files

* Remove old files.

* Fix so it compiles with mingw

* Fix gpu tree learner

* Change AddPredictionToScore back to const

* Fix python lint error

* Fix C++ lint errors

* Change eigen to a submodule

* Update comment

* Add the eigen folder

* Try to fix build issues with eigen

* Remove eigen files

* Add eigen as submodule

* Fix include paths

* Exclude eigen files from Python linter

* Ignore eigen folders for pydocstyle

* Fix C++ linting errors

* Fix docs

* Fix docs

* Exclude eigen directories from doxygen

* Update manifest to include eigen

* Update build_r to include eigen files

* Fix compiler warnings

* Store raw feature data as float

* Use float for calculating linear coefficients

* Remove eigen directory from GLOB

* Don't compile linear model code when building R package

* Fix doxygen issue

* Fix lint issue

* Fix lint issue

* Remove uneeded code

* Restore delected lines

* Restore delected lines

* Change return type of has_raw to bool

* Update docs

* Rename some variables and functions for readability

* Make tree_learner parameter const in AddScore

* Fix style issues

* Pass vectors as const reference when setting tree properties

* Make temporary storage of serial_tree_learner mutable so we can make the object's methods const

* Remove get_raw_size, use num_numeric_features instead

* Fix typo

* Make contains_nan_ and any_nan_ properties immutable again

* Remove data_has_nan_ property of tree

* Remove temporary test code

* Make linear_tree a dataset param

* Fix lint error

* Make LinearTreeLearner a separate class

* Fix lint errors

* Fix lint error

* Add linear_tree_learner.o

* Simulate omp_get_max_threads if openmp is not available

* Update PushOneData to also store raw data.

* Cast size to int

* Fix bug in ReshapeRaw

* Speed up code with multithreading

* Use OMP_NUM_THREADS

* Speed up with multithreading

* Update to use ArrayToString

* Fix tests

* Fix test

* Fix bug introduced in merge

* Minor updates

* Update docs
parent d90a16d5
......@@ -112,6 +112,7 @@
<Optimization>Disabled</Optimization>
<RuntimeLibrary>MultiThreadedDebugDLL</RuntimeLibrary>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<AdditionalIncludeDirectories>$(ProjectDir)\..\eigen;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
</ClCompile>
<Link>
<AdditionalLibraryDirectories>
......@@ -134,6 +135,7 @@
<Optimization>Disabled</Optimization>
<RuntimeLibrary>MultiThreadedDebugDLL</RuntimeLibrary>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<AdditionalIncludeDirectories>$(ProjectDir)\..\eigen;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
</ClCompile>
<Link>
<AdditionalDependencies>
......@@ -153,6 +155,7 @@
<Optimization>Disabled</Optimization>
<RuntimeLibrary>MultiThreadedDebugDLL</RuntimeLibrary>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<AdditionalIncludeDirectories>$(ProjectDir)\..\eigen;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
</ClCompile>
<Link>
<AdditionalDependencies>
......@@ -175,6 +178,7 @@
<OmitFramePointers>true</OmitFramePointers>
<RuntimeLibrary>MultiThreadedDLL</RuntimeLibrary>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<AdditionalIncludeDirectories>$(ProjectDir)\..\eigen;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
</ClCompile>
<Link>
<AdditionalLibraryDirectories>
......@@ -201,6 +205,7 @@
<OmitFramePointers>true</OmitFramePointers>
<FunctionLevelLinking>true</FunctionLevelLinking>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<AdditionalIncludeDirectories>$(ProjectDir)\..\eigen;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
</ClCompile>
<Link>
<AdditionalDependencies />
......@@ -221,6 +226,7 @@
<OmitFramePointers>true</OmitFramePointers>
<FunctionLevelLinking>true</FunctionLevelLinking>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<AdditionalIncludeDirectories>$(ProjectDir)\..\eigen;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
</ClCompile>
<Link>
<AdditionalDependencies>
......@@ -287,6 +293,7 @@
<ClInclude Include="..\src\treelearner\data_partition.hpp" />
<ClInclude Include="..\src\treelearner\feature_histogram.hpp" />
<ClInclude Include="..\src\treelearner\leaf_splits.hpp" />
<ClInclude Include="..\src\treelearner\linear_tree_learner.h" />
<ClInclude Include="..\src\treelearner\monotone_constraints.hpp" />
<ClInclude Include="..\src\treelearner\parallel_tree_learner.h" />
<ClInclude Include="..\src\treelearner\serial_tree_learner.h" />
......@@ -321,6 +328,7 @@
<ClCompile Include="..\src\main.cpp" />
<ClCompile Include="..\src\treelearner\data_parallel_tree_learner.cpp" />
<ClCompile Include="..\src\treelearner\feature_parallel_tree_learner.cpp" />
<ClCompile Include="..\src\treelearner\linear_tree_learner.cpp" />
<ClCompile Include="..\src\treelearner\serial_tree_learner.cpp" />
<ClCompile Include="..\src\treelearner\tree_learner.cpp" />
<ClCompile Include="..\src\treelearner\voting_parallel_tree_learner.cpp" />
......@@ -328,4 +336,4 @@
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
<ImportGroup Label="ExtensionTargets">
</ImportGroup>
</Project>
</Project>
\ No newline at end of file
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment