Commits · 6cb968af2eecd79f9a1b78b2f5db3b5acf75d515 · tianlh / LightGBM-DCU

29 Dec, 2020 1 commit

[python-package] remove unused Eigen files, compile with EIGEN_MPL2_ONLY (fixes #3684) (#3685) · 6cb968af

James Lamb authored Dec 29, 2020



* [python-package] remove unused Eigen files (fixes #3684)

* more changes

* add EIGEN_MPL2_ONLY in VS solution file

* fix VS project

* remove EIGEN_MPL2_ONLY define in linear_tree_learner
Co-authored-by: Nikita Titov <nekit94-12@hotmail.com>

6cb968af

24 Dec, 2020 1 commit

Trees with linear models at leaves (#3299) · fcfd4132

Belinda Trotta authored Dec 24, 2020

* Add Eigen library.

* Working for simple test.

* Apply changes to config params.

* Handle nan data.

* Update docs.

* Add test.

* Only load raw data if boosting=gbdt_linear

* Remove unneeded code.

* Minor updates.

* Update to work with sk-learn interface.

* Update to work with chunked datasets.

* Throw error if we try to create a Booster with an already-constructed dataset having incompatible parameters.

* Save raw data in binary dataset file.

* Update docs and fix parameter checking.

* Fix dataset loading.

* Add test for regularization.

* Fix bugs when saving and loading tree.

* Add test for load/save linear model.

* Remove unneeded code.

* Fix case where not enough leaf data for linear model.

* Simplify code.

* Speed up code.

* Speed up code.

* Simplify code.

* Speed up code.

* Fix bugs.

* Working version.

* Store feature data column-wise (not fully working yet).

* Fix bugs.

* Speed up.

* Speed up.

* Remove unneeded code.

* Small speedup.

* Speed up.

* Minor updates.

* Remove unneeded code.

* Fix bug.

* Fix bug.

* Speed up.

* Speed up.

* Simplify code.

* Remove unneeded code.

* Fix bug, add more tests.

* Fix bug and add test.

* Only store numerical features

* Fix bug and speed up using templates.

* Speed up prediction.

* Fix bug with regularisation

* Visual studio files.

* Working version

* Only check nans if necessary

* Store coeff matrix as an array.

* Align cache lines

* Align cache lines

* Preallocate coefficient calculation matrices

* Small speedups

* Small speedup

* Reverse cache alignment changes

* Change to dynamic schedule

* Update docs.

* Refactor so that linear tree learner is not a separate class.

* Add refit capability.

* Speed up

* Small speedups.

* Speed up add prediction to score.

* Fix bug

* Fix bug and speed up.

* Speed up dataload.

* Speed up dataload

* Use vectors instead of pointers

* Fix bug

* Add OMP exception handling.

* Change return type of LGBM_BoosterGetLinear to bool

* Change return type of LGBM_BoosterGetLinear back to int, only parameter type needed to change

* Remove unused internal_parent_ property of tree

* Remove unused parameter to CreateTreeLearner

* Remove reference to LinearTreeLearner

* Minor style issues

* Remove unneeded check

* Reverse temporary testing change

* Fix Visual Studio project files

* Restore LightGBM.vcxproj.filters

* Speed up

* Speed up

* Simplify code

* Update docs

* Simplify code

* Initialise storage space for max num threads

* Move Eigen to include directory and delete unused files

* Remove old files.

* Fix so it compiles with mingw

* Fix gpu tree learner

* Change AddPredictionToScore back to const

* Fix python lint error

* Fix C++ lint errors

* Change eigen to a submodule

* Update comment

* Add the eigen folder

* Try to fix build issues with eigen

* Remove eigen files

* Add eigen as submodule

* Fix include paths

* Exclude eigen files from Python linter

* Ignore eigen folders for pydocstyle

* Fix C++ linting errors

* Fix docs

* Fix docs

* Exclude eigen directories from doxygen

* Update manifest to include eigen

* Update build_r to include eigen files

* Fix compiler warnings

* Store raw feature data as float

* Use float for calculating linear coefficients

* Remove eigen directory from GLOB

* Don't compile linear model code when building R package

* Fix doxygen issue

* Fix lint issue

* Fix lint issue

* Remove unneeded code

* Restore deleted lines

* Restore deleted lines

* Change return type of has_raw to bool

* Update docs

* Rename some variables and functions for readability

* Make tree_learner parameter const in AddScore

* Fix style issues

* Pass vectors as const reference when setting tree properties

* Make temporary storage of serial_tree_learner mutable so we can make the object's methods const

* Remove get_raw_size, use num_numeric_features instead

* Fix typo

* Make contains_nan_ and any_nan_ properties immutable again

* Remove data_has_nan_ property of tree

* Remove temporary test code

* Make linear_tree a dataset param

* Fix lint error

* Make LinearTreeLearner a separate class

* Fix lint errors

* Fix lint error

* Add linear_tree_learner.o

* Simulate omp_get_max_threads if openmp is not available

* Update PushOneData to also store raw data.

* Cast size to int

* Fix bug in ReshapeRaw

* Speed up code with multithreading

* Use OMP_NUM_THREADS

* Speed up with multithreading

* Update to use ArrayToString

* Fix tests

* Fix test

* Fix bug introduced in merge

* Minor updates

* Update docs
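
The bullets above describe a tree whose leaves hold a linear model instead of a single constant. A minimal sketch of what prediction for one such leaf could look like is given below. The `LinearLeaf` struct and `PredictLeaf` function are hypothetical names for illustration, not taken from the commit, and falling back to the constant leaf value when a feature is NaN is an assumption based on the "Handle nan data" bullet.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for one leaf of a tree with linear models at leaves.
struct LinearLeaf {
  double constant_output;            // plain leaf value, used as a fallback
  double intercept;                  // constant term of the linear model
  std::vector<int> feature_indices;  // features used by this leaf's model
  std::vector<double> coefficients;  // one coefficient per listed feature
};

// Returns intercept + sum(coefficient * feature value) for the leaf.
// Assumption: if any required feature is NaN, use the constant output instead.
double PredictLeaf(const LinearLeaf& leaf, const std::vector<double>& row) {
  double output = leaf.intercept;
  for (std::size_t i = 0; i < leaf.feature_indices.size(); ++i) {
    const double value = row[leaf.feature_indices[i]];
    if (std::isnan(value)) {
      return leaf.constant_output;
    }
    output += leaf.coefficients[i] * value;
  }
  return output;
}
```

The sketch assumes a leaf lists only the numerical features its model uses, in line with the "Only store numerical features" bullet.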

fcfd4132

11 Dec, 2020 1 commit

[python-package] remove unnecessary files to reduce sdist size (#3639) · d1fe7090

James Lamb authored Dec 11, 2020

* cut size

* more size cuts

* testing install

* fmt is header-only

d1fe7090

08 Dec, 2020 1 commit

Fix model locale issue and improve model R/W performance. (#3405) · 792c9303

Alberto Ferreira authored Dec 08, 2020

* Fix LightGBM models locale sensitivity and improve R/W performance.

When Java is used, the default C++ locale is broken. This is true for
Java providers that use the C API or even Python models that require JEP.

This patch solves that issue making the model reads/writes insensitive
to such settings.
To achieve it, within the model read/write codebase:
 - C++ streams are imbued with the classic locale
 - Calls to functions that are dependent on the locale are replaced
 - The default locale is not changed!
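
The idea of imbuing individual streams with the classic locale, rather than changing the global one, can be sketched roughly as follows. The helper names `FormatDoubleClassic` and `ParseDoubleClassic` are hypothetical and only illustrate the technique with standard-library streams. They are not the `CommonC` functions from this patch.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: write a double using the classic ("C") locale,
// regardless of what the global locale is currently set to.
std::string FormatDoubleClassic(double value) {
  std::ostringstream out;
  out.imbue(std::locale::classic());  // affects only this stream
  out << value;
  return out.str();
}

// Hypothetical helper: read a double, always expecting '.' as the
// decimal separator. Returns false if the text cannot be parsed.
bool ParseDoubleClassic(const std::string& text, double* result) {
  std::istringstream in(text);
  in.imbue(std::locale::classic());  // the global locale stays untouched
  in >> *result;
  return !in.fail();
}
```

Because only the local streams are imbued, code elsewhere in the process that relies on the user's locale keeps working as before.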

This approach means:
 - The user's locale is never tampered with, avoiding issues such as
    https://github.com/microsoft/LightGBM/issues/2979 with the previous
    approach https://github.com/microsoft/LightGBM/pull/2891
 - Datasets can still be read according to the user's locale
 - The model file has a single format independent of locale

Changes:
 - Add CommonC namespace which provides faster locale-independent versions of Common's methods
 - Model code makes conversions through CommonC
 - Cleanup unused Common methods
 - Performance improvements. Use fast libraries for locale-agnostic conversion:
   - value->string: https://github.com/fmtlib/fmt
   - string->double: https://github.com/lemire/fast_double_parser (10x
      faster double parsing according to their benchmark)

Bugfixes:
 - https://github.com/microsoft/LightGBM/issues/2500
 - https://github.com/microsoft/LightGBM/issues/2890
 - https://github.com/ninia/jep/issues/205 (as it is related to LGBM as well)

* Align CommonC namespace

* Add new external_libs/ to python setup

* Try fast_double_parser fix #1

Testing commit e09e5aad828bcb16bea7ed0ed8322e019112fdbe

If it works it should fix more LGBM builds

* CMake: Attempt to link fmt without explicit PUBLIC tag

* Exclude external_libs from linting

* Add external_libs to MANIFEST.in

* Set dynamic linking option for fmt.

* linting issues

* Try to fix lint includes

* Try to pass fPIC with static fmt lib

* Try CMake P_I_C option with fmt library

* [R-package] Add CMake support for R and CRAN

* Cleanup CMakeLists

* Try fmt hack to remove stdout

* Switch to header-only mode

* Add PRIVATE argument to target_link_libraries

* use fmt in header-only mode

* Remove CMakeLists comment

* Change OpenMP to PUBLIC linking in Mac

* Update fmt submodule to 7.1.2

* Use fmt in header-only-mode

* Remove fmt from CMakeLists.txt

* Upgrade fast_double_parser to v0.2.0

* Revert "Add PRIVATE argument to target_link_libraries"

This reverts commit 3dd45dde7b92531b2530ab54522bb843c56227a7.

* Address James Lamb's comments

* Update R-package/.Rbuildignore
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Upgrade to fast_double_parser v0.3.0 - Solaris support

* Use legacy code only in Solaris

* Fix lint issues

* Fix comment

* Address StrikerRUS's comments (solaris ifdef).

* Change header guards
Co-authored-by: James Lamb <jaylamb20@gmail.com>

792c9303

24 Nov, 2020 1 commit

fix regex in MANIFEST file (#3593) · b3607004

Nikita Titov authored Nov 24, 2020

b3607004

19 Nov, 2020 1 commit

[python] cut unnecessary files to reduce package size (#3579) · 637ff800

James Lamb authored Nov 19, 2020

637ff800

15 Jul, 2018 1 commit

[ci][python] improved paths in setup.py and small CI refactoring (#1513) · c6cdea75

Nikita Titov authored Jul 15, 2018

* fixed paths in python-package installation

* less cd commands at CI

* hotfix

* added copying missed file from windows directory

* not copy filters file

* refined paths in nuget creation script

* removed filters file from MANIFEST.in

c6cdea75

10 Sep, 2017 1 commit

[python] [setup] removing source files (#898) · 9eab7ec8

Nikita Titov authored Sep 10, 2017

* travis cleanup

* removed precompiled files in windows folder from sdist command

* removed rubbish from install folder

* added compute folder

9eab7ec8

08 Sep, 2017 1 commit

[python] [setup] improving installation (#880) · 8984111f

Nikita Titov authored Sep 08, 2017

* disabled logs from compilers; fixed #874

* fixed safe clear_fplder

* added windows folder to manifest.in

* added windows folder to build

* added library path

* added compilation with MSBuild from .sln-file

* fixed unknown PlatformToolset returns exitcode 0

* hotfix

* updated Readme

* removed return

* added installation with mingw test to appveyor

* let's test appveyor with both VS 2015 and VS 2017; but MinGW isn't installed on VS 2017 image

* fixed built-in name 'file'

* simplified appveyor

* removed excess data_files

* fixed unreadable paths

* separated exceptions for cmake and mingw

* refactored silent_call

* don't create artifacts with VS 2015 and mingw

* be more precise with python versioning in Travis

* removed unnecessary if statement

* added classifiers for PyPI and python versions badge

* changed python version in travis

* added support of scikit-learn 0.18.x

* added more python versions to Travis

* added more python versions to Appveyor

* reduced number of tests in Travis

* Travis trick is not needed anymore

* attempt to fix according to https://github.com/Microsoft/LightGBM/pull/880#discussion_r137438856

8984111f

13 Jul, 2017 1 commit

Include LICENSE during python sdist (#680) · adf9bdec

Joshua Adelman authored Jul 12, 2017

adf9bdec

20 Jun, 2017 1 commit

[python] Submit to PyPI (#635) · 80c641cd

Guolin Ke authored Jun 20, 2017

* add make command to the python package.

* Update README.rst

* Update README.rst

* Update README.rst

* fix tests.

* fix unix build

* update readme

* fix setup.py

* update travis

* Update .travis.yml

* Update test.py

* some fixes.

* check the 64-bit python

* fix build.

* refine MANIFEST.in

* update Manifest.in

* add more build options.

* Add fatal in cmake

* fix an endif.

* fix bugs.

* fix pep8

* add test for the pip package build

* add test pip install in travis.

* fix version with pre-compile dll

* fix readme.rst

* update readme

80c641cd