Commits · de8c610512ff23e8b18a953218b09b1901e7085a · tianlh / LightGBM-DCU

21 Jan, 2021 1 commit

Fix thread-safety in C API's PredictSingleRow (#3771) · 4ae4abbe

Alberto Ferreira authored Jan 21, 2021

By using a unique lock instead of the shared lock the timings are very similar,
but predictions are correct.

Even so, a small C++ benchmark with a very simple LGBM model shows that
more threads are slower than the single-thread case. This is probably because,
with very small work units, the lock contention overhead increases.

We should in the future benchmark with more complex models to see if supporting
threading on these calls is worth it in performance gains.

If not, then we could choose not to provide thread-safety and remove
the locks altogether for maximal throughput.

See https://github.com/microsoft/LightGBM/issues/3751 for timings.

See gist for benchmark code:

https://gist.github.com/AlbertoEAF/5972db15a27c294bab65b97e1bc4c315

4ae4abbe
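The unique-versus-shared lock trade-off described in the message above can be sketched roughly as follows. This is a minimal illustration, not LightGBM's code: `ToyPredictor`, its `scratch_` buffer, and `CountWrongPredictions` are hypothetical names, and `std::shared_mutex` (C++17) is an assumption made for brevity. The idea is that a prediction call that writes to shared internal state is not a pure read, so it needs an exclusive lock even though the method is `const`.

```cpp
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a predictor that reuses an internal scratch buffer.
// Predict() writes to scratch_, so a shared (reader) lock would let two threads
// overwrite each other's intermediate values.
class ToyPredictor {
 public:
  explicit ToyPredictor(std::vector<double> weights)
      : weights_(std::move(weights)), scratch_(weights_.size()) {}

  double Predict(const std::vector<double>& row) const {
    // Exclusive lock: only one thread at a time may touch scratch_.
    std::unique_lock<std::shared_mutex> lock(mutex_);
    for (std::size_t i = 0; i < weights_.size(); ++i) {
      scratch_[i] = weights_[i] * row[i];
    }
    double sum = 0.0;
    for (double v : scratch_) sum += v;
    return sum;
  }

 private:
  std::vector<double> weights_;
  mutable std::vector<double> scratch_;  // shared mutable state
  mutable std::shared_mutex mutex_;
};

// Calls Predict() from several threads and counts wrong answers.
int CountWrongPredictions(int num_threads, int calls_per_thread) {
  const ToyPredictor predictor(std::vector<double>{1.0, 2.0, 3.0});
  std::vector<int> wrong(num_threads, 0);
  std::vector<std::thread> threads;
  for (int t = 0; t < num_threads; ++t) {
    threads.emplace_back([&, t] {
      const double x = static_cast<double>(t + 1);
      const std::vector<double> row = {x, x, x};
      const double expected = 6.0 * x;  // (1 + 2 + 3) * x
      for (int c = 0; c < calls_per_thread; ++c) {
        if (predictor.Predict(row) != expected) ++wrong[t];
      }
    });
  }
  for (auto& th : threads) th.join();
  int total = 0;
  for (int w : wrong) total += w;
  return total;
}
```

With the exclusive lock, `CountWrongPredictions(4, 1000)` should report no wrong answers. Swapping in `std::shared_lock` would allow concurrent writes to `scratch_`, which is a data race.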

18 Jan, 2021 1 commit

Minor C API cleanup in predictor & SingleRowPredictor (#3777) · 35612633

Alberto Ferreira authored Jan 18, 2021



* Cleanup predictor

* Cleanup SingleRowPredictor

* Update src/application/predictor.hpp
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update src/application/predictor.hpp
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update src/application/predictor.hpp
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

35612633

03 Jan, 2021 1 commit

[R-package] allow access to params in Booster (#3662) · 532fa914

James Lamb authored Jan 03, 2021

* [R-package] allow access to params in Booster

* remove unnecessary whitespace

* fix test on resetting params

* remove pytest_cache

* Update R-package/tests/testthat/test_custom_objective.R

532fa914

28 Dec, 2020 1 commit

small code and docs refactoring (#3681) · 5a460846

Nikita Titov authored Dec 29, 2020

* small code and docs refactoring

* Update CMakeLists.txt

* Update .vsts-ci.yml

* Update test.sh

* continue

* continue

* revert stable sort for all-unique values

5a460846

24 Dec, 2020 1 commit

Trees with linear models at leaves (#3299) · fcfd4132

Belinda Trotta authored Dec 24, 2020

* Add Eigen library.

* Working for simple test.

* Apply changes to config params.

* Handle nan data.

* Update docs.

* Add test.

* Only load raw data if boosting=gbdt_linear

* Remove unneeded code.

* Minor updates.

* Update to work with sk-learn interface.

* Update to work with chunked datasets.

* Throw error if we try to create a Booster with an already-constructed dataset having incompatible parameters.

* Save raw data in binary dataset file.

* Update docs and fix parameter checking.

* Fix dataset loading.

* Add test for regularization.

* Fix bugs when saving and loading tree.

* Add test for load/save linear model.

* Remove unneeded code.

* Fix case where not enough leaf data for linear model.

* Simplify code.

* Speed up code.

* Speed up code.

* Simplify code.

* Speed up code.

* Fix bugs.

* Working version.

* Store feature data column-wise (not fully working yet).

* Fix bugs.

* Speed up.

* Speed up.

* Remove unneeded code.

* Small speedup.

* Speed up.

* Minor updates.

* Remove unneeded code.

* Fix bug.

* Fix bug.

* Speed up.

* Speed up.

* Simplify code.

* Remove unneeded code.

* Fix bug, add more tests.

* Fix bug and add test.

* Only store numerical features

* Fix bug and speed up using templates.

* Speed up prediction.

* Fix bug with regularisation

* Visual studio files.

* Working version

* Only check nans if necessary

* Store coeff matrix as an array.

* Align cache lines

* Align cache lines

* Preallocation coefficient calculation matrices

* Small speedups

* Small speedup

* Reverse cache alignment changes

* Change to dynamic schedule

* Update docs.

* Refactor so that linear tree learner is not a separate class.

* Add refit capability.

* Speed up

* Small speedups.

* Speed up add prediction to score.

* Fix bug

* Fix bug and speed up.

* Speed up dataload.

* Speed up dataload

* Use vectors instead of pointers

* Fix bug

* Add OMP exception handling.

* Change return type of LGBM_BoosterGetLinear to bool

* Change return type of LGBM_BoosterGetLinear back to int, only parameter type needed to change

* Remove unused internal_parent_ property of tree

* Remove unused parameter to CreateTreeLearner

* Remove reference to LinearTreeLearner

* Minor style issues

* Remove unneeded check

* Reverse temporary testing change

* Fix Visual Studio project files

* Restore LightGBM.vcxproj.filters

* Speed up

* Speed up

* Simplify code

* Update docs

* Simplify code

* Initialise storage space for max num threads

* Move Eigen to include directory and delete unused files

* Remove old files.

* Fix so it compiles with mingw

* Fix gpu tree learner

* Change AddPredictionToScore back to const

* Fix python lint error

* Fix C++ lint errors

* Change eigen to a submodule

* Update comment

* Add the eigen folder

* Try to fix build issues with eigen

* Remove eigen files

* Add eigen as submodule

* Fix include paths

* Exclude eigen files from Python linter

* Ignore eigen folders for pydocstyle

* Fix C++ linting errors

* Fix docs

* Fix docs

* Exclude eigen directories from doxygen

* Update manifest to include eigen

* Update build_r to include eigen files

* Fix compiler warnings

* Store raw feature data as float

* Use float for calculating linear coefficients

* Remove eigen directory from GLOB

* Don't compile linear model code when building R package

* Fix doxygen issue

* Fix lint issue

* Fix lint issue

* Remove unneeded code

* Restore deleted lines

* Restore deleted lines

* Change return type of has_raw to bool

* Update docs

* Rename some variables and functions for readability

* Make tree_learner parameter const in AddScore

* Fix style issues

* Pass vectors as const reference when setting tree properties

* Make temporary storage of serial_tree_learner mutable so we can make the object's methods const

* Remove get_raw_size, use num_numeric_features instead

* Fix typo

* Make contains_nan_ and any_nan_ properties immutable again

* Remove data_has_nan_ property of tree

* Remove temporary test code

* Make linear_tree a dataset param

* Fix lint error

* Make LinearTreeLearner a separate class

* Fix lint errors

* Fix lint error

* Add linear_tree_learner.o

* Simulate omp_get_max_threads if openmp is not available

* Update PushOneData to also store raw data.

* Cast size to int

* Fix bug in ReshapeRaw

* Speed up code with multithreading

* Use OMP_NUM_THREADS

* Speed up with multithreading

* Update to use ArrayToString

* Fix tests

* Fix test

* Fix bug introduced in merge

* Minor updates

* Update docs

fcfd4132

24 Nov, 2020 1 commit

[refactor] Reduce code duplication in c_api.cpp (#3539) · 5e24b80b

Alberto Ferreira authored Nov 24, 2020

* Refactor c_api.cpp with template code

* Further cleanup

* Fix whitespace for linter

5e24b80b

21 Sep, 2020 1 commit

fix sparse multiclass local feature contributions and add test (#3382) · eff287e9

Ilya Matiach authored Sep 21, 2020

eff287e9

20 Sep, 2020 1 commit

[GPU] Add support for CUDA-based GPU build (#3160) · f7ad9457

Chip Kerchner authored Sep 20, 2020

* Initial CUDA work

* redirect log to python console (#3090)

* redir log to python console

* fix pylint

* Apply suggestions from code review

* Update basic.py

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update c_api.h

* Apply suggestions from code review

* super-minor: better wording
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: StrikerRUS <nekit94-12@hotmail.com>

* re-order includes (fixes #3132) (#3133)

* Revert "re-order includes (fixes #3132) (#3133)" (#3153)

This reverts commit 656d2676

* Missing change from previous rebase

* Minor cleanup and removal of development scripts.

* Only set gpu_use_dp on by default for CUDA. Other minor change.

* Fix python lint indentation problem.

* More python lint issues.

* Big lint cleanup - more to come.

* Another large lint cleanup - more to come.

* Even more lint cleanup.

* Minor cleanup so less differences in code.

* Revert is_use_subset changes

* Another rebase from master to fix recent conflicts.

* More lint.

* Simple code cleanup - add & remove blank lines, revert unnecessary format changes, remove added dead code.

* Removed parameters added for CUDA and various bug fix.

* Yet more lint and unnecessary changes.

* Revert another change.

* Removal of unnecessary code.

* temporary appveyor.yml for building and testing

* Remove return value in ReSize

* Removal of unused variables.

* Code cleanup from reviewers suggestions.

* Removal of FIXME comments and unused defines.

* More reviewers comments cleanup.

* Fix config variables.

* Attempt to fix check-docs failure

* Update Parameters.rst for num_gpu

* Removing test appveyor.yml

* Add CUDA_RESOLVE_DEVICE_SYMBOLS to libraries to fix linking issue.

* Fixed handling of data elements less than 2K.

* More reviewers comments cleanup.

* Removal of TODO and fix printing of int64_t

* Add cuda change for CI testing and remove cuda from device_type in python.

* Missed one change from previous check-in

* Removal AdditionConfig and fix settings.

* Limit number of GPUs to one for now in CUDA.

* Update Parameters.rst for previous check-in

* Whitespace removal.

* Cleanup unused code.

* Changed uint/ushort/ulong to unsigned int/short/long to help Windows based CUDA compiler work.

* Lint change from previous check-in.

* Changes based on reviewers comments.

* More reviewer comment changes.

* Adding warning for is_sparse. Revert tmp_subset code. Only return FeatureGroupData if not is_multi_val_

* Fix so that CUDA code will compile even if you enable the SCORE_T_USE_DOUBLE define.

* Reviewer comment cleanup.

* Replace warning with Log message. Removal of some of the USE_CUDA. Fix typo and removal of pragma once.

* Remove PRINT debug for CUDA code.

* Allow use of multiple GPUs for CUDA.

* More multi-GPUs enablement for CUDA.

* More code cleanup based on reviews comments.

* Update docs with latest config changes.
Co-authored-by: Gordon Fossum <fossum@us.ibm.com>
Co-authored-by: ChipKerchner <ckerchne@linux.vnet.ibm.com>
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: StrikerRUS <nekit94-12@hotmail.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>

f7ad9457

06 Aug, 2020 1 commit

[Python] / [R] add start_iteration to python predict interface (fix #3058) (#3272) · 82e2ff7a

shiyu1994 authored Aug 06, 2020



* [python] add start_iteration to python predict interface (#3058)

* Apply suggestions from code review

* Update lightgbm_R.h

* Apply suggestions from code review

* Apply suggestions from code review

* fix R interface

* update R documentation
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>

82e2ff7a

05 Aug, 2020 1 commit

Fast single row predict API v2 (#3268) · b5027de3

Alberto Ferreira authored Aug 05, 2020

* Fix bug introduced in PR #2992 for Fast predict

* Faster Fast predict API

* Add const to SingleRow Fast methods

b5027de3

29 Jul, 2020 1 commit

[TYPO] DatasetLoader::ConstructFromSampleData (#3258) · 6f339d77

Lucas David authored Jul 29, 2020



* ~ Modified name of method DatasetLoader::CostructFromSampleData to DatasetLoader::ConstructFromSampleData.
& Build passes for Debug, Debug_DLL, DLL and Release (not tested Debug_mpi and Release_mpi).

* ~ Refactored indentations.
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

6f339d77

19 Jul, 2020 1 commit

Change locking strategy of Booster, allow for share and unique locks (#2760) · 1c35c3b9

Joan Fontanals authored Jul 19, 2020



* Add capability to get possible max and min values for a model

* Change implementation to have return value in tree.cpp, change naming to upper and lower bound, move implementation to gbdt.cpp

* Update include/LightGBM/c_api.h
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Change iteration to avoid potential overflow, add bindings to R and Python and a basic test

* Adjust test values

* Consider const correctness and multithreading protection

* Put everything possible as const

* Include shared_mutex, for now as unique_lock

* Update test values

* Put everything possible as const

* Include shared_mutex, for now as unique_lock

* Make PredictSingleRow const and share the lock with other reading threads

* Update test values

* Add test to check that model is exactly the same in all platforms

* Try to parse the model to get the expected values

* Try to parse the model to get the expected values

* Fix implementation, num_leaves can be lower than the leaf_value_ size

* Do not check for num_leaves to be smaller than actual size and get back to test with hardcoded value

* Change test order

* Add gpu_use_dp option in test

* Remove helper test method

* Remove TODO

* Add preprocessing option to compile with c++17

* Update python-package/setup.py
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Remove unwanted changes

* Move option

* Fix problems introduced by conflict fix

* Avoid switching to c++17 and use yamc mutex library to access shared lock functionality

* Add extra yamc include

* Change header order

* some lint fix

* change include order and remove some extra blank lines

* Further fix lint issues

* Update c_api.cpp

* Further fix lint issues

* Move yamc include files to a new yamc folder

* Use standard unique_lock

* Update windows/LightGBM.vcxproj
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>

* Update windows/LightGBM.vcxproj.filters
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>

* Update windows/LightGBM.vcxproj.filters
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update windows/LightGBM.vcxproj.filters
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update windows/LightGBM.vcxproj.filters
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Fix problems coming from merge conflict resolution
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: joanfontanals <jfontanals@ntent.com>
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>

1c35c3b9

16 Jul, 2020 1 commit

Upcast index to size_t in Refit (closes #3227) (#3228) · f5f27ca8

Jan Tilly authored Jul 16, 2020

In the current implementation, the index is an int32, which will segfault with large data sets and a large number of estimators.

f5f27ca8
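A minimal sketch of why the upcast matters. The commit message does not show the index expression, so this assumes the index is a flat position computed as a tree index times a row count; `FlatIndex` and `ExceedsInt32` are hypothetical helpers, and a 64-bit `size_t` is assumed.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical flat-index helper: position of (tree, row) in a buffer that
// stores one value per row for every tree.
// Upcasting BEFORE multiplying keeps the arithmetic in size_t.
std::size_t FlatIndex(std::int32_t tree, std::int32_t row, std::int32_t num_rows) {
  return static_cast<std::size_t>(tree) * static_cast<std::size_t>(num_rows) +
         static_cast<std::size_t>(row);
}

// True when the same index would not fit in a signed 32-bit integer,
// i.e. when doing the multiplication in int32 would overflow.
bool ExceedsInt32(std::int32_t tree, std::int32_t row, std::int32_t num_rows) {
  return FlatIndex(tree, row, num_rows) > static_cast<std::size_t>(INT32_MAX);
}
```

For instance, tree 1000 with 3,000,000 rows gives a flat index of 3,000,000,000, which is above `INT32_MAX`. Computing it in `int32` would be signed overflow and could yield an out-of-range offset.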

15 Jul, 2020 2 commits

Feat/optimize single prediction (#2992) · fc79b366

Alberto Ferreira authored Jul 15, 2020

* [performance] Add Fast methods to C API for SingleRow Predictions

 * Add methods to C API to make single-row predictions faster:

   - LGBM_BoosterPredictForMatSingleRowFastInit (setup)
   - LGBM_BoosterPredictForMatSingleRowFast (predict)
   - LGBM_FastConfigFree (cleanup setup outputs)

* Code style cleanup

* Fix lint errors

* [performance] Revert FastConfig improvement to pass data at init

This reduces the optimization gain by 5% / 30% on this branch, but makes it usable by higher-level wrappers in MMLSpark, and outside it as well.

* [performance] Introduce Fast variants for SingleRow predictors.

Although this already provides performance gains by itself for any
callers, two new functions were added to Java's SWIG interfaces to
exploit that AND the GetPrimitiveArrayCritical data fetches.

* [tests/profiling] Profile Fast predict methods

Build with -DBUILD_PROFILING_TESTS=ON, copy the default
model trained on the Higgs dataset from the benchmarks repo

 https://github.com/guolinke/boosting_tree_benchmarks.git

to the LightGBM repo root, and run the lightgbm_profile_* binaries.

The single instance used is the first row from that dataset.

* Update comment on CMakeLists.

* Fix doxygen-introduced issue (#threads)

* Fix conflicts due to new RowFunctionFromCSR signature in master

* Change FastConfig ncol to int32_t.

* Removed profiling folder

* fix doxygen typo include/LightGBM/c_api.h
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* fix doxygen typo include/LightGBM/c_api.h
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* fix doxygen typo include/LightGBM/c_api.h
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Doxygen: change new docstrings to double back-quote
Co-authored-by: alberto.ferreira <alberto.ferreira@feedzai.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

fc79b366
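The setup / predict / cleanup split listed in the message above can be illustrated with a self-contained sketch. Every name here (`ToyModel`, `ToyFastConfig`, `ToyFastInit`, `ToyPredictFast`, `ToyFastConfigFree`) is hypothetical and the signatures are NOT those of the LightGBM C API. The sketch only shows the general idea of paying validation and setup costs once so that the per-row call stays small.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model and config types; these are not LightGBM's definitions.
struct ToyModel {
  std::vector<double> weights;
};

struct ToyFastConfig {
  const ToyModel* model;  // resolved once at init time
  std::int32_t ncol;      // validated once at init time
};

// Setup: validate arguments and allocate the reusable config.
// Returns nullptr on invalid input.
ToyFastConfig* ToyFastInit(const ToyModel* model, std::int32_t ncol) {
  if (model == nullptr || ncol <= 0 ||
      static_cast<std::size_t>(ncol) != model->weights.size()) {
    return nullptr;
  }
  return new ToyFastConfig{model, ncol};
}

// Predict: hot path with no per-call validation or allocation.
double ToyPredictFast(const ToyFastConfig* config, const double* row) {
  double sum = 0.0;
  for (std::int32_t i = 0; i < config->ncol; ++i) {
    sum += config->model->weights[static_cast<std::size_t>(i)] * row[i];
  }
  return sum;
}

// Cleanup: release what the setup step allocated.
void ToyFastConfigFree(ToyFastConfig* config) { delete config; }
```

A caller would run the init step once, call the predict step for each incoming row, and free the config when done.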

feature importance type in saved model file (#3220) · 87d46489

Guolin Ke authored Jul 16, 2020



* feature importance type in saved model file

* fix nullptr

* fixed formatting

* fix python/R

* Update src/c_api.cpp

* Apply suggestions from code review
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* fix c_api test

* fix swig

* minor docs improvements and added defines for importance types
Co-authored-by: StrikerRUS <nekit94-12@hotmail.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>

87d46489

28 Jun, 2020 1 commit

adding sparse support to TreeSHAP in lightgbm (#3000) · 9f367d11

Ilya Matiach authored Jun 28, 2020

* adding sparse support to TreeSHAP in lightgbm

* updating based on comments

* updated based on comments, used fromiter instead of frombuffer

* updated based on comments

* fixed limits import order

* fix sparse feature contribs to work with more than int32 max rows

* really fixed int64 max error and build warnings

* added sparse test with >int32 max rows

* fixed python side reshape check on sparse data

* updated based on latest comments

* fixed comments

* added CSC INT32_MAX validation to test, fixed comments

9f367d11

11 Jun, 2020 1 commit

refactor LGBM_DatasetGetFeatureNames (#3022) · f30e0bb3

Nikita Titov authored Jun 11, 2020

f30e0bb3

05 Jun, 2020 1 commit

Revert "re-order includes (fixes #3132) (#3133)" (#3153) · ac5f5e56

Nikita Titov authored Jun 05, 2020

This reverts commit 656d2676.

ac5f5e56

01 Jun, 2020 1 commit

re-order includes (fixes #3132) (#3133) · 656d2676

James Lamb authored Jun 01, 2020

656d2676

20 May, 2020 1 commit

redirect log to python console (#3090) · dea2391b

Guolin Ke authored May 21, 2020



* redir log to python console

* fix pylint

* Apply suggestions from code review

* Update basic.py

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update c_api.h

* Apply suggestions from code review

* Apply suggestions from code review

* super-minor: better wording
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: StrikerRUS <nekit94-12@hotmail.com>

dea2391b

12 Apr, 2020 1 commit

Fix user locale settings (#2980) · 8c335352

OMOTO Tsukasa authored Apr 12, 2020

This reverts commit 6b68967d (#2891) and fixes #2979.

8c335352

21 Mar, 2020 1 commit

[ci] fixed cpplint errors about namespace using-directives (#2927) · 3e540eac

James Lamb authored Mar 21, 2020

3e540eac

20 Mar, 2020 1 commit

Fix SWIG methods that return char** (#2850) · 91185c3a

Alberto Ferreira authored Mar 20, 2020



* [swig] Fix SWIG methods that return char** with StringArray.

+ [new] Add StringArray class to manage and manipulate arrays of fixed-length strings:

  This class is now used to wrap any char** parameters, manage memory and
  manipulate the strings.

  Such class is defined at swig/StringArray.hpp and wrapped in StringArray.i.

+ [API+fix] Wrap LGBM_BoosterGetFeatureNames, which resulted in a segfault before:

  Added wrapper LGBM_BoosterGetFeatureNamesSWIG(BoosterHandle) that
  only receives the booster handle and figures how much memory to allocate
  for strings and returns a StringArray which can be easily converted to String[].

+ [API+safety] For consistency, LGBM_BoosterGetEvalNamesSWIG was wrapped as well:

  * Refactor to detect any kind of errors and removed all the parameters
    besides the BoosterHandle (much simpler API to use in Java).
  * No assumptions are made about the required string space necessary (128 before).
  * The amount of required string memory is computed internally

+ [safety] No possibility of undefined behaviour

  The two methods wrapped above now compute the necessary string storage space
  prior to allocation, as the low-level C API calls would crash the process
  irreversibly if they write more memory than is passed to them.

* Changes to C API and wrappers support char**

To support the latest SWIG changes that enable proper char**
return support that is safe, the C API was changed.

The respective wrappers in R and Python were changed too.

* Cleanup indentation in new lightgbm_R.cpp code

* Address review code-style comments.

* Update swig/StringArray.hpp
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Update python-package/lightgbm/basic.py
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Update src/lightgbm_R.cpp
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: alberto.ferreira <alberto.ferreira@feedzai.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

91185c3a
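The [safety] item in the message above (compute the required string storage before allocating) can be sketched as a two-pass call. This is only an illustration under stated assumptions: `ToyNames`, `ToyGetNames` and `GetNamesSafely` are hypothetical, and the getter's signature is invented for the example rather than taken from the C API or from swig/StringArray.hpp.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical source of names; stands in for a model's feature names.
const std::vector<std::string>& ToyNames() {
  static const std::vector<std::string> names = {"age", "income_per_household", "zip"};
  return names;
}

// Hypothetical C-style getter: copies at most num_slots names into out, each
// truncated to slot_len bytes (including the terminating NUL), and always
// reports the name count and the slot length needed to hold every name in full.
void ToyGetNames(std::size_t num_slots, std::size_t slot_len, char** out,
                 std::size_t* out_count, std::size_t* out_required_len) {
  const std::vector<std::string>& names = ToyNames();
  *out_count = names.size();
  *out_required_len = 0;
  for (std::size_t i = 0; i < names.size(); ++i) {
    *out_required_len = std::max(*out_required_len, names[i].size() + 1);
    if (i < num_slots && slot_len > 0) {
      std::strncpy(out[i], names[i].c_str(), slot_len - 1);
      out[i][slot_len - 1] = '\0';
    }
  }
}

// Wrapper: the first pass asks for sizes, the second fills exactly-sized storage,
// so the getter is never handed less memory than it needs.
std::vector<std::string> GetNamesSafely() {
  std::size_t count = 0;
  std::size_t required_len = 0;
  ToyGetNames(0, 0, nullptr, &count, &required_len);  // size query only

  std::vector<std::vector<char>> storage(count, std::vector<char>(required_len));
  std::vector<char*> slots;
  for (std::vector<char>& s : storage) slots.push_back(s.data());

  ToyGetNames(count, required_len, slots.data(), &count, &required_len);
  return std::vector<std::string>(slots.begin(), slots.end());
}
```

The design choice is that the caller never guesses a buffer size (such as the fixed 128 mentioned above). The size comes from the callee, and the copy is bounded by it.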

17 Mar, 2020 1 commit

Fix Booster read/write locale dependency (#2891) · 6b68967d

Alberto Ferreira authored Mar 17, 2020



* Fix Booster read/write locale dependency

* Address review comments

* Move LocaleContext.h->locale_context.h
Co-authored-by: alberto.ferreira <alberto.ferreira@feedzai.com>

6b68967d
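The commit message does not describe how the locale dependency was removed, so the following is only a sketch of one common way to make number parsing and formatting independent of the user's locale: imbuing the stream with `std::locale::classic()`. The helper names are hypothetical, and this is not claimed to be the approach taken in locale_context.h.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Parses a double using the classic "C" locale, regardless of the global locale.
double ParseDoubleClassic(const std::string& text) {
  std::istringstream stream(text);
  stream.imbue(std::locale::classic());
  double value = 0.0;
  stream >> value;
  return value;
}

// Formats a double using the classic "C" locale ('.' as the decimal separator).
std::string FormatDoubleClassic(double value) {
  std::ostringstream stream;
  stream.imbue(std::locale::classic());
  stream << value;
  return stream.str();
}
```

Without such pinning, a model file written under a locale that uses ',' as the decimal separator might not be read back correctly under a different locale.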

11 Mar, 2020 1 commit

fixed cpplint errors and disable warning only for VS (#2888) · bd10918e

Nikita Titov authored Mar 11, 2020

* fixed cpplint errors and disable warning only for VS

* wrap more pragma warning

bd10918e

05 Mar, 2020 1 commit

speed up `FindBestThresholdFromHistogram` (#2867) · 77d92b7c

Guolin Ke authored Mar 05, 2020

* speed up for const hessian

* rename template

* some refactorings

* refine

* refine

* simplify codes

* fix random in feature histogram

* code refine

* refine

* try fix

* make gcc happy

* remove timer

* rollback some changes

* more templates

* fix a bug

* reduce the cost of timer

* fix gpu

* fix bug

* fix gpu

77d92b7c

04 Mar, 2020 2 commits

fix warnings in debug mode about that not all control paths return a value (#2866) · b70636bc

Nikita Titov authored Mar 04, 2020

b70636bc

fixed cpplint issues (#2863) · d018d30a

Nikita Titov authored Mar 04, 2020

* fixed cpplint errors

* fixed more cpplint errors

d018d30a

02 Mar, 2020 3 commits

speed up multi-val bin subset for bagging (#2827) · d0bec9e9

Guolin Ke authored Mar 02, 2020

* speed up multi-val bin subset for bagging

* remove the duplicated codes

* code refine

* some codes refactoring

* move `is_constant_hessian` into `TrainingShareStates`

* refine

* fix bug

* fix bug when num_groups_ < 0

* fix gpu

* fix gpu bagging

* fix gpu bug

* typo

* Update src/treelearner/serial_tree_learner.h

d0bec9e9

don't save num_thread as possible (#2839) · 0aa7bfee

Guolin Ke authored Mar 02, 2020



* don't cache `num_thread`, to avoid change outside

* rename

* update document

* Update docs/Parameters.rst

* Update include/LightGBM/config.h

* Apply suggestions from code review
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Apply suggestions from code review
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

0aa7bfee

introduced specific CHECKs (#2849) · 5a80b788

Nikita Titov authored Mar 02, 2020

5a80b788

25 Feb, 2020 1 commit

replace std::runtime_error with Log::Fatal (#2816) · 37ce3eb2

Nikita Titov authored Feb 25, 2020

* replace std::runtime_error with Log::Fatal

* Update c_api.cpp

37ce3eb2

24 Feb, 2020 2 commits

fixed cpplint issues (#2809) · 224b8b98

Nikita Titov authored Feb 24, 2020

* fixed cpplint issues

* fixed cpplint issues

224b8b98

Fix SingleRowPredictor::IsPredictorEqual comparison (invert) (#2799) · 60710c70

Alberto Ferreira authored Feb 24, 2020

60710c70

22 Feb, 2020 1 commit

some code refactoring (#2769) · 3e80df7e

Guolin Ke authored Feb 22, 2020

* some refines

* more omp refactoring

* format define

* fix merge bug

* some fixes

* fix some warnings

* Apply suggestions from code review

* Apply suggestions from code review

* remove dup codes

3e80df7e

20 Feb, 2020 2 commits

remove init-score parameter (#2776) · 3c394c8d

Guolin Ke authored Feb 20, 2020



* remove related cpp codes

* removed more mentions of init_score_filename params
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

3c394c8d

Add capability to get possible max and min values for a model (#2737) · 18e7de4f

Joan Fontanals authored Feb 20, 2020



* Add capability to get possible max and min values for a model

* Change implementation to have return value in tree.cpp, change naming to upper and lower bound, move implementation to gbdt.cpp

* Update include/LightGBM/c_api.h
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Change iteration to avoid potential overflow, add bindings to R and Python and a basic test

* Adjust test values

* Consider const correctness and multithreading protection

* Update test values

* Update test values

* Add test to check that model is exactly the same in all platforms

* Try to parse the model to get the expected values

* Try to parse the model to get the expected values

* Fix implementation, num_leaves can be lower than the leaf_value_ size

* Do not check for num_leaves to be smaller than actual size and get back to test with hardcoded value

* Change test order

* Add gpu_use_dp option in test

* Remove helper test method

* Update src/c_api.cpp
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Update src/io/tree.cpp
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Update src/io/tree.cpp
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Update tests/python_package_test/test_basic.py
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Remove imports
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

18e7de4f

19 Feb, 2020 1 commit

[python] [R-package] refine the parameters for Dataset (#2594) · 9f79e840

Guolin Ke authored Feb 19, 2020



* reset

* fix a bug

* fix test

* Update c_api.h

* support to no filter features by min_data

* add warning in reset config

* refine warnings for override dataset's parameter

* some cleans

* clean code

* clean code

* refine C API function doxygen comments

* refined new param description

* refined doxygen comments for R API function

* removed stuff related to int8

* break long line in warning message

* removed tests whose results cannot be validated anymore

* added test for warnings about unchangeable params

* write parameter from dataset to booster

* consider free_raw_data.

* fix params

* fix bug

* implementing R

* fix typo

* filter params in R

* fix R

* not min_data

* refined tests

* fixed linting

* refine

* pylint

* add docstring

* fix docstring

* R lint

* updated description for C API function

* use param aliases in Python

* fixed typo

* fixed typo

* added more params to test

* removed debug print

* fix dataset construct place

* fix merge bug

* Update feature_histogram.hpp

* add is_sparse back

* remove unused parameters

* fix lint

* add data random seed

* update

* [R-package] centralized Dataset parameter aliases and added tests on Dataset parameter updating (#2767)
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: James Lamb <jaylamb20@gmail.com>

9f79e840

02 Feb, 2020 1 commit

Support both row-wise and col-wise multi-threading (#2699) · 509c2e50

Guolin Ke authored Feb 02, 2020



* commit

* fix a bug

* fix bug

* reset to track changes

* refine the auto choose logic

* sort the time stats output

* fix include

* change multi_val_bin_sparse_threshold

* add cmake

* add _mm_malloc and _mm_free for cross platform

* fix cmake bug

* timer for split

* try to fix cmake

* fix tests

* refactor DataPartition::Split

* fix test

* typo

* formating

* Revert "formating"

This reverts commit 5b8de4f7fb9d975ee23701d276a66d40ee6d4222.

* add document

* [R-package] Added tests on use of force_col_wise and force_row_wise in training (#2719)

* naming

* fix gpu code

* Update include/LightGBM/bin.h
Co-Authored-By: James Lamb <jaylamb20@gmail.com>

* Update src/treelearner/ocl/histogram16.cl

* test: swap compilers for CI

* fix omp

* not avx2

* no aligned for feature histogram

* Revert "refactor DataPartition::Split"

This reverts commit 256e6d9641ade966a1f54da1752e998a1149b6f8.

* slightly refactor data partition

* reduce the memory cost
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

509c2e50

14 Jan, 2020 1 commit

support most frequent bin (#2689) · c7e90393

Guolin Ke authored Jan 14, 2020

* implement

* fix warning

* fix bug

* fix a bug

* remove unneeded function

* fix data push bug

* fix valid data push

* fix bug for missing_type=zero

* refine split

* renames

* typo

c7e90393