Commits · 3d4e08e1d8be9763f5d55180e3faf1e11eb20787 · tianlh / LightGBM-DCU

07 Sep, 2022 1 commit

[CUDA] Add feature interaction constraint for cuda_exp (fix #4785) (#5474) · 1444a748

shiyu1994 authored Sep 07, 2022

* add feature interaction constraint for cuda_exp

* test feature interaction constraints for cuda_exp

* remove useless check

* update comment


02 Sep, 2022 1 commit

[CUDA] Add Huber regression objective for cuda_exp (#5462) · 45c53f78

shiyu1994 authored Sep 02, 2022

* add huber regression for cuda_exp

* renew tree output on GPU

add test cases for regression objectives

* remove useless changes

* add white space

* fix test_regression


29 Aug, 2022 1 commit

[ci][fix] Fix cuda_exp ci (#5438) · be7f3213

shiyu1994 authored Aug 29, 2022



* fix cuda_exp ci

* fix ci failures introduced by #5279

* cleanup cuda.yml

* fix test.sh

* clean up test.sh

* clean up test.sh

* skip lines by cuda_exp in test_register_logger

* Update tests/python_package_test/test_utilities.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>


03 Aug, 2022 1 commit

Fix potential overflow in linear trees (#5395) · e2dfcd69

Nikita Titov authored Aug 03, 2022



* Fix potential overflow in linear trees

* simplify
Co-authored-by: James Lamb <jaylamb20@gmail.com>


29 Jul, 2022 2 commits
- Use double precision in threaded calculation of linear tree coefficients (fixes #5226) (#5368) · 44d37184
  Belinda Trotta authored Jul 30, 2022
  
- [CUDA] Initial work for boosting and evaluation with CUDA (#5279) · e0af160a
  shiyu1994 authored Jul 29, 2022
```
* initial work for boosting and evaluation with CUDA

* fix compatibility with CPU code

* fix creating objective without USE_CUDA_EXP

* fix static analysis errors

* fix static analysis errors
```
08 Jun, 2022 1 commit

Clear split info buffer in cost efficient gradient boosting before every... · f1328d5c

shiyu1994 authored Jun 08, 2022

Clear split info buffer in cost efficient gradient boosting before every iteration (fix partially #3679) (#5164)

* clear split info buffer in cegb_ before every iteration

* check whether cegb_ is null in serial_tree_learner.cpp

* add a test case for checking the split buffer in CEGB

* switch to Threading::For instead of raw OpenMP

* apply review suggestions

* apply review comments

* remove device cpu


26 Apr, 2022 1 commit
- [CUDA] Fix integer overflow in cuda row-wise data (#5167) · d893cd1f
  shiyu1994 authored Apr 26, 2022
  
24 Apr, 2022 1 commit
- fix typo in CEGB method name (#5168) · 3d25e373
  James Lamb authored Apr 23, 2022
  
30 Mar, 2022 1 commit
- [CUDA] Fix row-wise histogram construction with dense data matrix (#5103) · 417c732c
  shiyu1994 authored Mar 30, 2022
```
* fix cuda exp with dense row wise

* disable usage of multi val group in cuda exp
```
27 Mar, 2022 1 commit

Log warnings for number of bins of categorical features (#4448) · d163c2c1

shiyu1994 authored Mar 28, 2022

* log warnings when number of bins of categorical features exceeds the configured maximum number of bins

* log only one warning for all categorical features

* Add #include <memory> for unique_ptr

* remove useless param description


23 Mar, 2022 1 commit

[CUDA] New CUDA version Part 1 (#4630) · 6b56a90c

shiyu1994 authored Mar 23, 2022



* new cuda framework

* add histogram construction kernel

* before removing multi-gpu

* new cuda framework

* tree learner cuda kernels

* single tree framework ready

* single tree training framework

* remove comments

* boosting with cuda

* optimize for best split find

* data split

* move boosting into cuda

* parallel synchronize best split point

* merge split data kernels

* before code refactor

* use tasks instead of features as units for split finding

* refactor cuda best split finder

* fix configuration error with small leaves in data split

* skip histogram construction of too small leaf

* skip split finding of invalid leaves

stop when no leaf to split

* support row wise with CUDA

* copy data for split by column

* copy data from host to CPU by column for data partition

* add synchronize best splits for one leaf from multiple blocks

* partition dense row data

* fix sync best split from task blocks

* add support for sparse row wise for CUDA

* remove useless code

* add l2 regression objective

* sparse multi value bin enabled for CUDA

* fix cuda ranking objective

* support for number of items <= 2048 per query

* speedup histogram construction by interleaving global memory access

* split optimization

* add cuda tree predictor

* remove comma

* refactor objective and score updater

* before use struct

* use structure for split information

* use structure for leaf splits

* return CUDASplitInfo directly after finding best split

* split with CUDATree directly

* use cuda row data in cuda histogram constructor

* clean src/treelearner/cuda

* gather shared cuda device functions

* put shared CUDA functions into header file

* change smaller leaf from <= back to < for consistent result with CPU

* add tree predictor

* remove useless cuda_tree_predictor

* predict on CUDA with pipeline

* add global sort algorithms

* add global argsort for queries with many items in ranking tasks

* remove limitation of maximum number of items per query in ranking

* add cuda metrics

* fix CUDA AUC

* remove debug code

* add regression metrics

* remove useless file

* don't use mask in shuffle reduce

* add more regression objectives

* fix cuda mape loss

add cuda xentropy loss

* use template for different versions of BitonicArgSortDevice

* add multiclass metrics

* add ndcg metric

* fix cross entropy objectives and metrics

* fix cross entropy and ndcg metrics

* add support for customized objective in CUDA

* complete multiclass ova for CUDA

* separate cuda tree learner

* use shuffle based prefix sum

* clean up cuda_algorithms.hpp

* add copy subset on CUDA

* add bagging for CUDA

* clean up code

* copy gradients from host to device

* support bagging without using subset

* add support of bagging with subset for CUDAColumnData

* add support of bagging with subset for dense CUDARowData

* refactor copy sparse subrow

* use copy subset for column subset

* add reset train data and reset config for CUDA tree learner

add destructors for cuda tree learner

* add USE_CUDA ifdef to cuda tree learner files

* check that dataset doesn't contain CUDA tree learner

* remove printf debug information

* use full new cuda tree learner only when using single GPU

* disable all CUDA code when using CPU version

* recover main.cpp

* add cpp files for multi value bins

* update LightGBM.vcxproj

* update LightGBM.vcxproj

fix lint errors

* fix lint errors

* fix lint errors

* update Makevars

fix lint errors

* fix the case with 0 feature and 0 bin

fix split finding for invalid leaves

create cuda column data when loaded from bin file

* fix lint errors

hide GetRowWiseData when cuda is not used

* recover default device type to cpu

* fix na_as_missing case

fix cuda feature meta information

* fix UpdateDataIndexToLeafIndexKernel

* create CUDA trees when needed in CUDADataPartition::UpdateTrainScore

* add refit by tree for cuda tree learner

* fix test_refit in test_engine.py

* create set of large bin partitions in CUDARowData

* add histogram construction for columns with a large number of bins

* add find best split for categorical features on CUDA

* add bitvectors for categorical split

* cuda data partition split for categorical features

* fix split tree with categorical feature

* fix categorical feature splits

* refactor cuda_data_partition.cu with multi-level templates

* refactor CUDABestSplitFinder by grouping task information into struct

* pre-allocate space for vector split_find_tasks_ in CUDABestSplitFinder

* fix misuse of reference

* remove useless changes

* add support for path smoothing

* virtual destructor for LightGBM::Tree

* fix overlapped cat threshold in best split infos

* reset histogram pointers in data partition and split finder in ResetConfig

* comment useless parameter

* fix reverse case when na is missing and default bin is zero

* fix mfb_is_na and mfb_is_zero and is_single_feature_column

* remove debug log

* fix cat_l2 when one-hot

fix gradient copy when data subset is used

* switch shared histogram size according to CUDA version

* gpu_use_dp=true when cuda test

* revert modification in config.h

* fix setting of gpu_use_dp=true in .ci/test.sh

* fix linter errors

* fix linter error

remove useless change

* recover main.cpp

* separate cuda_exp and cuda

* fix ci bash scripts

add description for cuda_exp

* add USE_CUDA_EXP flag

* switch off USE_CUDA_EXP

* revert changes in python-packages

* more careful separation for USE_CUDA_EXP

* fix CUDARowData::DivideCUDAFeatureGroups

fix set fields for cuda metadata

* revert config.h

* fix test settings for cuda experimental version

* skip some tests due to unsupported features or differences in implementation details for CUDA Experimental version

* fix lint issue by adding a blank line

* fix lint errors by re-sorting imports

* fix lint errors by re-sorting imports

* fix lint errors by re-sorting imports

* merge cuda.yml and cuda_exp.yml

* update python version in cuda.yml

* remove cuda_exp.yml

* remove unrelated changes

* fix compilation warnings

fix cuda exp ci task name

* recover task

* use multi-level template in histogram construction

check split only in debug mode

* ignore NVCC related lines in parameter_generator.py

* update job name for CUDA tests

* apply review suggestions

* Update .github/workflows/cuda.yml
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update .github/workflows/cuda.yml
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* update header

* remove useless TODOs

* remove [TODO(shiyu1994): constrain the split with min_data_in_group] and record in #5062

* #include <LightGBM/utils/log.h> for USE_CUDA_EXP only

* fix include order

* fix include order

* remove extra space

* address review comments

* add warning when cuda_exp is used together with deterministic

* add comment about gpu_use_dp in .ci/test.sh

* revert changing order of included headers
Co-authored-by: Yu Shi <shiyu1994@qq.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>


20 Feb, 2022 1 commit

CUDATreeLearner: free GPU memory in destructor if any allocated (#4963) · 0db573c3

Dzianis Dus authored Feb 20, 2022

* CUDATreeLearner: free GPU memory in destructor if any allocated

* Minor changes: checking for num_gpu_feature_groups is not needed

* Trigger CI again


08 Jan, 2022 1 commit
- fix gpu allocate memory overflow (#4928) · 305369dd
  文佳鹏 authored Jan 08, 2022
  
10 Nov, 2021 1 commit

Always respect forced splits, even when feature_fraction < 1.0 (fixes #4601) (#4725) · 33a2f9ec

tongwu-msft authored Nov 10, 2021

* issue fix #4601

* fix issue 4601 it2

* add tests for issue 4601

* fix warning

* fix warning

* add new line at end

* remove last line at end

* fix lint warning

* address comments

* address comments

* address comments

* fix address

* address comments

* revert seed

* fix recursive force split issue

* fix build error

* fix lint warning


23 Sep, 2021 1 commit

simplify and speed up comparisons for splits with identical gains (#4542) · b52ecb16

James Lamb authored Sep 22, 2021

* fix incorrect behavior of SplitInfo == operator for splits with identical gains

* LightSplitInfo too, and improve comment

* don't check features unnecessarily

* update LightSplitInfo too


28 Jun, 2021 1 commit
- [CUDA] fix CUDA memory error by reducing block number (fixed #4315) (#4327) · 77d9529d
  Robin Dong authored Jun 28, 2021
  
26 May, 2021 1 commit
- fix GatherInfoForThresholdNumerical boundary (fix #4286) (#4322) · 346f8839
  shiyu1994 authored May 26, 2021
  
10 May, 2021 1 commit
- [docs] remove extra spaces in comments and docs (#4269) · a8ee487a
  James Lamb authored May 10, 2021
  
04 May, 2021 2 commits

fix param name (#4253) · fcd24535
Nikita Titov authored May 05, 2021
```
* fix param name

* Update gpu_tree_learner.h

* Update gbdt.h
```

Correct spelling (#4250) · e79716e0

Andrew Ziem authored May 04, 2021



* Correct spelling

Most changes were in comments, and there were a few changes to literals for log output.

There were no changes to variable names, function names, IDs, or functionality.

* Clarify a phrase in a comment
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Clarify a phrase in a comment
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Clarify a phrase in a comment
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Correct spelling

Most are code comments, but one case is a literal in a logging message.

There are a few grammar fixes too.
Co-authored-by: James Lamb <jaylamb20@gmail.com>


22 Apr, 2021 1 commit
- when a leaf has no local data, its histogram should be cleared (#4185) · 0a847efe
  shiyu1994 authored Apr 22, 2021
  
11 Apr, 2021 1 commit

enforce interaction constraints with monotone_constraints_method = intermediate/advanced (#4043) · 9e1d7fa1

Christoph Aymanns authored Apr 11, 2021



* add test for interaction constraints and monotone constraints

* enforce interaction constraints in RecomputeBestSplitForLeaf

* code formatting

* code formatting

* move interaction constraint test to test_engine

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>


05 Apr, 2021 1 commit
- clarify DEBUG-level log about tree depth (#4126) · 6d825cd3
  James Lamb authored Apr 05, 2021
```
* clarify DEBUG-level log about tree depth

* more places
```
09 Feb, 2021 1 commit
- fix compilation warnings in CUDA treelearner (#3889) · 846b512d
  Nikita Titov authored Feb 09, 2021
```
* remove unused private field

* mask Train as override

* remove unused private field
```
06 Feb, 2021 1 commit
- fix typos in log messages (#3914) · e31244cf
  James Lamb authored Feb 06, 2021
  
28 Jan, 2021 1 commit

[ci] ignore CUDA-related strings in Python logger test (#3874) · 040b1c54

Nikita Titov authored Jan 28, 2021

* Update test_utilities.py

* Update cuda.yml

* Update test_utilities.py

* Update cuda_tree_learner.cpp

* Update cuda.yml


23 Jan, 2021 1 commit
- Don't copy more than has been allocated to device_features. (#3752) · d951be99
  Chip Kerchner authored Jan 23, 2021
  
18 Jan, 2021 1 commit

[R-package] enable use of trees with linear models at leaves (fixes #3319) (#3699) · ed651e86

James Lamb authored Jan 18, 2021

* [R-package] enable use of trees with linear models at leaves (fixes #3319)

* remove problematic pragmas

* fix tests

* try to fix build scripts

* try fixing pragma check

* more pragma checks

* ok fix pragma stuff for real

* empty commit

* regenerate documentation

* try skipping test

* uncomment CI

* add note on missing value types for R

* add tests on saving and re-loading booster


15 Jan, 2021 1 commit
- Update CUDA treelearner according to changes introduced for linear trees (#3750) · a15a3704
  Nikita Titov authored Jan 15, 2021
```
* Update cuda_tree_learner.cpp

* Update cuda_tree_learner.h

* Update cuda.yml
```
11 Jan, 2021 1 commit
- Initialize any_nan_ property of LinearTreeLearner (#3709) · 1abc2e06
  Belinda Trotta authored Jan 11, 2021
  
07 Jan, 2021 1 commit
- Fix compiler warnings caused by implicit type conversion (fixes #3677) (#3729) · 753b0e9c
  Belinda Trotta authored Jan 07, 2021
```
* Fix compiler warnings caused by implicit type conversion

* Fix more warnings

* Fix more warnings
```
05 Jan, 2021 1 commit
- fix test_monotone_constraints often fails on MPI builds (#3683) · 415c0cb5
  CharlesAuguste authored Jan 05, 2021
```
* Fix monotone constraint bug where split does not fulfill constraints.

* Fix indent.
```
29 Dec, 2020 1 commit

[python-package] remove unused Eigen files, compile with EIGEN_MPL2_ONLY (fixes #3684) (#3685) · 6cb968af

James Lamb authored Dec 29, 2020



* [python-package] remove unused Eigen files (fixes #3684)

* more changes

* add EIGEN_MPL2_ONLY in VS solution file

* fix VS project

* remove EIGEN_MPL2_ONLY define in linear_tree_learner
Co-authored-by: Nikita Titov <nekit94-12@hotmail.com>


28 Dec, 2020 1 commit

small code and docs refactoring (#3681) · 5a460846

Nikita Titov authored Dec 29, 2020

* small code and docs refactoring

* Update CMakeLists.txt

* Update .vsts-ci.yml

* Update test.sh

* continue

* continue

* revert stable sort for all-unique values


24 Dec, 2020 1 commit

Trees with linear models at leaves (#3299) · fcfd4132

Belinda Trotta authored Dec 24, 2020

* Add Eigen library.

* Working for simple test.

* Apply changes to config params.

* Handle nan data.

* Update docs.

* Add test.

* Only load raw data if boosting=gbdt_linear

* Remove unneeded code.

* Minor updates.

* Update to work with sk-learn interface.

* Update to work with chunked datasets.

* Throw error if we try to create a Booster with an already-constructed dataset having incompatible parameters.

* Save raw data in binary dataset file.

* Update docs and fix parameter checking.

* Fix dataset loading.

* Add test for regularization.

* Fix bugs when saving and loading tree.

* Add test for load/save linear model.

* Remove unneeded code.

* Fix case where not enough leaf data for linear model.

* Simplify code.

* Speed up code.

* Speed up code.

* Simplify code.

* Speed up code.

* Fix bugs.

* Working version.

* Store feature data column-wise (not fully working yet).

* Fix bugs.

* Speed up.

* Speed up.

* Remove unneeded code.

* Small speedup.

* Speed up.

* Minor updates.

* Remove unneeded code.

* Fix bug.

* Fix bug.

* Speed up.

* Speed up.

* Simplify code.

* Remove unneeded code.

* Fix bug, add more tests.

* Fix bug and add test.

* Only store numerical features

* Fix bug and speed up using templates.

* Speed up prediction.

* Fix bug with regularisation

* Visual studio files.

* Working version

* Only check nans if necessary

* Store coeff matrix as an array.

* Align cache lines

* Align cache lines

* Preallocate coefficient calculation matrices

* Small speedups

* Small speedup

* Reverse cache alignment changes

* Change to dynamic schedule

* Update docs.

* Refactor so that linear tree learner is not a separate class.

* Add refit capability.

* Speed up

* Small speedups.

* Speed up add prediction to score.

* Fix bug

* Fix bug and speed up.

* Speed up dataload.

* Speed up dataload

* Use vectors instead of pointers

* Fix bug

* Add OMP exception handling.

* Change return type of LGBM_BoosterGetLinear to bool

* Change return type of LGBM_BoosterGetLinear back to int, only parameter type needed to change

* Remove unused internal_parent_ property of tree

* Remove unused parameter to CreateTreeLearner

* Remove reference to LinearTreeLearner

* Minor style issues

* Remove unneeded check

* Reverse temporary testing change

* Fix Visual Studio project files

* Restore LightGBM.vcxproj.filters

* Speed up

* Speed up

* Simplify code

* Update docs

* Simplify code

* Initialise storage space for max num threads

* Move Eigen to include directory and delete unused files

* Remove old files.

* Fix so it compiles with mingw

* Fix gpu tree learner

* Change AddPredictionToScore back to const

* Fix python lint error

* Fix C++ lint errors

* Change eigen to a submodule

* Update comment

* Add the eigen folder

* Try to fix build issues with eigen

* Remove eigen files

* Add eigen as submodule

* Fix include paths

* Exclude eigen files from Python linter

* Ignore eigen folders for pydocstyle

* Fix C++ linting errors

* Fix docs

* Fix docs

* Exclude eigen directories from doxygen

* Update manifest to include eigen

* Update build_r to include eigen files

* Fix compiler warnings

* Store raw feature data as float

* Use float for calculating linear coefficients

* Remove eigen directory from GLOB

* Don't compile linear model code when building R package

* Fix doxygen issue

* Fix lint issue

* Fix lint issue

* Remove unneeded code

* Restore deleted lines

* Restore deleted lines

* Change return type of has_raw to bool

* Update docs

* Rename some variables and functions for readability

* Make tree_learner parameter const in AddScore

* Fix style issues

* Pass vectors as const reference when setting tree properties

* Make temporary storage of serial_tree_learner mutable so we can make the object's methods const

* Remove get_raw_size, use num_numeric_features instead

* Fix typo

* Make contains_nan_ and any_nan_ properties immutable again

* Remove data_has_nan_ property of tree

* Remove temporary test code

* Make linear_tree a dataset param

* Fix lint error

* Make LinearTreeLearner a separate class

* Fix lint errors

* Fix lint error

* Add linear_tree_learner.o

* Simulate omp_get_max_threads if openmp is not available

* Update PushOneData to also store raw data.

* Cast size to int

* Fix bug in ReshapeRaw

* Speed up code with multithreading

* Use OMP_NUM_THREADS

* Speed up with multithreading

* Update to use ArrayToString

* Fix tests

* Fix test

* Fix bug introduced in merge

* Minor updates

* Update docs


13 Nov, 2020 1 commit

Optimization of row-wise histogram construction (#3522) · 0655d67c

shiyu1994 authored Nov 13, 2020



* store without offset in multi_val_dense_bin

* fix offset bug

* add comment for offset

* add comment for bin type selection

* faster operations for offset

* keep most freq bin in histogram for multi val dense

* use original feature iterators

* consider 9 cases (3 x 3) for multi val bin construction

* fix dense bin setting

* fix bin data in multi val group

* fix offset of the first feature histogram

* use float hist buf

* avx in histogram construction

* use avx for hist construction without prefetch

* vectorize bin extraction

* use only 128 vec

* use avx2

* use vectorization for sparse row wise

* add bit size for multi val dense bin

* float with no vectorization

* change multithreading strategy to dynamic

* remove intrinsic header

* fix dense multi val col copy

* remove bit size

* use large enough block size when the bin number is large

* calc min block size by sparsity

* rescale gradients

* rollback gradients scaling

* single precision histogram buffer as an option

* add float hist buffer with thread buffer

* fix setting zero in hist data

* fix hist begin pointer in tree learners

* remove debug logs

* remove omp simd

* update Makevars of R-package

* fix feature group binary storing

* two row wise for double hist buffer

* add subfeature for two row wise

* remove useless code and fix two row wise

* refactor code

* grouping the dense feature groups can get sparse multi val bin

* clean format problems

* one thread for two blocks in sep row wise

* use ordered gradients for sep row wise

* fix grad ptr

* ordered grad with combined block for sep row wise

* fix block threading

* use the same min block size

* rollback share min block size

* remove logs

* Update src/io/dataset.cpp
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>

* fix parameter description

* remove sep_row_wise

* remove check codes

* add check for empty multi val bin

* fix lint error

* rollback changes in config.h

* Apply suggestions from code review
Co-authored-by: Ubuntu <shiyu@gbdt-04.ren3kv4wanvufliwrpy4k03lsf.xx.internal.cloudapp.net>
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>


07 Nov, 2020 1 commit
- fix invalid read detected by valgrind (#3526) · da6c6ea3
  Guolin Ke authored Nov 07, 2020
```
Co-authored-by: StrikerRUS <nekit94-12@hotmail.com>
```
01 Nov, 2020 1 commit

Support deterministic (#3494) · c39afb9d

Guolin Ke authored Nov 01, 2020



* implement

* fix compilation

* Update config.cpp

* unify wordings
Co-authored-by: StrikerRUS <nekit94-12@hotmail.com>


27 Oct, 2020 1 commit
- rollback to omp sum (#3493) · 831c0e3f
  Guolin Ke authored Oct 27, 2020
```
* rollback omp sum

* remove sum reduction
```