Commits · a119639a5bfaa78887a4867da611da977fc072f2 · tianlh / LightGBM-DCU

12 Sep, 2019 1 commit
- update feature_fraction_bynode (#2381) · ad8e8ccc
  Guolin Ke authored Sep 12, 2019
```
* update

* fix a bug

* Update config.h

* Update Parameters.rst
```
03 Sep, 2019 1 commit

sub-features for node level (#2330) · bbbad73d

Guolin Ke authored Sep 03, 2019

* add parameter

* implement

* fix bug

* fix bug

* fix according to comment

* add test

* Update test_engine.py

* Update test_engine.py

* Update test_engine.py

24 Jul, 2019 1 commit

add weight in tree model output (#2269) · e1d7a7b9

Guolin Ke authored Jul 24, 2019

* add weight in tree model output

* fix bug

* updated Python plotting part to handle weights

06 May, 2019 1 commit
- fix a bug when bagging with reset_config (#2149) · 46d21476
  Guolin Ke authored May 06, 2019
```
* fix a bug when bagging with reset_config

* clean code
```
13 Apr, 2019 2 commits
- fixed cpplint errors about spaces and newlines (#2102) · 0a4a7a86
  Nikita Titov authored Apr 13, 2019
  
- added copyright message in files (#2101) · 32ef7603
  Nikita Titov authored Apr 13, 2019
  
11 Apr, 2019 1 commit

reworked includes in source files (#2066) · 50ce01b5

Nikita Titov authored Apr 12, 2019

* added all necessary includes - fixed build/include_what_you_use error

* fixed the order of includes (build/include_order)

04 Apr, 2019 1 commit

Add Cost Effective Gradient Boosting (#2014) · 76102284

remcob-gr authored Apr 04, 2019

* Add configuration parameters for CEGB.

* Add skeleton CEGB tree learner

Like the original CEGB version, this inherits from SerialTreeLearner.
Currently, it changes nothing from the original.

* Track features used in CEGB tree learner.

* Pull CEGB tradeoff and coupled feature penalty from config.

* Implement finding best splits for CEGB

This is heavily based on the serial version, but just adds using the coupled penalties.

* Set proper defaults for cegb parameters.

* Ensure sanity checks don't switch off CEGB.

* Implement per-data-point feature penalties in CEGB.

* Implement split penalty and remove unused parameters.

* Merge changes from CEGB tree learner into serial tree learner

* Represent features_used_in_data by a bitset, to reduce the memory overhead of CEGB, and add sanity checks for the lengths of the penalty vectors.

* Fix bug where CEGB would incorrectly penalise a previously used feature

The tree learner did not update the gains of previously computed leaf splits when splitting a leaf elsewhere in the tree.
This caused it to prefer new features due to incorrectly penalising splitting on previously used features.

* Document CEGB parameters and add them to the appropriate section.

* Remove leftover reference to cegb tree learner.

* Remove outdated diff.

* Fix warnings

* Fix minor issues identified by @StrikerRUS.

* Add docs section on CEGB, including citation.

* Fix link.

* Fix CI failure.

* Add some unit tests

* Fix pylint issues.

* Fix remaining pylint issue

02 Feb, 2019 1 commit
- cpplint whitespaces and new lines (#1986) · 90127b52
  Nikita Titov authored Feb 02, 2019
  
17 Dec, 2018 1 commit

Fix bugs in RF (#1906) · cba82447

Guolin Ke authored Dec 17, 2018

* fix RF's bugs

* fix tests

* rollback num_iterations

* fix a bug and reduce memory costs

* reduce memory cost

01 Nov, 2018 1 commit
- try to fix bug with openmp disabled (#1813) · 59f10453
  Guolin Ke authored Nov 01, 2018
  
26 Oct, 2018 1 commit
- fix problems with null json node. (#1785) · b64751bf
  Guolin Ke authored Oct 26, 2018
  
25 Aug, 2018 2 commits

add support of refit-decay (#1603) · 2db6377a

Guolin Ke authored Aug 25, 2018

* add support of refit-decay

* add refit into c_api

* add test

* update document

* Update basic.py

* Update test_engine.py

* Update basic.py

* Update test_engine.py

* fix comments

* update test

* fix the comments

* Update test_engine.py

fix num machines check in distributed case (#1611) · b1bbebaa
Ilya Matiach authored Aug 25, 2018

24 Aug, 2018 1 commit
- fix-parallel-quantile (#1605) · dcf9ad2e
  Guolin Ke authored Aug 24, 2018
```
* fix-parallel-quantile

* Update serial_tree_learner.cpp
```
16 Aug, 2018 1 commit
- fix include (#1586) · 5bee6489
  Guolin Ke authored Aug 16, 2018
```
* fix include

* reduce dependency on header file

* fix build
```
20 May, 2018 1 commit

Refine config object (#1381) · dc699574

Guolin Ke authored May 20, 2018

* [WIP] refine config

* [wip] ready for automatic code generation

* auto generate config codes

* use `with` to open file

* fix bug

* fix pylint

* fix bug

* fix pylint

* fix bugs.

* tmp for failed test.

* fix tests.

* added nthreads alias

* added new aliases from new config.h

* fixed duplicated alias

* refactored parameter_generator.py

* added new aliases from config.h and removed remaining old names

* fix bugs & some missing aliases

* added aliases

* add more descriptions.

* add comment.

11 May, 2018 1 commit
- [python] decode error description (#1362) · 899151fc
  Nikita Titov authored May 11, 2018
```
* decode error description

* added line break char in log messages
```
24 Apr, 2018 1 commit
- add force_split functionality (#1310) · 84fef715
  Jerry Liu authored Apr 24, 2018
  
18 Apr, 2018 2 commits
- max tree output (max_delta_step) (#1322) · ebf962fc
  Guolin Ke authored Apr 18, 2018
```
* first draft

* refine a branching
```
- Monotone Constraint (#1314) · e005cdb0
  Guolin Ke authored Apr 18, 2018
  
16 Jan, 2018 1 commit
- Fix objective functions with zero hessian (#1199) · 5392c9ea
  Guolin Ke authored Jan 16, 2018
  
19 Dec, 2017 1 commit

support refit model by new data (#1124) · 92f2a570

Guolin Ke authored Dec 19, 2017

* add code for refit tree

* add implementation.

* update documents.

* clean code

* fix a type

05 Dec, 2017 1 commit
- fix a warning · 41c9df69
  Guolin Ke authored Dec 05, 2017
  
04 Dec, 2017 1 commit
- fix bug in feature fraction (#1099) · 699d4381
  Guolin Ke authored Dec 05, 2017
```
* fix feature fraction

* fix bugs.
```
26 Nov, 2017 1 commit

Speed up saving and loading model (#1083) · 8a5ec366

Guolin Ke authored Nov 26, 2017

* remove protobuf

* add version number

* remove pmml script

* use float for split gain

* fix warnings

* refine the read model logic of gbdt

* fix compile error

* improve decode speed

* fix some bugs

* fix double accuracy problem

* fix bug

* multi-thread save model

* speed up save model to string

* parallel save/load model

* fix some warnings.

* fix warnings.

* fix a bug

* remove debug output

* fix doc

* fix max_bin warning in tests.

* fix max_bin warning

* fix pylint

* clean code for stringToArray

* clean code for TToString

* remove max_bin

* replace "class" with typename

11 Sep, 2017 1 commit
- less verbosity · 5543979b
  Guolin Ke authored Sep 11, 2017
  
02 Sep, 2017 1 commit
- fix tree model format (support multi-cat threshold) · ae6ff288
  Guolin Ke authored Sep 02, 2017
  
20 Aug, 2017 1 commit
- clean code for the split of bins and leaves. · 6c4a9750
  Guolin Ke authored Aug 20, 2017
  
30 Jul, 2017 1 commit

Better missing value handling (#747) · 00cb04a2

Guolin Ke authored Jul 30, 2017

* finish the data loading part

* allow prediction.

* fix bug for decision type.

* finish split finding part

* fix bugs.

* bug fixed. add a test.

* fix pep8.

* update documents.

* fix test bugs.

* fix a format

* fix import error in python test.

* disable missing value handling in categorical features.

* fix a bug.

* add more tests.

* fix pep8

* fix bugs.

* remove the missing value handling code for categorical features.

30 Jun, 2017 1 commit
- clean code for tree learner. · 82e273ba
  Guolin Ke authored Jun 30, 2017
  
17 Jun, 2017 1 commit
- reduce the cost of bagging. · 6fa51478
  Guolin Ke authored Jun 17, 2017
  
07 Jun, 2017 1 commit
- remove an unneeded check. · 3089f0bb
  Guolin Ke authored Jun 07, 2017
  
29 May, 2017 1 commit
- better reproducibility across different compilers. · 7517eefa
  Guolin Ke authored May 29, 2017
  
15 May, 2017 1 commit
- Handle for missing values (#516) · e984b0d6
  Guolin Ke authored May 15, 2017
  
17 Apr, 2017 1 commit

Revert "[WIP]faster histogram sum up" (#422) · 062bfa79

Guolin Ke authored Apr 17, 2017

* Revert "python-package: support valid_names in scikit-learn API (#420)"

This reverts commit de39dbcf.

* Revert "faster histogram sum up (#418)"

This reverts commit 98c7c2a3.

16 Apr, 2017 1 commit

faster histogram sum up (#418) · 98c7c2a3

Guolin Ke authored Apr 16, 2017

* some refactor.

* two stage sum up to reduce sum up error.

* add more two-stage sumup.

* some refactor.

* add alignment.

* change name to aligned_allocator.

* remove some useless sumup.

* fix a warning.

* add -march=native.

* remove the padding of gradients.

* no alignment.

* fix test.

* change KNumSumupGroup to 32768.

* change gcc flags.

09 Apr, 2017 1 commit

Initial GPU acceleration support for LightGBM (#368) · 0bb4a825

Huan Zhang authored Apr 09, 2017

* add dummy gpu solver code

* initial GPU code

* fix crash bug

* first working version

* use asynchronous copy

* use a better kernel for root

* parallel read histogram

* sparse features now work, but without acceleration; computed on CPU

* compute sparse feature on CPU simultaneously

* fix big bug; add gpu selection; add kernel selection

* better debugging

* clean up

* add feature scatter

* Add sparse_threshold control

* fix a bug in feature scatter

* clean up debug

* temporarily add OpenCL kernels for k=64,256

* fix up CMakeList and definition USE_GPU

* add OpenCL kernels as string literals

* Add boost.compute as a submodule

* add boost dependency into CMakeList

* fix opencl pragma

* use pinned memory for histogram

* use pinned buffer for gradients and hessians

* better debugging message

* add double precision support on GPU

* fix boost version in CMakeList

* Add a README

* reconstruct GPU initialization code for ResetTrainingData

* move data to GPU in parallel

* fix a bug during feature copy

* update gpu kernels

* update gpu code

* initial port to LightGBM v2

* speedup GPU data loading process

* Add 4-bit bin support to GPU

* re-add sparse_threshold parameter

* remove kMaxNumWorkgroups and allow an unlimited number of features

* add feature mask support for skipping unused features

* enable kernel cache

* use GPU kernels without feature masks when all features are used

* README.

* update README

* fix typos (#349)

* change default compiler to gcc on Apple

* clean vscode related file

* refine api of constructing from sampling data.

* fix bug in the last commit.

* more efficient algorithm to sample k from n.

* fix bug in filter bin

* change to boost from average output.

* fix tests.

* only stop training when all classes are finished in multi-class.

* limit the max tree output. change hessian in multi-class objective.

* robust tree model loading.

* fix test.

* convert the probabilities to raw score in boost_from_average of classification.

* fix the average label for binary classification.

* Add boost_from_average to docs (#354)

* don't use "ConvertToRawScore" for self-defined objective function.

* boost_from_average doesn't seem to work well in binary classification. remove it.

* For a better jump link (#355)

* Update Python-API.md

* for a better jump in page

A space is needed between `#` and the header content according to GitHub's Markdown format [guideline](https://guides.github.com/features/mastering-markdown/)

After adding the spaces, we can jump to the exact position in the page by clicking the link.

* fixed something mentioned by @wxchan

* Update Python-API.md

* add FitByExistingTree.

* adapt GPU tree learner for FitByExistingTree

* avoid NaN output.

* update boost.compute

* fix typos (#361)

* fix broken links (#359)

* update README

* disable GPU acceleration by default

* fix image url

* cleanup debug macro

* remove old README

* do not save sparse_threshold_ in FeatureGroup

* add details for new GPU settings

* ignore submodule when doing pep8 check

* allocate workspace for at least one thread during building Feature4

* move sparse_threshold to class Dataset

* remove duplicated code in GPUTreeLearner::Split

* Remove duplicated code in FindBestThresholds and BeforeFindBestSplit

* do not rebuild ordered gradients and hessians for sparse features

* support feature groups in GPUTreeLearner

* Initial parallel learners with GPU support

* add option device, cleanup code

* clean up FindBestThresholds; add some omp parallel

* constant hessian optimization for GPU

* Fix GPUTreeLearner crash when there are zero features

* use np.testing.assert_almost_equal() to compare lists of floats in tests

* travis for GPU

07 Apr, 2017 1 commit
- fix some light omp loops. · ddcbe71c
  Guolin Ke authored Apr 07, 2017
  
05 Apr, 2017 1 commit

improve speed of regression task. (#381) · d4c4d9ae

Guolin Ke authored Apr 05, 2017

* reduce the sumup cost of constant hessians.

* fix test.

* fix bug when weights are present.

* fix a comment.

* reduce branching.
