Commits · de39dbcf3d74b5e894dfc297f7eeaf5eb56c9701 · tianlh / LightGBM-DCU

16 Apr, 2017 1 commit

faster histogram sum up (#418) · 98c7c2a3

Guolin Ke authored Apr 16, 2017

* some refactor.

* two stage sum up to reduce sum up error.

* add more two-stage sumup.

* some refactor.

* add alignment.

* change name to aligned_allocator.

* remove some useless sumup.

* fix a warning.

* add -march=native.

* remove the padding of gradients.

* no alignment.

* fix test.

* change KNumSumupGroup to 32768.

* change gcc flags.
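
The "two stage sum up" bullets above can be read as a blocked summation: values are first accumulated in fixed-size groups, and the group totals are then combined in a wider type. The sketch below is one plausible reading and is not taken from the LightGBM source. `TwoStageSum` is a hypothetical name, and the default group size only borrows the 32768 value mentioned in the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper illustrating two-stage summation; not LightGBM code.
// Stage 1: add up each group of `group_size` values in single precision.
// Stage 2: add the per-group partial sums into a double-precision total.
double TwoStageSum(const std::vector<float>& values,
                   std::size_t group_size = 32768) {
  assert(group_size > 0);
  double total = 0.0;
  for (std::size_t start = 0; start < values.size(); start += group_size) {
    const std::size_t end = std::min(values.size(), start + group_size);
    float partial = 0.0f;  // stays small, so little precision is lost per add
    for (std::size_t i = start; i < end; ++i) {
      partial += values[i];
    }
    total += static_cast<double>(partial);
  }
  return total;
}
```

The idea is that a single-precision running total loses low-order bits once it grows much larger than the values being added. Keeping each partial sum short bounds that loss, while the inner loop still works on `float` data.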

09 Apr, 2017 1 commit

Initial GPU acceleration support for LightGBM (#368) · 0bb4a825

Huan Zhang authored Apr 09, 2017

* add dummy gpu solver code

* initial GPU code

* fix crash bug

* first working version

* use asynchronous copy

* use a better kernel for root

* parallel read histogram

* sparse features now work, but without acceleration; they are computed on CPU

* compute sparse feature on CPU simultaneously

* fix big bug; add gpu selection; add kernel selection

* better debugging

* clean up

* add feature scatter

* Add sparse_threshold control

* fix a bug in feature scatter

* clean up debug

* temporarily add OpenCL kernels for k=64,256

* fix up CMakeList and definition USE_GPU

* add OpenCL kernels as string literals

* Add boost.compute as a submodule

* add boost dependency into CMakeList

* fix opencl pragma

* use pinned memory for histogram

* use pinned buffer for gradients and hessians

* better debugging message

* add double precision support on GPU

* fix boost version in CMakeList

* Add a README

* reconstruct GPU initialization code for ResetTrainingData

* move data to GPU in parallel

* fix a bug during feature copy

* update gpu kernels

* update gpu code

* initial port to LightGBM v2

* speedup GPU data loading process

* Add 4-bit bin support to GPU

* re-add sparse_threshold parameter

* remove kMaxNumWorkgroups and allows an unlimited number of features

* add feature mask support for skipping unused features

* enable kernel cache

* use GPU kernels without feature masks when all features are used

* README.

* update README

* fix typos (#349)

* change compile to gcc on Apple as default

* clean vscode related file

* refine api of constructing from sampling data.

* fix bug in the last commit.

* more efficient algorithm to sample k from n.

* fix bug in filter bin

* change to boost from average output.

* fix tests.

* only stop training when all classes are finished in multi-class.

* limit the max tree output. change hessian in multi-class objective.

* robust tree model loading.

* fix test.

* convert the probabilities to raw score in boost_from_average of classification.

* fix the average label for binary classification.

* Add boost_from_average to docs (#354)

* don't use "ConvertToRawScore" for self-defined objective function.

* boost_from_average doesn't seem to work well in binary classification. remove it.

* For a better jump link (#355)

* Update Python-API.md

* for a better jump in page

A space is needed between `#` and the header's content according to GitHub's markdown format [guideline](https://guides.github.com/features/mastering-markdown/)

After adding the spaces, we can jump to the exact position in the page by clicking the link.

* fixed something mentioned by @wxchan

* Update Python-API.md

* add FitByExistingTree.

* adapt GPU tree learner for FitByExistingTree

* avoid NaN output.

* update boost.compute

* fix typos (#361)

* fix broken links (#359)

* update README

* disable GPU acceleration by default

* fix image url

* cleanup debug macro

* remove old README

* do not save sparse_threshold_ in FeatureGroup

* add details for new GPU settings

* ignore submodule when doing pep8 check

* allocate workspace for at least one thread during building Feature4

* move sparse_threshold to class Dataset

* remove duplicated code in GPUTreeLearner::Split

* Remove duplicated code in FindBestThresholds and BeforeFindBestSplit

* do not rebuild ordered gradients and hessians for sparse features

* support feature groups in GPUTreeLearner

* Initial parallel learners with GPU support

* add option device, cleanup code

* clean up FindBestThresholds; add some omp parallel

* constant hessian optimization for GPU

* Fix GPUTreeLearner crash when there is zero feature

* use np.testing.assert_almost_equal() to compare lists of floats in tests

* travis for GPU
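
The bullet "more efficient algorithm to sample k from n" in the list above does not say which algorithm was adopted. As an illustration only, the sketch below uses selection sampling, a well-known single-pass method that returns k distinct indices in increasing order. `SampleKFromN` is a hypothetical name and is not copied from LightGBM.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (selection sampling); not LightGBM code.
// Walks the indices 0..n-1 once and keeps index i with probability
// (items still needed) / (items still remaining), which yields a
// uniformly random subset of size k without shuffling all n items.
std::vector<std::size_t> SampleKFromN(std::size_t n, std::size_t k,
                                      std::mt19937* rng) {
  if (k > n) k = n;
  std::vector<std::size_t> picked;
  picked.reserve(k);
  for (std::size_t i = 0; i < n && picked.size() < k; ++i) {
    const std::size_t needed = k - picked.size();
    const std::size_t remaining = n - i;
    std::uniform_int_distribution<std::size_t> dist(0, remaining - 1);
    if (dist(*rng) < needed) {
      picked.push_back(i);
    }
  }
  return picked;
}
```

Because the selection probability reaches 1 once the number of remaining indices equals the number still needed, the function always returns exactly k indices when k is at most n. It uses O(k) extra memory and needs no set lookups to reject duplicates.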

05 Apr, 2017 1 commit

improve speed of regression task. (#381) · d4c4d9ae

Guolin Ke authored Apr 05, 2017

* reduce the sumup cost of constant hessians.

* fix test.

* fix bug when weights are present.

* fix a comment.

* reduce branching.

25 Mar, 2017 1 commit
- add FitByExistingTree. · 8a6bd5ec
  Guolin Ke authored Mar 25, 2017
  
22 Jan, 2017 1 commit
- use subset to speed up bagging · c8fbd42b
  Guolin Ke authored Jan 22, 2017
  
10 Jan, 2017 1 commit
- change inner prediction score to double type. · 12a96334
  Guolin Ke authored Jan 10, 2017
  
09 Jan, 2017 1 commit
- use std::string for tree_learner_type. · a6f47d00
  Guolin Ke authored Jan 09, 2017
  
18 Dec, 2016 1 commit
- refine reset_parameters logic · c2e94f17
  Guolin Ke authored Dec 18, 2016
  
18 Nov, 2016 1 commit

Refactor for RAII (#86) · 5442ed78

Guolin Ke authored Nov 18, 2016

* RAII for utils, application and c_api (partial)

* raii for class in include folder

* raii for application and boosting

* raii for dataset and dataset loader

* raii for dense bin and parser

* RAII refactor for almost all classes

* RAII for c_api

* clean code

* refine repeated code

* Decouple the "sigmoid" between objective and boosting.

* change std::vector<bool> back to std::vector<char> due to concurrency problem

* slightly reduce memory cost
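
The bullet about switching from `std::vector<bool>` back to `std::vector<char>` reflects a general C++ rule: `std::vector<bool>` may pack its elements into bits, so writing to different elements from different threads can be a data race, whereas distinct `char` elements are separate memory locations. The commit message does not say which loop was affected, so the sketch below is only an assumed example of the pattern. `MarkPositive` is a hypothetical name, and the OpenMP pragma is ignored (the loop runs serially) if OpenMP is not enabled at compile time.

```cpp
#include <vector>

// Hypothetical helper; not LightGBM code.
// Each iteration writes a different element of `flags`. With
// std::vector<char> these writes touch distinct memory locations, so the
// loop can safely run in parallel. With std::vector<bool>, neighbouring
// elements may share a byte and the same loop would be a data race.
std::vector<char> MarkPositive(const std::vector<double>& values) {
  std::vector<char> flags(values.size(), 0);
  const int n = static_cast<int>(values.size());
  #pragma omp parallel for
  for (int i = 0; i < n; ++i) {
    flags[i] = values[i] > 0.0 ? 1 : 0;
  }
  return flags;
}
```

The cost of this choice is memory: one byte per flag rather than one bit.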

19 Oct, 2016 1 commit
- updating comments for easier reading · 39e47323
  Qiwei Ye authored Oct 19, 2016
  
08 Oct, 2016 1 commit
- Update tree_learner.h · 47feda49
  Guolin Ke authored Oct 08, 2016
  
08 Aug, 2016 1 commit
- some warning fix · 68bd9ab9
  Guolin Ke authored Aug 08, 2016
  
05 Aug, 2016 1 commit
- first commit · 1c774687
  Guolin Ke authored Aug 05, 2016
  