- 23 Mar, 2022 1 commit
-
-
shiyu1994 authored
* new cuda framework
* add histogram construction kernel
* before removing multi-gpu
* new cuda framework
* tree learner cuda kernels
* single tree framework ready
* single tree training framework
* remove comments
* boosting with cuda
* optimize for best split find
* data split
* move boosting into cuda
* parallel synchronize best split point
* merge split data kernels
* before code refactor
* use tasks instead of features as units for split finding
* refactor cuda best split finder
* fix configuration error with small leaves in data split
* skip histogram construction of too small leaf
* skip split finding of invalid leaves stop when no leaf to split
* support row wise with CUDA
* copy data for split by column
* copy data from host to CPU by column for data partition
* add synchronize best splits for one leaf from multiple blocks
* partition dense row data
* fix sync best split from task blocks
* add support for sparse row wise for CUDA
* remove useless code
* add l2 regression objective
* sparse multi value bin enabled for CUDA
* fix cuda ranking objective
* support for number of items <= 2048 per query
* speedup histogram construction by interleaving global memory access
* split optimization
* add cuda tree predictor
* remove comma
* refactor objective and score updater
* before use struct
* use structure for split information
* use structure for leaf splits
* return CUDASplitInfo directly after finding best split
* split with CUDATree directly
* use cuda row data in cuda histogram constructor
* clean src/treelearner/cuda
* gather shared cuda device functions
* put shared CUDA functions into header file
* change smaller leaf from <= back to < for consistent result with CPU
* add tree predictor
* remove useless cuda_tree_predictor
* predict on CUDA with pipeline
* add global sort algorithms
* add global argsort for queries with many items in ranking tasks
* remove limitation of maximum number of items per query in ranking
* add cuda metrics
* fix CUDA AUC
* remove debug code
* add regression metrics
* remove useless file
* don't use mask in shuffle reduce
* add more regression objectives
* fix cuda mape loss add cuda xentropy loss
* use template for different versions of BitonicArgSortDevice
* add multiclass metrics
* add ndcg metric
* fix cross entropy objectives and metrics
* fix cross entropy and ndcg metrics
* add support for customized objective in CUDA
* complete multiclass ova for CUDA
* separate cuda tree learner
* use shuffle based prefix sum
* clean up cuda_algorithms.hpp
* add copy subset on CUDA
* add bagging for CUDA
* clean up code
* copy gradients from host to device
* support bagging without using subset
* add support of bagging with subset for CUDAColumnData
* add support of bagging with subset for dense CUDARowData
* refactor copy sparse subrow
* use copy subset for column subset
* add reset train data and reset config for CUDA tree learner add deconstructors for cuda tree learner
* add USE_CUDA ifdef to cuda tree learner files
* check that dataset doesn't contain CUDA tree learner
* remove printf debug information
* use full new cuda tree learner only when using single GPU
* disable all CUDA code when using CPU version
* recover main.cpp
* add cpp files for multi value bins
* update LightGBM.vcxproj
* update LightGBM.vcxproj fix lint errors
* fix lint errors
* fix lint errors
* update Makevars fix lint errors
* fix the case with 0 feature and 0 bin fix split finding for invalid leaves create cuda column data when loaded from bin file
* fix lint errors hide GetRowWiseData when cuda is not used
* recover default device type to cpu
* fix na_as_missing case fix cuda feature meta information
* fix UpdateDataIndexToLeafIndexKernel
* create CUDA trees when needed in CUDADataPartition::UpdateTrainScore
* add refit by tree for cuda tree learner
* fix test_refit in test_engine.py
* create set of large bin partitions in CUDARowData
* add histogram construction for columns with a large number of bins
* add find best split for categorical features on CUDA
* add bitvectors for categorical split
* cuda data partition split for categorical features
* fix split tree with categorical feature
* fix categorical feature splits
* refactor cuda_data_partition.cu with multi-level templates
* refactor CUDABestSplitFinder by grouping task information into struct
* pre-allocate space for vector split_find_tasks_ in CUDABestSplitFinder
* fix misuse of reference
* remove useless changes
* add support for path smoothing
* virtual destructor for LightGBM::Tree
* fix overlapped cat threshold in best split infos
* reset histogram pointers in data partition and split finder in ResetConfig
* comment useless parameter
* fix reverse case when na is missing and default bin is zero
* fix mfb_is_na and mfb_is_zero and is_single_feature_column
* remove debug log
* fix cat_l2 when one-hot fix gradient copy when data subset is used
* switch shared histogram size according to CUDA version
* gpu_use_dp=true when cuda test
* revert modification in config.h
* fix setting of gpu_use_dp=true in .ci/test.sh
* fix linter errors
* fix linter error remove useless change
* recover main.cpp
* separate cuda_exp and cuda
* fix ci bash scripts add description for cuda_exp
* add USE_CUDA_EXP flag
* switch off USE_CUDA_EXP
* revert changes in python-packages
* more careful separation for USE_CUDA_EXP
* fix CUDARowData::DivideCUDAFeatureGroups fix set fields for cuda metadata
* revert config.h
* fix test settings for cuda experimental version
* skip some tests due to unsupported features or differences in implementation details for CUDA Experimental version
* fix lint issue by adding a blank line
* fix lint errors by resorting imports
* fix lint errors by resorting imports
* fix lint errors by resorting imports
* merge cuda.yml and cuda_exp.yml
* update python version in cuda.yml
* remove cuda_exp.yml
* remove unrelated changes
* fix compilation warnings fix cuda exp ci task name
* recover task
* use multi-level template in histogram construction check split only in debug mode
* ignore NVCC related lines in parameter_generator.py
* update job name for CUDA tests
* apply review suggestions
* Update .github/workflows/cuda.yml Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* Update .github/workflows/cuda.yml Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* update header
* remove useless TODOs
* remove [TODO(shiyu1994): constrain the split with min_data_in_group] and record in #5062
* #include <LightGBM/utils/log.h> for USE_CUDA_EXP only
* fix include order
* fix include order
* remove extra space
* address review comments
* add warning when cuda_exp is used together with deterministic
* add comment about gpu_use_dp in .ci/test.sh
* revert changing order of included headers
Co-authored-by: Yu Shi <shiyu1994@qq.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 28 Jun, 2020 1 commit
-
-
Ilya Matiach authored
* adding sparse support to TreeSHAP in lightgbm
* updating based on comments
* updated based on comments, used fromiter instead of frombuffer
* updated based on comments
* fixed limits import order
* fix sparse feature contribs to work with more than int32 max rows
* really fixed int64 max error and build warnings
* added sparse test with >int32 max rows
* fixed python side reshape check on sparse data
* updated based on latest comments
* fixed comments
* added CSC INT32_MAX validation to test, fixed comments
-
- 11 Mar, 2020 1 commit
-
-
Nikita Titov authored
* fixed cpplint errors and disable warning only for VS
* wrap more pragma warning
-
- 08 Mar, 2020 1 commit
-
-
Guolin Ke authored
* commit
* fix msvc
* fix format
-
- 02 Feb, 2020 1 commit
-
-
Guolin Ke authored
* commit
* fix a bug
* fix bug
* reset to track changes
* refine the auto choose logic
* sort the time stats output
* fix include
* change multi_val_bin_sparse_threshold
* add cmake
* add _mm_malloc and _mm_free for cross platform
* fix cmake bug
* timer for split
* try to fix cmake
* fix tests
* refactor DataPartition::Split
* fix test
* typo
* formating
* Revert "formating" This reverts commit 5b8de4f7fb9d975ee23701d276a66d40ee6d4222.
* add document
* [R-package] Added tests on use of force_col_wise and force_row_wise in training (#2719)
* naming
* fix gpu code
* Update include/LightGBM/bin.h Co-Authored-By: James Lamb <jaylamb20@gmail.com>
* Update src/treelearner/ocl/histogram16.cl
* test: swap compilers for CI
* fix omp
* not avx2
* no aligned for feature histogram
* Revert "refactor DataPartition::Split" This reverts commit 256e6d9641ade966a1f54da1752e998a1149b6f8.
* slightly refactor data partition
* reduce the memory cost
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 15 Jan, 2020 1 commit
-
-
Guolin Ke authored
* try to use _mm_prefetch anywhere
* refine
* fix bug
* remove the unneeded prefetch
-
- 13 Jan, 2020 1 commit
-
-
Guolin Ke authored
* add prefetch for dense bin
* prefetch for ordered bin
* Update meta.h
* Update meta.h
* Update dense_bin.hpp
-
- 07 Oct, 2019 1 commit
-
-
James Lamb authored
* fixed miscellaneous typos in documentation
* fix typo introduced in typo-fixing PR
-
- 30 Apr, 2019 1 commit
-
-
Nikita Titov authored
* Update meta.h
* Update json11.hpp
-
- 13 Apr, 2019 1 commit
-
-
Nikita Titov authored
-
- 11 Apr, 2019 1 commit
-
-
Nikita Titov authored
* added all necessary includes - fixed build/include_what_you_use error
* fixed the order of includes (build/include_order)
-
- 17 Dec, 2017 1 commit
-
-
Guolin Ke authored
-
- 15 Dec, 2017 2 commits
- 12 Dec, 2017 1 commit
-
-
Guolin Ke authored
-
- 29 Nov, 2017 1 commit
-
-
ww authored
-
- 18 Aug, 2017 1 commit
-
-
i3v authored
-
- 30 Jul, 2017 1 commit
-
-
Guolin Ke authored
* finish the data loading part
* allow prediction.
* fix bug for decision type.
* finish split finding part
* fix bugs.
* bug fixed. add a test.
* fix pep8.
* update documents.
* fix test bugs.
* fix a format
* fix import error in python test.
* disable missing handle in categorical features.
* fix a bug.
* add more tests.
* fix pep8
* fix bugs.
* remove the missing handle code for categorical feature.
-
- 15 May, 2017 1 commit
-
-
Guolin Ke authored
-
- 17 Apr, 2017 1 commit
-
- 16 Apr, 2017 1 commit
-
-
Guolin Ke authored
* some refactor.
* two stage sum up to reduce sum up error.
* add more two-stage sumup.
* some refactor.
* add alignment.
* change name to aligned_allocator.
* remove some useless sumup.
* fix a warning.
* add -march=native.
* remove the padding of gradients.
* no alignment.
* fix test.
* change KNumSumupGroup to 32768.
* change gcc flags.
-
- 10 Apr, 2017 1 commit
-
-
Guolin Ke authored
* refine prediction logic.
* fix test.
* fix out_len in training score of Dart.
* improve predict speed for high dimension data.
-
- 02 Dec, 2016 1 commit
-
-
wxchan authored
1. merge python-package
2. add dump model to json
3. fix bugs
4. clean code with pylint
5. update python examples
-
- 26 Nov, 2016 1 commit
-
-
Guolin Ke authored
-
- 18 Nov, 2016 1 commit
-
-
Guolin Ke authored
* RAII for utils, application and c_api (partial)
* raii for class in include folder
* raii for application and boosting
* raii for dataset and dataset loader
* raii for dense bin and parser
* RAII refactor for almost all classes
* RAII for c_api
* clean code
* refine repeated code
* Decouple the "sigmoid" between objective and boosting.
* change std::vector<bool> back to std::vector<char> due to concurrency problem
* slightly reduce some memory cost
-
- 07 Nov, 2016 1 commit
-
-
Guolin Ke authored
-
- 31 Oct, 2016 1 commit
-
-
Guolin Ke authored
-
- 08 Aug, 2016 1 commit
-
-
Guolin Ke authored
-
- 05 Aug, 2016 1 commit
-
-
Guolin Ke authored
-