- 24 Jul, 2025 1 commit
Jeff Daily authored
[ROCm] add support for ROCm/HIP
* CMakeLists.txt ROCm updates; replace glob with explicit file list
* initial warpSize interop changes
* add helpers/hipify.sh script
* .gitignore generated hip source files
* more rocm updates: disable compiler warnings; move PercentileDevice __device__ template function into header; bug fixes for __host__ __define__ and __HIP__ preprocessor symbols
* more bug fixes
* warp 32 vs 64 updates
* lint fixes
* add missing device_index variable
* remove accidental inclusion of hip headers
* copyright notice compliance
* Update CMakeLists.txt
* fix lint issue
* clean up
* clean up CMakeLists.txt; use WARPSIZE
* fix shared buffer size
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Yu Shi <yushi2@microsoft.com>
-
- 07 Feb, 2025 1 commit
James Lamb authored
-
- 02 Jan, 2025 1 commit
shiyu1994 authored
* remove src/treelearner/kernels
* Update CMakeLists.txt
* clean up
-
- 15 Dec, 2024 1 commit
Nikita Titov authored
* multiple updates to append-comment.sh, static_analysis.yml, basic.py, .pre-commit-config.yaml, pyproject.toml, interactive_plot_example.ipynb, test_basic.R, rank_objective.hpp, and histogram_16_64_256.cu
* ensure alphabetical order of rules
-
- 11 Dec, 2024 1 commit
Murphy Liang authored
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
-
- 01 Dec, 2024 1 commit
Oliver Borchert authored
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 18 Oct, 2024 1 commit
dragonbra authored
* basic gpu_linear_tree_learner implementation
* corresponding config of gpu linear tree
* Update src/io/config.cpp
* workaround for gpu linear tree learner without gpu enabled
* add #endif
* add #ifdef USE_GPU
* fix lint problems
* fix compilation when USE_GPU is OFF
* add destructor
* add gpu_linear_tree_learner.cpp in make file list
* use template for linear tree learner
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
-
- 13 Oct, 2024 1 commit
Atanas Dimitrov authored
Co-authored-by: Atanas Dimitrov <nasko119@abv.bg>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
-
- 19 Mar, 2024 1 commit
James Lamb authored
-
- 23 Feb, 2024 1 commit
shiyu1994 authored
* support quantized training with categorical features on cpu
* remove white spaces
* add tests for quantized training with categorical features
* skip tests for cuda version
* fix cases when only 1 data block in row-wise quantized histogram construction with 8 inner bits
* remove useless capture
* fix compilation warnings; revert useless changes
* revert useless change
* separate functions in feature histogram into cpp file
* add feature_histogram.o in Makevars
-
- 20 Feb, 2024 1 commit
CVPaul authored
* solve 'bin size 257 cannot run on GPU' (#3339, https://github.com/microsoft/LightGBM/issues/3339#issuecomment-1665131743)
* fix typo: LeafIndex -> leaf_index
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
-
- 17 Jan, 2024 1 commit
-
-
James Lamb authored
-
- 22 Nov, 2023 1 commit
James Lamb authored
-
- 10 Oct, 2023 1 commit
James Lamb authored
-
- 09 Oct, 2023 1 commit
James Lamb authored
factor out uses of omp_get_num_threads() and omp_get_max_threads() outside of OpenMP wrapper (#6133)
-
- 08 Oct, 2023 1 commit
shiyu1994 authored
* add quantized training (first stage)
* add histogram construction functions for integer gradients
* add stochastic rounding
* update docs
* fix compilation errors by adding template instantiations
* update files for compilation
* fix compilation of gpu version
* initialize gradient discretizer before share states
* add a test case for quantized training
* add quantized training for data distributed training
* Delete origin.pred
* Delete ifelse.pred
* Delete LightGBM_model.txt
* remove useless changes
* fix lint error
* remove debug loggings
* fix mismatch of vector and allocator types
* remove changes in main.cpp
* fix bugs with uninitialized gradient discretizer
* initialize ordered gradients in gradient discretizer
* disable quantized training with gpu and cuda; fix msvc compilation errors and warnings
* fix bug in data parallel tree learner
* make quantized training test deterministic
* make quantized training in test case more accurate
* refactor test_quantized_training
* fix leaf splits initialization with quantized training
* check distributed quantized training result
* add cuda gradient discretizer
* add quantized training for CUDA version in tree learner
* remove cuda compute capabilities 6.1 and 6.2
* fix parts of gpu quantized training errors and warnings
* fix build-python.sh to install locally built version
* fix memory access bugs
* fix lint errors
* mark cuda quantized training with categorical features as unsupported
* rename cuda_utils.h to cuda_utils.hu
* enable quantized training with cuda
* fix cuda quantized training with sparse row data
* allow using global memory buffer in histogram construction with cuda quantized training
* recover build-python.sh
* enlarge allowed package size to 100M
-
- 12 Sep, 2023 1 commit
shiyu1994 authored
* fix leaf splits update after split in quantized training
* fix preparation of ordered gradients for quantized training
* remove force_row_wise in distributed test for quantized training
* Update src/treelearner/leaf_splits.hpp
Co-authored-by: James Lamb <jaylamb20@gmail.com>
-
- 12 Jul, 2023 1 commit
shiyu1994 authored
-
- 30 Jun, 2023 1 commit
maskedcoder1337 authored
-
- 05 May, 2023 1 commit
shiyu1994 authored
* add quantized training (first stage)
* add histogram construction functions for integer gradients
* add stochastic rounding
* update docs
* fix compilation errors by adding template instantiations
* update files for compilation
* fix compilation of gpu version
* initialize gradient discretizer before share states
* add a test case for quantized training
* add quantized training for data distributed training
* Delete origin.pred
* Delete ifelse.pred
* Delete LightGBM_model.txt
* remove useless changes
* fix lint error
* remove debug loggings
* fix mismatch of vector and allocator types
* remove changes in main.cpp
* fix bugs with uninitialized gradient discretizer
* initialize ordered gradients in gradient discretizer
* disable quantized training with gpu and cuda; fix msvc compilation errors and warnings
* fix bug in data parallel tree learner
* make quantized training test deterministic
* make quantized training in test case more accurate
* refactor test_quantized_training
* fix leaf splits initialization with quantized training
* check distributed quantized training result
-
- 15 Mar, 2023 1 commit
Aleksandar Bojarov authored
Fix for DEBUG mode; fixes issue #5777
-
- 01 Feb, 2023 1 commit
James Lamb authored
* [ci] speed up if-else, swig, and lint conda setup
* add 'source activate'
* python constraint
* start removing cuda v1
* comment out CI
* remove more references
* revert some unnecessary changes
* revert a few more mistakes
* revert another change that ignored params
* sigh
* remove CUDATreeLearner
* fix tests, docs
* fix quoting in setup.py
* restore all CI
* Apply suggestions from code review
* completely remove cuda_exp, update docs
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
-
- 11 Sep, 2022 1 commit
Ilya Chernov authored
remove redundant whitespaces
-
- 07 Sep, 2022 1 commit
shiyu1994 authored
* add feature interaction constraint for cuda_exp
* test feature interaction constraints for cuda_exp
* remove useless check
* update comment
-
- 02 Sep, 2022 1 commit
shiyu1994 authored
* add huber regression for cuda_exp
* renew tree output on GPU; add test cases for regression objectives
* remove useless changes
* add white space
* fix test_regression
-
- 29 Aug, 2022 1 commit
shiyu1994 authored
* fix cuda_exp ci
* fix ci failures introduced by #5279
* clean up cuda.yml
* fix test.sh
* clean up test.sh
* skip lines by cuda_exp in test_register_logger
* Update tests/python_package_test/test_utilities.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 03 Aug, 2022 1 commit
Nikita Titov authored
* Fix potential overflow in linear trees
* simplify
Co-authored-by: James Lamb <jaylamb20@gmail.com>
-
- 29 Jul, 2022 2 commits
Belinda Trotta authored
-
shiyu1994 authored
* initial work for boosting and evaluation with CUDA
* fix compatibility with CPU code
* fix creating objective without USE_CUDA_EXP
* fix static analysis errors
* fix static analysis errors
-
- 08 Jun, 2022 1 commit
shiyu1994 authored
Clear split info buffer in cost efficient gradient boosting before every iteration (partially fixes #3679) (#5164)
* clear split info buffer in cegb_ before every iteration
* check nullability of cegb_ in serial_tree_learner.cpp
* add a test case for checking the split buffer in CEGB
* switch to Threading::For instead of raw OpenMP
* apply review suggestions
* apply review comments
* remove device cpu
-
- 26 Apr, 2022 1 commit
shiyu1994 authored
-
- 24 Apr, 2022 1 commit
James Lamb authored
-
- 30 Mar, 2022 1 commit
shiyu1994 authored
* fix cuda_exp with dense row-wise
* disable usage of multi val group in cuda_exp
-
- 27 Mar, 2022 1 commit
shiyu1994 authored
* log warnings when the number of bins of categorical features exceeds the configured maximum number of bins
* log only one warning for all categorical features
* add #include <memory> for unique_ptr
* remove useless param description
-
- 23 Mar, 2022 1 commit
shiyu1994 authored
* new cuda framework
* add histogram construction kernel
* before removing multi-gpu
* new cuda framework
* tree learner cuda kernels
* single tree framework ready
* single tree training framework
* remove comments
* boosting with cuda
* optimize for best split find
* data split
* move boosting into cuda
* parallel synchronize best split point
* merge split data kernels
* before code refactor
* use tasks instead of features as units for split finding
* refactor cuda best split finder
* fix configuration error with small leaves in data split
* skip histogram construction of too small leaf
* skip split finding of invalid leaves; stop when no leaf to split
* support row wise with CUDA
* copy data for split by column
* copy data from host to CPU by column for data partition
* add synchronize best splits for one leaf from multiple blocks
* partition dense row data
* fix sync best split from task blocks
* add support for sparse row wise for CUDA
* remove useless code
* add l2 regression objective
* sparse multi value bin enabled for CUDA
* fix cuda ranking objective
* support for number of items <= 2048 per query
* speed up histogram construction by interleaving global memory access
* split optimization
* add cuda tree predictor
* remove comma
* refactor objective and score updater
* before use struct
* use structure for split information
* use structure for leaf splits
* return CUDASplitInfo directly after finding best split
* split with CUDATree directly
* use cuda row data in cuda histogram constructor
* clean src/treelearner/cuda
* gather shared cuda device functions
* put shared CUDA functions into header file
* change smaller leaf from <= back to < for consistent result with CPU
* add tree predictor
* remove useless cuda_tree_predictor
* predict on CUDA with pipeline
* add global sort algorithms
* add global argsort for queries with many items in ranking tasks
* remove limitation of maximum number of items per query in ranking
* add cuda metrics
* fix CUDA AUC
* remove debug code
* add regression metrics
* remove useless file
* don't use mask in shuffle reduce
* add more regression objectives
* fix cuda mape loss; add cuda xentropy loss
* use template for different versions of BitonicArgSortDevice
* add multiclass metrics
* add ndcg metric
* fix cross entropy objectives and metrics
* fix cross entropy and ndcg metrics
* add support for customized objective in CUDA
* complete multiclass ova for CUDA
* separate cuda tree learner
* use shuffle based prefix sum
* clean up cuda_algorithms.hpp
* add copy subset on CUDA
* add bagging for CUDA
* clean up code
* copy gradients from host to device
* support bagging without using subset
* add support of bagging with subset for CUDAColumnData
* add support of bagging with subset for dense CUDARowData
* refactor copy sparse subrow
* use copy subset for column subset
* add reset train data and reset config for CUDA tree learner; add destructors for cuda tree learner
* add USE_CUDA ifdef to cuda tree learner files
* check that dataset doesn't contain CUDA tree learner
* remove printf debug information
* use full new cuda tree learner only when using single GPU
* disable all CUDA code when using CPU version
* recover main.cpp
* add cpp files for multi value bins
* update LightGBM.vcxproj
* update LightGBM.vcxproj; fix lint errors
* fix lint errors
* update Makevars; fix lint errors
* fix the case with 0 feature and 0 bin; fix split finding for invalid leaves; create cuda column data when loaded from bin file
* fix lint errors; hide GetRowWiseData when cuda is not used
* recover default device type to cpu
* fix na_as_missing case; fix cuda feature meta information
* fix UpdateDataIndexToLeafIndexKernel
* create CUDA trees when needed in CUDADataPartition::UpdateTrainScore
* add refit by tree for cuda tree learner
* fix test_refit in test_engine.py
* create set of large bin partitions in CUDARowData
* add histogram construction for columns with a large number of bins
* add find best split for categorical features on CUDA
* add bitvectors for categorical split
* cuda data partition split for categorical features
* fix split tree with categorical feature
* fix categorical feature splits
* refactor cuda_data_partition.cu with multi-level templates
* refactor CUDABestSplitFinder by grouping task information into struct
* pre-allocate space for vector split_find_tasks_ in CUDABestSplitFinder
* fix misuse of reference
* remove useless changes
* add support for path smoothing
* virtual destructor for LightGBM::Tree
* fix overlapped cat threshold in best split infos
* reset histogram pointers in data partition and split finder in ResetConfig
* comment useless parameter
* fix reverse case when na is missing and default bin is zero
* fix mfb_is_na and mfb_is_zero and is_single_feature_column
* remove debug log
* fix cat_l2 when one-hot; fix gradient copy when data subset is used
* switch shared histogram size according to CUDA version
* gpu_use_dp=true when cuda test
* revert modification in config.h
* fix setting of gpu_use_dp=true in .ci/test.sh
* fix linter errors
* fix linter error; remove useless change
* recover main.cpp
* separate cuda_exp and cuda
* fix ci bash scripts; add description for cuda_exp
* add USE_CUDA_EXP flag
* switch off USE_CUDA_EXP
* revert changes in python-packages
* more careful separation for USE_CUDA_EXP
* fix CUDARowData::DivideCUDAFeatureGroups; fix set fields for cuda metadata
* revert config.h
* fix test settings for cuda experimental version
* skip some tests due to unsupported features or differences in implementation details for CUDA Experimental version
* fix lint issue by adding a blank line
* fix lint errors by resorting imports
* merge cuda.yml and cuda_exp.yml
* update python version in cuda.yml
* remove cuda_exp.yml
* remove unrelated changes
* fix compilation warnings; fix cuda exp ci task name
* recover task
* use multi-level template in histogram construction; check split only in debug mode
* ignore NVCC related lines in parameter_generator.py
* update job name for CUDA tests
* apply review suggestions
* Update .github/workflows/cuda.yml
* update header
* remove useless TODOs
* remove [TODO(shiyu1994): constrain the split with min_data_in_group] and record in #5062
* #include <LightGBM/utils/log.h> for USE_CUDA_EXP only
* fix include order
* remove extra space
* address review comments
* add warning when cuda_exp is used together with deterministic
* add comment about gpu_use_dp in .ci/test.sh
* revert changing order of included headers
Co-authored-by: Yu Shi <shiyu1994@qq.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 20 Feb, 2022 1 commit
Dzianis Dus authored
* CUDATreeLearner: free GPU memory in destructor if any allocated
* Minor changes: checking for num_gpu_feature_groups is not needed
* Trigger CI again
-
- 08 Jan, 2022 1 commit
文佳鹏 authored
-
- 10 Nov, 2021 1 commit
tongwu-msft authored
* fix issue #4601
* fix issue #4601 it2
* add tests for issue #4601
* fix warning
* fix warning
* add new line at end
* remove last line at end
* fix lint warning
* address comments
* address comments
* address comments
* fix address
* address comments
* revert seed
* fix recursive force split issue
* fix build error
* fix lint warning
-
- 23 Sep, 2021 1 commit
James Lamb authored
* fix incorrect behavior of SplitInfo == operator for splits with identical gains
* update LightSplitInfo too, and improve comment
* don't check features unnecessarily
* update LightSplitInfo too
-
- 28 Jun, 2021 1 commit
Robin Dong authored
-