Commits · 548cec829e232ac388154758b59ca696b2147a08 · tianlh / LightGBM-DCU

17 Oct, 2025 1 commit
- [ci][c++] fixed `build/header_guard` errors from cpplint (#7055) · c5c60c8f
  Nikita Titov authored Oct 17, 2025
  
  c5c60c8f
13 Oct, 2025 1 commit

[c++][ci] fix `include_what_you_use` errors from cpplint (#7052) · 9c0e457c

Nikita Titov authored Oct 13, 2025



* dev

* dev

* dev

* Update static_analysis.yml

* Update .pre-commit-config.yaml

---------
Co-authored-by: James Lamb <jaylamb20@gmail.com>

9c0e457c

23 Sep, 2025 1 commit

partial revert of · 20996c92

Jeff Daily authored Sep 23, 2025

Instead of replacing all #ifdef USE_CUDA, just add USE_CUDA define to ROCm build.

20996c92

22 Sep, 2025 1 commit

[ROCm] re-add support for ROCm builds · 61ec4f1a

Jeff Daily authored Sep 22, 2025

Previously #6086 added ROCm support but after numerous rebases it lost
critical changes. This PR restores the ROCm build.

There are many source file changes but most were automated using the
following:

```bash
for f in `grep -rl '#ifdef USE_CUDA'`
do
    sed -i 's@#ifdef USE_CUDA@#if defined(USE_CUDA) || defined(USE_ROCM)@g' $f
done

for f in `grep -rl '#endif  // USE_CUDA'`
do
    sed -i 's@#endif  // USE_CUDA@#endif  // USE_CUDA || USE_ROCM@g' $f
done
```

61ec4f1a

24 Aug, 2025 1 commit

[ci] [c++] use 'pre-commit' to run 'cpplint', upgrade to 'cpplint' 2.0.2 (#7002) · 0df10887

James Lamb authored Aug 24, 2025

* [ci] [c++] use 'pre-commit' to run 'cpplint', upgrade to 'cpplint' 2.0.2

* remove bashisms

* one more pipefail use

* another pipefail

0df10887

20 Jan, 2025 1 commit

[c++] Fixed Predictor lifecycle and trees initialization in Contrib mode (#6778) · 3654ecaa

AndreyOrb authored Jan 20, 2025



* 1) Fixed Predictor lifecycle
2) Fixed Boosting trees initialization

#5482

* Added tests for LGBM_BoosterPredictForMat in Contrib mode

* #6778 Reverted indentation to 4 spaces

---------
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

3654ecaa

01 Dec, 2024 1 commit
- [ci] Introduce `typos` pre-commit hook (#6564) · 784f3841
  Oliver Borchert authored Dec 01, 2024
```
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
```
  784f3841
13 Oct, 2024 1 commit
- [c++] Fix `dump_model()` information for root node (#6569) · bbeecc09
  Atanas Dimitrov authored Oct 13, 2024
```
Co-authored-by: Atanas Dimitrov <nasko119@abv.bg>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
```
  bbeecc09
02 Oct, 2024 1 commit

[c++] Add Bagging by Query for Lambdarank (#6623) · d1d218c3

shiyu1994 authored Oct 03, 2024



* add bagging by query for lambdarank

* fix pre-commit

* fix bagging by query with cuda

* fix bagging by query test case

* fix bagging by query test case

* fix bagging by query test case

* add #include <vector>

* Update include/LightGBM/objective_function.h
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update tests/python_package_test/test_engine.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update tests/python_package_test/test_engine.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

---------
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

d1d218c3

10 Jul, 2024 1 commit
- [c++] Avoid copy on Refit (#6478) · 1886bf51
  Christian Bourjau authored Jul 10, 2024
  
  1886bf51
01 May, 2024 1 commit

[python-package] Allow to pass early stopping min delta in params (#6274) · 9f5fbb64

Oliver Borchert authored May 01, 2024



* [python-package] Allow to pass early stopping min delta in params

* Fix test

* Add separate test

* Fix

* Add to cpp config

* Adjust test

* Adjust test

* Debug

* Revert

* Apply suggestions from code review

---------
Co-authored-by: James Lamb <jaylamb20@gmail.com>

9f5fbb64

20 Apr, 2024 1 commit
- [docs] document how to use a custom objective function via the C API (#6396) · 89824a6c
  Parsiad Azimzadeh authored Apr 20, 2024
  
  89824a6c
10 Oct, 2023 1 commit
- set explicit number of threads in every OpenMP `parallel` region (#6135) · 8ed371ce
  James Lamb authored Oct 09, 2023
  
  8ed371ce
08 Oct, 2023 1 commit

[CUDA] CUDA Quantized Training (fixes #5606) (#5933) · f901f471

shiyu1994 authored Oct 08, 2023

* add quantized training (first stage)

* add histogram construction functions for integer gradients

* add stochastic rounding

* update docs

* fix compilation errors by adding template instantiations

* update files for compilation

* fix compilation of gpu version

* initialize gradient discretizer before share states

* add a test case for quantized training

* add quantized training for data distributed training

* Delete origin.pred

* Delete ifelse.pred

* Delete LightGBM_model.txt

* remove useless changes

* fix lint error

* remove debug loggings

* fix mismatch of vector and allocator types

* remove changes in main.cpp

* fix bugs with uninitialized gradient discretizer

* initialize ordered gradients in gradient discretizer

* disable quantized training with gpu and cuda

fix msvc compilation errors and warnings

* fix bug in data parallel tree learner

* make quantized training test deterministic

* make quantized training in test case more accurate

* refactor test_quantized_training

* fix leaf splits initialization with quantized training

* check distributed quantized training result

* add cuda gradient discretizer

* add quantized training for CUDA version in tree learner

* remove cuda computability 6.1 and 6.2

* fix parts of gpu quantized training errors and warnings

* fix build-python.sh to install locally built version

* fix memory access bugs

* fix lint errors

* mark cuda quantized training on cuda with categorical features as unsupported

* rename cuda_utils.h to cuda_utils.hu

* enable quantized training with cuda

* fix cuda quantized training with sparse row data

* allow using global memory buffer in histogram construction with cuda quantized training

* recover build-python.sh

enlarge allowed package size to 100M

f901f471

07 Oct, 2023 1 commit
- ignore unknown parameters when loading from model file (#6126) · b793cd82
  José Morales authored Oct 06, 2023
  
  b793cd82
05 Sep, 2023 1 commit
- Fix updates in random forest model using GOSS data sample strategy (#6017) · 82033064
  mjmckp authored Sep 05, 2023
  
  82033064
30 Jun, 2023 1 commit
- move LightGBM-vendored json11 into a LightGBM-specific namespace (fixes #5944) (#5946) · 9f78ccee
  maskedcoder1337 authored Jun 30, 2023
  
  9f78ccee
01 Feb, 2023 1 commit

[CUDA] consolidate CUDA versions (#5677) · 4f47547c

James Lamb authored Jan 31, 2023



* [ci] speed up if-else, swig, and lint conda setup

* add 'source activate'

* python constraint

* start removing cuda v1

* comment out CI

* remove more references

* revert some unnecessaary changes

* revert a few more mistakes

* revert another change that ignored params

* sigh

* remove CUDATreeLearner

* fix tests, docs

* fix quoting in setup.py

* restore all CI

* Apply suggestions from code review
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

* Apply suggestions from code review

* completely remove cuda_exp, update docs

---------
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

4f47547c

30 Dec, 2022 1 commit
- Check feature indexes in forced split file (fixes #5517) (#5653) · f84bfcf9
  Belinda Trotta authored Dec 30, 2022
  
  f84bfcf9
28 Dec, 2022 1 commit

Decouple Boosting Types (fixes #3128) (#4827) · fffd066c

Yifei Liu authored Dec 28, 2022



* add parameter data_sample_strategy

* abstract GOSS as a sample strategy(GOSS1), togetherwith origial GOSS (Normal Bagging has not been abstracted, so do NOT use it now)

* abstract Bagging as a subclass (BAGGING), but original Bagging members in GBDT are still kept

* fix some variables

* remove GOSS(as boost) and Bagging logic in GBDT

* rename GOSS1 to GOSS(as sample strategy)

* add warning about use GOSS as boosting_type

* a little ; bug

* remove CHECK when "gradients != nullptr"

* rename DataSampleStrategy to avoid confusion

* remove and add some ccomments, followingconvention

* fix bug about GBDT::ResetConfig (ObjectiveFunction inconsistencty bet…

* add std::ignore to avoid compiler warnings (anpotential fails)

* update Makevars and vcxproj

* handle constant hessian

move resize of gradient vectors out of sample strategy

* mark override for IsHessianChange

* fix lint errors

* rerun parameter_generator.py

* update config_auto.cpp

* delete redundant blank line

* update num_data_ when train_data_ is updated

set gradients and hessians when GOSS

* check bagging_freq is not zero

* reset config_ value

merge ResetBaggingConfig and ResetGOSS

* remove useless check

* add ttests in test_engine.py

* remove whitespace in blank line

* remove arguments verbose_eval and evals_result

* Update tests/python_package_test/test_engine.py

reduce num_boost_round
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update src/boosting/sample_strategy.cpp

modify warning about setting goss as `boosting_type`
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

replace load_boston() with make_regression()

remove value checks of mean_squared_error in test_sample_strategy_with_boosting()

* Update tests/python_package_test/test_engine.py

add value checks of mean_squared_error in test_sample_strategy_with_boosting()

* Modify warnning about using goss as boosting type

* Update tests/python_package_test/test_engine.py

add random_state=42 for make_regression()

reduce the threshold of mean_square_error

* Update src/boosting/sample_strategy.cpp
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* remove goss from boosting types in documentation

* Update src/boosting/bagging.hpp
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update src/boosting/bagging.hpp
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update src/boosting/goss.hpp
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update src/boosting/goss.hpp
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* rename GOSS with GOSSStrategy

* update doc

* address comments

* fix table in doc

* Update include/LightGBM/config.h
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* update documentation

* update test case

* revert useless change in test_engine.py

* add tests for evaluation results in test_sample_strategy_with_boosting

* include <string>

* change to assert_allclose in test_goss_boosting_and_strategy_equivalent

* more tolerance in result checking, due to minor difference in results of gpu versions

* change == to np.testing.assert_allclose

* fix test case

* set gpu_use_dp to true

* change --report to --report-level for rstcheck

* use gpu_use_dp=true in test_goss_boosting_and_strategy_equivalent

* revert unexpected changes of non-ascii characters

* revert unexpected changes of non-ascii characters

* remove useless changes

* allocate gradients_pointer_ and hessians_pointer when necessary

* add spaces

* remove redundant virtual

* include <LightGBM/utils/log.h> for USE_CUDA

* check for  in test_goss_boosting_and_strategy_equivalent

* check for identity in test_sample_strategy_with_boosting

* remove cuda  option in test_sample_strategy_with_boosting

* Update tests/python_package_test/test_engine.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update tests/python_package_test/test_engine.py
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* ResetGradientBuffers after ResetSampleConfig

* ResetGradientBuffers after ResetSampleConfig

* ResetGradientBuffers after bagging

* remove useless code

* check objective_function_ instead of gradients

* enable rf with goss

simplify params in test cases

* remove useless changes

* allow rf with feature subsampling alone

* change position of ResetGradientBuffers

* check for dask

* add parameter types for data_sample_strategy
Co-authored-by: Guangda Liu <v-guangdaliu@microsoft.com>
Co-authored-by: Yu Shi <shiyu_k1994@qq.com>
Co-authored-by: GuangdaLiu <90019144+GuangdaLiu@users.noreply.github.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

fffd066c

11 Oct, 2022 1 commit
- [python-package][R-package] load parameters from model file (fixes #2613) (#5424) · 8b720844
  José Morales authored Oct 11, 2022
  
  8b720844
06 Sep, 2022 1 commit
- fix references to 'object function' (#5468) · f0e32962
  James Lamb authored Sep 05, 2022
  
  f0e32962
02 Sep, 2022 1 commit

[CUDA] Add Huber regression objective for cuda_exp (#5462) · 45c53f78

shiyu1994 authored Sep 02, 2022

* add huber regression for cuda_exp

* renew tree output on GPU

add test cases for regression objectives

* remove useless changes

* add white space

* fix test_regression

45c53f78

31 Aug, 2022 1 commit

[CUDA] Add binary objective for cuda_exp (#5425) · 2b8fe8b4

shiyu1994 authored Aug 31, 2022

* add binary objective for cuda_exp

* include <string> and <vector>

* exchange include ordering

* fix length of score to copy in evaluation

* fix EvalOneMetric

* fix cuda binary objective and prediction when boosting on gpu

* Add white space

* fix BoostFromScore for CUDABinaryLogloss

update log in test_register_logger

* include <algorithm>

* simplify shared memory buffer

2b8fe8b4

29 Aug, 2022 1 commit

[ci][fix] Fix cuda_exp ci (#5438) · be7f3213

shiyu1994 authored Aug 29, 2022



* fix cuda_exp ci

* fix ci failures introduced by #5279

* cleanup cuda.yml

* fix test.sh

* clean up test.sh

* clean up test.sh

* skip lines by cuda_exp in test_register_logger

* Update tests/python_package_test/test_utilities.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

be7f3213

20 Aug, 2022 1 commit
- [fix] change the destructor of ScoreUpdater to virtual (fixes #5400) (#5403) · 865c126a
  shiyu1994 authored Aug 20, 2022
```
change the destructor of ScoreUpdater to virtual
```
  865c126a
03 Aug, 2022 1 commit

Minor CUDA cleanup (#5394) · c7102e56

Nikita Titov authored Aug 03, 2022



* Update README.rst

* Update cuda_score_updater.cu
Co-authored-by: James Lamb <jaylamb20@gmail.com>

c7102e56

29 Jul, 2022 1 commit

[CUDA] Initial work for boosting and evaluation with CUDA (#5279) · e0af160a

shiyu1994 authored Jul 29, 2022

* initial work for boosting and evaluation with CUDA

* fix compatibility with CPU code

* fix creating objective without USE_CUDA_EXP

* fix static analysis errors

* fix static analysis errors

e0af160a

10 May, 2022 1 commit

Fix potential overflow "Multiplication result converted to larger type" (#5189) · 6de9bafa

Nikita Titov authored May 10, 2022



* Update dataset_loader.cpp

* Update gbdt.h

* Update regression_objective.hpp

* Update linker_topo.cpp

* Update xentropy_objective.hpp

* Update regression_objective.hpp

* investigate inf test failure

* avoid overflow in regression objective

* remove `test_inf_handle` test
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>

6de9bafa

23 Mar, 2022 1 commit

[CUDA] New CUDA version Part 1 (#4630) · 6b56a90c

shiyu1994 authored Mar 23, 2022



* new cuda framework

* add histogram construction kernel

* before removing multi-gpu

* new cuda framework

* tree learner cuda kernels

* single tree framework ready

* single tree training framework

* remove comments

* boosting with cuda

* optimize for best split find

* data split

* move boosting into cuda

* parallel synchronize best split point

* merge split data kernels

* before code refactor

* use tasks instead of features as units for split finding

* refactor cuda best split finder

* fix configuration error with small leaves in data split

* skip histogram construction of too small leaf

* skip split finding of invalid leaves

stop when no leaf to split

* support row wise with CUDA

* copy data for split by column

* copy data from host to CPU by column for data partition

* add synchronize best splits for one leaf from multiple blocks

* partition dense row data

* fix sync best split from task blocks

* add support for sparse row wise for CUDA

* remove useless code

* add l2 regression objective

* sparse multi value bin enabled for CUDA

* fix cuda ranking objective

* support for number of items <= 2048 per query

* speedup histogram construction by interleaving global memory access

* split optimization

* add cuda tree predictor

* remove comma

* refactor objective and score updater

* before use struct

* use structure for split information

* use structure for leaf splits

* return CUDASplitInfo directly after finding best split

* split with CUDATree directly

* use cuda row data in cuda histogram constructor

* clean src/treelearner/cuda

* gather shared cuda device functions

* put shared CUDA functions into header file

* change smaller leaf from <= back to < for consistent result with CPU

* add tree predictor

* remove useless cuda_tree_predictor

* predict on CUDA with pipeline

* add global sort algorithms

* add global argsort for queries with many items in ranking tasks

* remove limitation of maximum number of items per query in ranking

* add cuda metrics

* fix CUDA AUC

* remove debug code

* add regression metrics

* remove useless file

* don't use mask in shuffle reduce

* add more regression objectives

* fix cuda mape loss

add cuda xentropy loss

* use template for different versions of BitonicArgSortDevice

* add multiclass metrics

* add ndcg metric

* fix cross entropy objectives and metrics

* fix cross entropy and ndcg metrics

* add support for customized objective in CUDA

* complete multiclass ova for CUDA

* separate cuda tree learner

* use shuffle based prefix sum

* clean up cuda_algorithms.hpp

* add copy subset on CUDA

* add bagging for CUDA

* clean up code

* copy gradients from host to device

* support bagging without using subset

* add support of bagging with subset for CUDAColumnData

* add support of bagging with subset for dense CUDARowData

* refactor copy sparse subrow

* use copy subset for column subset

* add reset train data and reset config for CUDA tree learner

add deconstructors for cuda tree learner

* add USE_CUDA ifdef to cuda tree learner files

* check that dataset doesn't contain CUDA tree learner

* remove printf debug information

* use full new cuda tree learner only when using single GPU

* disable all CUDA code when using CPU version

* recover main.cpp

* add cpp files for multi value bins

* update LightGBM.vcxproj

* update LightGBM.vcxproj

fix lint errors

* fix lint errors

* fix lint errors

* update Makevars

fix lint errors

* fix the case with 0 feature and 0 bin

fix split finding for invalid leaves

create cuda column data when loaded from bin file

* fix lint errors

hide GetRowWiseData when cuda is not used

* recover default device type to cpu

* fix na_as_missing case

fix cuda feature meta information

* fix UpdateDataIndexToLeafIndexKernel

* create CUDA trees when needed in CUDADataPartition::UpdateTrainScore

* add refit by tree for cuda tree learner

* fix test_refit in test_engine.py

* create set of large bin partitions in CUDARowData

* add histogram construction for columns with a large number of bins

* add find best split for categorical features on CUDA

* add bitvectors for categorical split

* cuda data partition split for categorical features

* fix split tree with categorical feature

* fix categorical feature splits

* refactor cuda_data_partition.cu with multi-level templates

* refactor CUDABestSplitFinder by grouping task information into struct

* pre-allocate space for vector split_find_tasks_ in CUDABestSplitFinder

* fix misuse of reference

* remove useless changes

* add support for path smoothing

* virtual destructor for LightGBM::Tree

* fix overlapped cat threshold in best split infos

* reset histogram pointers in data partition and spllit finder in ResetConfig

* comment useless parameter

* fix reverse case when na is missing and default bin is zero

* fix mfb_is_na and mfb_is_zero and is_single_feature_column

* remove debug log

* fix cat_l2 when one-hot

fix gradient copy when data subset is used

* switch shared histogram size according to CUDA version

* gpu_use_dp=true when cuda test

* revert modification in config.h

* fix setting of gpu_use_dp=true in .ci/test.sh

* fix linter errors

* fix linter error

remove useless change

* recover main.cpp

* separate cuda_exp and cuda

* fix ci bash scripts

add description for cuda_exp

* add USE_CUDA_EXP flag

* switch off USE_CUDA_EXP

* revert changes in python-packages

* more careful separation for USE_CUDA_EXP

* fix CUDARowData::DivideCUDAFeatureGroups

fix set fields for cuda metadata

* revert config.h

* fix test settings for cuda experimental version

* skip some tests due to unsupported features or differences in implementation details for CUDA Experimental version

* fix lint issue by adding a blank line

* fix lint errors by resorting imports

* fix lint errors by resorting imports

* fix lint errors by resorting imports

* merge cuda.yml and cuda_exp.yml

* update python version in cuda.yml

* remove cuda_exp.yml

* remove unrelated changes

* fix compilation warnings

fix cuda exp ci task name

* recover task

* use multi-level template in histogram construction

check split only in debug mode

* ignore NVCC related lines in parameter_generator.py

* update job name for CUDA tests

* apply review suggestions

* Update .github/workflows/cuda.yml
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update .github/workflows/cuda.yml
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* update header

* remove useless TODOs

* remove [TODO(shiyu1994): constrain the split with min_data_in_group] and record in #5062

* #include <LightGBM/utils/log.h> for USE_CUDA_EXP only

* fix include order

* fix include order

* remove extra space

* address review comments

* add warning when cuda_exp is used together with deterministic

* add comment about gpu_use_dp in .ci/test.sh

* revert changing order of included headers
Co-authored-by: Yu Shi <shiyu1994@qq.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

6b56a90c

09 Mar, 2022 1 commit

[fix] fix duplicate added initial scores for single-leaf trees (#fixes #4708) · f6d654b7

shiyu1994 authored Mar 09, 2022



* fix duplicate added initial scores for single-leaf trees

* add test case

* Fix import in Python test

* commit python suggestions
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

f6d654b7

16 Nov, 2021 1 commit

Add customized parser support (#4782) · b0137deb

chjinche authored Nov 16, 2021

* add customized parser support

* fix typo of parser_config_file description

* make delimiter as parameter of JoinedLines

b0137deb

10 May, 2021 1 commit
- [docs] remove extra spaces in comments and docs (#4269) · a8ee487a
  James Lamb authored May 10, 2021
  
  a8ee487a
04 May, 2021 2 commits

fix param name (#4253) · fcd24535
Nikita Titov authored May 05, 2021
```
* fix param name

* Update gpu_tree_learner.h

* Update gbdt.h
```
fcd24535

Correct spelling (#4250) · e79716e0

Andrew Ziem authored May 04, 2021



* Correct spelling

Most changes were in comments, and there were a few changes to literals for log output.

There were no changes to variable names, function names, IDs, or functionality.

* Clarify a phrase in a comment
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Clarify a phrase in a comment
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Clarify a phrase in a comment
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Correct spelling

Most are code comments, but one case is a literal in a logging message.

There are a few grammar fixes too.
Co-authored-by: James Lamb <jaylamb20@gmail.com>

e79716e0

07 Jan, 2021 1 commit
- support more filesystem as the storage for model file (#3730) · ed4d0ae3
  htgeis authored Jan 07, 2021
```
Co-authored-by: jingwei.su <jingwei.su@hulu.com>
```
  ed4d0ae3
24 Dec, 2020 1 commit

Trees with linear models at leaves (#3299) · fcfd4132

Belinda Trotta authored Dec 24, 2020

* Add Eigen library.

* Working for simple test.

* Apply changes to config params.

* Handle nan data.

* Update docs.

* Add test.

* Only load raw data if boosting=gbdt_linear

* Remove unneeded code.

* Minor updates.

* Update to work with sk-learn interface.

* Update to work with chunked datasets.

* Throw error if we try to create a Booster with an already-constructed dataset having incompatible parameters.

* Save raw data in binary dataset file.

* Update docs and fix parameter checking.

* Fix dataset loading.

* Add test for regularization.

* Fix bugs when saving and loading tree.

* Add test for load/save linear model.

* Remove unneeded code.

* Fix case where not enough leaf data for linear model.

* Simplify code.

* Speed up code.

* Speed up code.

* Simplify code.

* Speed up code.

* Fix bugs.

* Working version.

* Store feature data column-wise (not fully working yet).

* Fix bugs.

* Speed up.

* Speed up.

* Remove unneeded code.

* Small speedup.

* Speed up.

* Minor updates.

* Remove unneeded code.

* Fix bug.

* Fix bug.

* Speed up.

* Speed up.

* Simplify code.

* Remove unneeded code.

* Fix bug, add more tests.

* Fix bug and add test.

* Only store numerical features

* Fix bug and speed up using templates.

* Speed up prediction.

* Fix bug with regularisation

* Visual studio files.

* Working version

* Only check nans if necessary

* Store coeff matrix as an array.

* Align cache lines

* Align cache lines

* Preallocation coefficient calculation matrices

* Small speedups

* Small speedup

* Reverse cache alignment changes

* Change to dynamic schedule

* Update docs.

* Refactor so that linear tree learner is not a separate class.

* Add refit capability.

* Speed up

* Small speedups.

* Speed up add prediction to score.

* Fix bug

* Fix bug and speed up.

* Speed up dataload.

* Speed up dataload

* Use vectors instead of pointers

* Fix bug

* Add OMP exception handling.

* Change return type of LGBM_BoosterGetLinear to bool

* Change return type of LGBM_BoosterGetLinear back to int, only parameter type needed to change

* Remove unused internal_parent_ property of tree

* Remove unused parameter to CreateTreeLearner

* Remove reference to LinearTreeLearner

* Minor style issues

* Remove unneeded check

* Reverse temporary testing change

* Fix Visual Studio project files

* Restore LightGBM.vcxproj.filters

* Speed up

* Speed up

* Simplify code

* Update docs

* Simplify code

* Initialise storage space for max num threads

* Move Eigen to include directory and delete unused files

* Remove old files.

* Fix so it compiles with mingw

* Fix gpu tree learner

* Change AddPredictionToScore back to const

* Fix python lint error

* Fix C++ lint errors

* Change eigen to a submodule

* Update comment

* Add the eigen folder

* Try to fix build issues with eigen

* Remove eigen files

* Add eigen as submodule

* Fix include paths

* Exclude eigen files from Python linter

* Ignore eigen folders for pydocstyle

* Fix C++ linting errors

* Fix docs

* Fix docs

* Exclude eigen directories from doxygen

* Update manifest to include eigen

* Update build_r to include eigen files

* Fix compiler warnings

* Store raw feature data as float

* Use float for calculating linear coefficients

* Remove eigen directory from GLOB

* Don't compile linear model code when building R package

* Fix doxygen issue

* Fix lint issue

* Fix lint issue

* Remove uneeded code

* Restore delected lines

* Restore delected lines

* Change return type of has_raw to bool

* Update docs

* Rename some variables and functions for readability

* Make tree_learner parameter const in AddScore

* Fix style issues

* Pass vectors as const reference when setting tree properties

* Make temporary storage of serial_tree_learner mutable so we can make the object's methods const

* Remove get_raw_size, use num_numeric_features instead

* Fix typo

* Make contains_nan_ and any_nan_ properties immutable again

* Remove data_has_nan_ property of tree

* Remove temporary test code

* Make linear_tree a dataset param

* Fix lint error

* Make LinearTreeLearner a separate class

* Fix lint errors

* Fix lint error

* Add linear_tree_learner.o

* Simulate omp_get_max_threads if openmp is not available

* Update PushOneData to also store raw data.

* Cast size to int

* Fix bug in ReshapeRaw

* Speed up code with multithreading

* Use OMP_NUM_THREADS

* Speed up with multithreading

* Update to use ArrayToString

* Fix tests

* Fix test

* Fix bug introduced in merge

* Minor updates

* Update docs

fcfd4132

08 Dec, 2020 1 commit

Fix model locale issue and improve model R/W performance. (#3405) · 792c9303

Alberto Ferreira authored Dec 08, 2020

* Fix LightGBM models locale sensitivity and improve R/W performance.

When Java is used, the default C++ locale is broken. This is true for
Java providers that use the C API or even Python models that require JEP.

This patch solves that issue making the model reads/writes insensitive
to such settings.
To achieve it, within the model read/write codebase:
 - C++ streams are imbued with the classic locale
 - Calls to functions that are dependent on the locale are replaced
 - The default locale is not changed!

This approach means:
 - The user's locale is never tampered with, avoiding issues such as
    https://github.com/microsoft/LightGBM/issues/2979 with the previous
    approach https://github.com/microsoft/LightGBM/pull/2891
 - Datasets can still be read according the user's locale
 - The model file has a single format independent of locale

Changes:
 - Add CommonC namespace which provides faster locale-independent versions of Common's methods
 - Model code makes conversions through CommonC
 - Cleanup unused Common methods
 - Performance improvements. Use fast libraries for locale-agnostic conversion:
   - value->string: https://github.com/fmtlib/fmt
   - string->double: https://github.com/lemire/fast_double_parser (10x
      faster double parsing according to their benchmark)

Bugfixes:
 - https://github.com/microsoft/LightGBM/issues/2500
 - https://github.com/microsoft/LightGBM/issues/2890
 - https://github.com/ninia/jep/issues/205

 (as it is related to LGBM as well)

* Align CommonC namespace

* Add new external_libs/ to python setup

* Try fast_double_parser fix #1

Testing commit e09e5aad828bcb16bea7ed0ed8322e019112fdbe

If it works it should fix more LGBM builds

* CMake: Attempt to link fmt without explicit PUBLIC tag

* Exclude external_libs from linting

* Add exernal_libs to MANIFEST.in

* Set dynamic linking option for fmt.

* linting issues

* Try to fix lint includes

* Try to pass fPIC with static fmt lib

* Try CMake P_I_C option with fmt library

* [R-package] Add CMake support for R and CRAN

* Cleanup CMakeLists

* Try fmt hack to remove stdout

* Switch to header-only mode

* Add PRIVATE argument to target_link_libraries

* use fmt in header-only mode

* Remove CMakeLists comment

* Change OpenMP to PUBLIC linking in Mac

* Update fmt submodule to 7.1.2

* Use fmt in header-only-mode

* Remove fmt from CMakeLists.txt

* Upgrade fast_double_parser to v0.2.0

* Revert "Add PRIVATE argument to target_link_libraries"

This reverts commit 3dd45dde7b92531b2530ab54522bb843c56227a7.

* Address James Lamb's comments

* Update R-package/.Rbuildignore
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Upgrade to fast_double_parser v0.3.0 - Solaris support

* Use legacy code only in Solaris

* Fix lint issues

* Fix comment

* Address StrikerRUS's comments (solaris ifdef).

* Change header guards
Co-authored-by: James Lamb <jaylamb20@gmail.com>

792c9303

21 Nov, 2020 1 commit

[R-package] Remove CLI-only objects (#3566) · 1c5930bf

James Lamb authored Nov 21, 2020



* [R-package] Remove CLI-only objects

* more guards

* more guards

* variable not string

* simplify fix

* revert build_r.R changes

* move define of global_timer
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

1c5930bf

23 Oct, 2020 1 commit
- remove std::move (#3478) · c07644d1
  Nikita Titov authored Oct 23, 2020
  
  c07644d1